Vertical Pod Autoscaler (VPA)
The Kubernetes tool that analyzes historical and current resource usage to recommend or automatically adjust container CPU and memory requests/limits. VPA solves the opposite problem from HPA: instead of adding more Pods, it makes each Pod the right size. Synthesized from CKA Day 17 — Kubernetes Autoscaling Explained.
What Is VPA?
VPA is not part of the core Kubernetes distribution; it is an official addon maintained by the Kubernetes Autoscaling Special Interest Group (SIG). It consists of three components:
| Component | Role |
|---|---|
| Recommender | Monitors metrics and computes recommended requests/limits |
| Updater | Evicts running Pods whose requests are far from the recommendation so they are recreated with new values (“Auto” mode only) |
| Admission Controller | Mutates new Pod specs to inject recommended resources at creation time (“Initial” and “Auto” modes) |
CKA Note: VPA is conceptual knowledge for the exam. You should know what it does, its three modes, and why it conflicts with HPA. Detailed VPA installation and configuration are not exam topics. Source: CKA Day 17
VPA Modes
VPA operates in three modes that trade off safety vs automation:
| Mode | What Happens | Use Case |
|---|---|---|
| Off | Generates recommendations only; does not modify workloads | Safe starting point; review recommendations before applying |
| Initial | Applies recommendations only to newly created Pods | Low risk; existing Pods keep running with old values |
| Auto | Evicts running Pods and recreates them with updated resources | Full automation; may cause brief downtime during eviction |
Critical Warning: Do not run VPA in “Auto” mode on the same workload as an HPA that scales on CPU or memory. HPA measures utilization as a percentage of each Pod's requests, so every time VPA changes the requests it also shifts the signal HPA reacts to. The result is thrashing: HPA adds replicas, VPA reduces per-Pod resources, HPA removes replicas, VPA increases resources. Choose one primary autoscaler per workload. Source: CKA Day 17
How VPA Works
┌─────────────────────────────────────────────────────────────┐
│                      VPA Architecture                       │
│                                                             │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐   │
│   │ Recommender │────▶│   Updater   │────▶│  Admission  │   │
│   │  (metrics)  │     │ (evictions) │     │ Controller  │   │
│   └─────────────┘     └─────────────┘     └─────────────┘   │
│          │                                       │          │
│          ▼                                       ▼          │
│ ┌─────────────────┐                     ┌─────────────────┐ │
│ │   VPA Object    │                     │  New Pod specs  │ │
│ │ (recommendation)│                     │    (mutated)    │ │
│ └─────────────────┘                     └─────────────────┘ │
└─────────────────────────────────────────────────────────────┘
- Recommender reads historical metrics from the Metrics Server and time-series stores
- It produces a recommendation: “this container should request 200m CPU and 256Mi memory”
- The recommendation is stored in the VPA object’s status
- Updater (in Auto mode) evicts Pods that are far from the recommendation
- Admission Controller intercepts new Pod creation and patches resources.requests/limits before the Pod is scheduled
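To make the "stored in the VPA object’s status" step concrete, here is a rough sketch of what that status section can look like once the Recommender has run. The field names follow the autoscaling.k8s.io/v1 API; every quantity is an illustrative value chosen to match the example above, not a measurement from a real cluster.

```yaml
# Illustrative excerpt of a VPA object (for example from `kubectl get vpa nginx-vpa -o yaml`).
# All quantities below are made-up example values.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: default
status:
  recommendation:
    containerRecommendations:
      - containerName: nginx
        lowerBound:          # requests below this are treated as too small
          cpu: 100m
          memory: 128Mi
        target:              # the requests VPA would apply
          cpu: 200m
          memory: 256Mi
        uncappedTarget:      # target before minAllowed/maxAllowed are enforced
          cpu: 200m
          memory: 256Mi
        upperBound:          # requests above this are treated as too large
          cpu: 500m
          memory: 512Mi
```

In “Off” mode this status is the only output VPA produces; the Updater and Admission Controller leave the Pods alone.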
VPA YAML Anatomy
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  updatePolicy:
    updateMode: "Off"   # Off | Initial | Auto
  resourcePolicy:
    containerPolicies:
      - containerName: nginx
        minAllowed:
          cpu: 50m
          memory: 100Mi
        maxAllowed:
          cpu: 1000m
          memory: 1Gi
        controlledResources: ["cpu", "memory"]

| Field | Description |
|---|---|
| targetRef | The workload to analyze (Deployment, ReplicaSet, StatefulSet, DaemonSet) |
| updatePolicy.updateMode | Off (recommend only), Initial (new Pods only), Auto (evict and recreate) |
| resourcePolicy.containerPolicies | Bounds and controlled resources per container |
| minAllowed / maxAllowed | Safety rails: VPA will not recommend outside this range |
| controlledResources | Which resources VPA manages (cpu, memory, or both) |
VPA vs HPA: When to Choose Which
| Scenario | Best Tool | Reason |
|---|---|---|
| Stateless web service with traffic spikes | HPA | Fast to replicate; load balancer distributes traffic |
| Database or cache that cannot be replicated | VPA | Single replica; right-size instead of replicate |
| Over-provisioned Pods wasting cluster capacity | VPA (Off) | Analyze first, then apply recommendations manually |
| Microservice with predictable daily patterns | HPA | Match replica count to demand curve |
| Pod constantly OOMKilled despite not being under load | VPA | Memory limit is too low; VPA raises it |
| Need both scale-out and right-sizing | HPA + VPA (Off/Initial) | HPA handles traffic; VPA informs baseline sizing |
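The last row of the table can be sketched as a pair of manifests: an HPA that owns the replica count and a VPA in “Off” mode that only publishes sizing recommendations. This is a minimal sketch, not a tuned configuration. The HPA name `nginx-hpa`, the replica bounds, and the 70% CPU target are assumptions made for illustration; `nginx-deploy` and `nginx-vpa` are the names used elsewhere on this page.

```yaml
# HPA: scales replicas on CPU utilization (name and numbers are hypothetical).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
# VPA: recommendation only, so it never changes the requests HPA depends on.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  updatePolicy:
    updateMode: "Off"
```

Because the VPA never touches running Pods in this setup, the utilization percentage that HPA computes stays stable between deployments.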
VPA and Resource Requests/Limits
VPA directly mutates the resources.requests and resources.limits fields that Day 16 teaches. This means:
- In “Off” mode: You manually apply VPA recommendations to your YAML manifests. This is the safest production workflow.
- In “Auto” mode: VPA patches Pods as they are admitted. Neither the Deployment spec nor your Git-tracked manifests change, so the running Pods drift from the declared configuration. GitOps teams usually prefer “Off” mode.
A typical VPA workflow:
- Deploy workload with conservative requests
- Run VPA in “Off” mode for 24–48 hours
- Read recommendations from kubectl describe vpa <name>
- Update your Git-tracked manifests with the recommended values
- Re-deploy and disable VPA, or keep it in “Off” mode for continuous monitoring
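As a sketch of the "update your Git-tracked manifests" step in the workflow above, the Deployment below carries a recommendation copied in by hand. It assumes VPA suggested 200m CPU and 256Mi memory for the nginx container. The image tag, labels, replica count, and limits are hypothetical choices for illustration and do not come from VPA.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
  namespace: default
spec:
  replicas: 2                  # hypothetical; VPA does not manage replicas
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.27    # assumed image tag
          resources:
            requests:          # copied from the VPA target recommendation
              cpu: 200m
              memory: 256Mi
            limits:            # chosen by the operator, not recommended by VPA
              cpu: 400m
              memory: 512Mi
```

Committing the values this way keeps Git as the source of truth, which is the main reason to prefer “Off” mode in a GitOps setup.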
Limitations and Gotchas
| Limitation | Explanation |
|---|---|
| Not core Kubernetes | Must be installed separately; not available on all managed clusters by default |
| Requires eviction | “Auto” mode restarts Pods to apply new resource values |
| Conflicts with HPA | Both autoscalers on the same workload cause oscillation |
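| Does not handle limits-only | VPA computes requests and, by default, scales limits to keep the original request-to-limit ratio. If a container sets limits without requests, Kubernetes defaults the requests to the limits, so review the resulting values before relying on them |
| No instant reaction | Recommendations are derived from accumulated usage history, so VPA lags behind sudden spikes and is not a real-time reaction mechanism |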
| Limited to Pod resources | VPA does not scale nodes; use Cluster Autoscaler for that |
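For the limits caveat in the table above, the VPA API also has a controlledValues setting on each container policy. It is not covered in the source material, so treat the following as a sketch under that assumption: with RequestsOnly, VPA computes requests and leaves any limits you set untouched, instead of scaling them along with the requests (the default, RequestsAndLimits).

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deploy
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: nginx
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsOnly   # default is RequestsAndLimits
```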
Related Pages
- Kubernetes Autoscaling — overview of all four mechanisms
- Horizontal Pod Autoscaler (HPA) — when to add replicas instead of resizing
- Kubernetes Resource Requests and Limits — the fields VPA adjusts
- Deployment, ReplicaSet & Replication Controller — workloads VPA targets
- Pod Fundamentals — the unit being resized
- Kubernetes Architecture — controller and admission controller context
- Kubernetes Namespaces — VPA objects are namespace-scoped
- Why Kubernetes? — autoscaling as a core orchestration benefit
- CKA Certification — exam domains
- CKA Study Roadmap — Day 17 in the 40-day plan
- Tech Tutorials with Piyush — course creator
Tags: kubernetes vpa vertical-pod-autoscaler autoscaling resource-optimization devops cka