Kubernetes Resource Requests and Limits

Container-level CPU and memory controls that let the scheduler fit Pods onto nodes and protect nodes from runaway workloads. Synthesized from CKA Day 16 - Kubernetes Requests and Limits.

What Are Requests and Limits?

Kubernetes schedules Pods by evaluating many filters: node health, taints, tolerations, node affinity, selectors, and available resources. Resource requests and limits are the CPU/memory side of that decision.

| Field | Meaning | When It Matters |
|---|---|---|
| resources.requests.cpu | CPU capacity reserved for scheduling | Before the Pod is placed |
| resources.requests.memory | Memory capacity reserved for scheduling | Before the Pod is placed |
| resources.limits.cpu | Maximum CPU the container may consume | While the container runs |
| resources.limits.memory | Maximum memory the container may consume | While the container runs |

Request = scheduler promise. The kube-scheduler places a Pod on a node only if the node has enough remaining allocatable capacity for the Pod’s requests.

Limit = runtime guardrail. If a container exceeds its memory limit, it is terminated with reason OOMKilled rather than being allowed to exhaust the node. A container that hits its CPU limit is throttled instead of killed. Source: CKA Day 16

YAML Anatomy

Resource settings are container fields because CPU and memory are consumed by containers, not by the Pod object itself:

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "100Mi"
        cpu: "250m"
      limits:
        memory: "200Mi"
        cpu: "500m"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]

CKA syntax mnemonic: spec -> containers[] -> resources -> requests/limits -> cpu/memory.
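Because resources are set per container, a Pod with several containers carries one resources block for each. The sketch below is illustrative only: the Pod name, container names, images, and values are hypothetical and not from the lesson. For regular containers like these, the scheduler works from the sum of the container requests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-container-demo      # hypothetical name
  namespace: mem-example
spec:
  containers:
  - name: app                   # hypothetical main container
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: sidecar               # hypothetical helper container
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    resources:
      requests:
        memory: "32Mi"
        cpu: "100m"
      limits:
        memory: "64Mi"
        cpu: "200m"
# Requests the scheduler must fit on one node: 96Mi memory, 350m CPU.
```

Each container is limited independently, so the sidecar can be OOMKilled at 64Mi while the app container keeps running.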

Scheduler Behaviour

Requests participate in scheduling alongside the placement primitives from Manual Scheduling:

  1. Scheduler sees an unscheduled Pod.
  2. It filters out nodes that cannot fit the Pod’s requested CPU/memory.
  3. It also filters by taints/tolerations, node selectors, node affinity, and node conditions.
  4. If at least one node fits, the scheduler binds the Pod.
  5. If no node fits, the Pod remains Pending and kubectl describe pod shows events such as Insufficient memory or Insufficient cpu.

This is why a Pod requesting 1000Gi memory stays Pending even if its command would only try to use 150M: scheduling uses declared requests, not future actual usage. Source: CKA Day 16
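A minimal sketch of such a Pod, assuming the same polinux/stress image and mem-example Namespace used elsewhere in these notes. The Pod and container names are assumptions. The request is far larger than any typical node's allocatable memory, so the Pod should stay Pending even though the stress command asks for only 150M.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-3           # assumed name for this variant
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-3-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "1000Gi"        # declared request: no node is expected to fit this
      limits:
        memory: "1000Gi"
    command: ["stress"]
    # Actual usage would be about 150M, but the container never starts.
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
```

After applying it, kubectl describe pod memory-demo-3 -n mem-example should report a scheduling failure citing Insufficient memory.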

Runtime Behaviour

Limits govern what happens after the Pod starts:

| Runtime Condition | Result |
|---|---|
| Usage stays between request and limit | Pod continues running |
| Memory usage exceeds limit | Container is killed with OOMKilled |
| Request exceeds node allocatable capacity | Pod does not schedule; remains Pending |
| CPU demand exceeds CPU limit | CPU is throttled rather than immediately killed |

The lesson’s memory stress demo uses polinux/stress to show the difference between running within the limit, exceeding the limit, and requesting impossible capacity. The key operational idea is blast-radius control: prefer killing one over-consuming Pod to letting it exhaust the whole node. Source: CKA Day 16
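The over-limit case can be sketched by making the stress command allocate more than limits.memory. The manifest below is an assumption about how such a demo might look, not the lesson's exact file. It uses the same image and Namespace, with a hypothetical Pod name and values chosen so that 250M of allocation exceeds a 100Mi limit.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-2           # assumed name for this variant
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-2-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"          # small enough to schedule on most nodes
      limits:
        memory: "100Mi"         # runtime ceiling for this container
    command: ["stress"]
    # Tries to allocate 250M, well above the 100Mi limit.
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
```

The Pod schedules normally because the request fits, then the container is terminated with reason OOMKilled and restarted according to the Pod's restart policy.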

Metrics Server and kubectl top

Metrics Server exposes CPU and memory usage for nodes and Pods. The lesson installs a Metrics Server manifest, verifies the Pod in the kube-system Namespace, and then uses:

kubectl top node
kubectl top pod memory-demo -n mem-example

Metrics Server is also the data source for autoscaling flows such as HPA and VPA, which the course treats as later topics. For Day 16, the immediate value is visibility: you can verify whether a stress-test Pod is consuming the memory you expected. Source: CKA Day 16

Namespace Governance Connection

Requests and limits become more powerful when combined with Namespace-level policy:

  • ResourceQuota caps the aggregate CPU/memory requests and limits across a Namespace.
  • LimitRange can define default requests/limits so users cannot create unconstrained Pods by accident.
  • A demo Namespace like mem-example isolates stress tests from other workloads.

In production, this is how platform teams prevent one team, app, or environment from consuming the whole shared cluster.
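As an illustration of the two policy objects described above, the following sketch pairs a ResourceQuota with a LimitRange for the mem-example Namespace. The object names and all numeric values are hypothetical and should be sized to the actual cluster.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-quota           # hypothetical name
  namespace: mem-example
spec:
  hard:
    requests.cpu: "1"           # sum of CPU requests across the Namespace
    requests.memory: 1Gi        # sum of memory requests
    limits.cpu: "2"             # sum of CPU limits
    limits.memory: 2Gi          # sum of memory limits
---
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-cpu-defaults        # hypothetical name
  namespace: mem-example
spec:
  limits:
  - type: Container
    defaultRequest:             # applied when a container sets no requests
      cpu: 250m
      memory: 100Mi
    default:                    # applied when a container sets no limits
      cpu: 500m
      memory: 200Mi
```

The two objects complement each other: once a quota covers requests and limits, Pods that omit those fields are rejected unless a LimitRange supplies defaults.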

Troubleshooting Matrix

| Symptom | Likely Cause | Command | Fix |
|---|---|---|---|
| Pod stuck Pending | Request cannot fit on any node | kubectl describe pod <pod> | Lower requests or add capacity |
| Event says Insufficient memory | requests.memory exceeds available allocatable memory | kubectl describe pod <pod> | Reduce request or schedule to larger node |
| Container repeatedly restarts with OOMKilled | Actual memory usage exceeds limits.memory | kubectl describe pod <pod> | Fix leak, reduce load, or raise memory limit |
| kubectl top has no data | Metrics Server not installed or not ready | kubectl get pods -n kube-system | Install/fix Metrics Server |
| Node pressure after workload deploy | Missing/too-high limits allow runaway consumption | kubectl top node | Add limits and validate workload profile |

CKA Exam Speed Patterns

# Create namespace for resource demos
kubectl create ns mem-example
 
# Apply a Pod with resource settings
kubectl apply -f mem-request.yaml
 
# Inspect scheduling failures and OOMKilled states
kubectl describe pod <pod> -n mem-example
 
# Observe live resource usage
kubectl top node
kubectl top pod <pod> -n mem-example
 
# Generate a Pod manifest quickly, then add resources manually
kubectl run stress --image=polinux/stress --restart=Never \
  --dry-run=client -o yaml > pod.yaml

Tags: kubernetes resource-requests resource-limits metrics-server scheduling cka devops troubleshooting