CKA Day 16 - Kubernetes Requests and Limits

Day 16 of the 40-day CKA Certification course by Tech Tutorials with Piyush. This lesson explains how Kubernetes uses container resource requests and limits to make scheduling decisions, protect nodes from runaway workloads, and expose memory/CPU behaviour during troubleshooting.

Core Concepts

Resource requests and limits live under each container’s resources block. A request is the amount of CPU or memory the scheduler treats as reserved when deciding whether a Pod can fit on a node. A limit is the maximum amount a container is allowed to consume at runtime. Requests answer the placement question: “Can this Pod fit here?” Limits answer the safety question: “How much can this container consume before it is stopped or slowed down?” Enforcement differs by resource: a container that exceeds its memory limit is terminated (OOMKilled), while a container that reaches its CPU limit is throttled rather than killed.
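As a minimal sketch of the paragraph above, the following Pod sets both CPU and memory requests and limits on one container. The Pod name, container name, image, and all values are illustrative assumptions, not taken from the lesson.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resources-sketch      # hypothetical name, not from the lesson
spec:
  containers:
  - name: app                 # hypothetical container name
    image: nginx              # placeholder image; any image would do
    resources:
      requests:               # used by the scheduler for placement
        cpu: "250m"           # 250 millicores = 0.25 of one CPU
        memory: "64Mi"
      limits:                 # runtime ceiling enforced on the node
        cpu: "500m"
        memory: "128Mi"
```

CPU quantities are usually written in millicores (m) and memory in binary units such as Mi or Gi.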

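The lesson uses a two-node scheduling diagram to show why requests matter. Each node has finite CPU and memory, and the scheduler places a Pod only if its requests fit into the node's remaining allocatable capacity and the Pod also passes the other scheduling filters such as tolerations, node affinity, and node selectors. The comparison is made against the requests already placed on the node, not against live usage. For example, a node with 4Gi allocatable memory that already hosts Pods requesting 3.5Gi in total cannot accept a Pod requesting 1Gi, even if those Pods are mostly idle. Once no node has enough remaining allocatable capacity, a new Pod stays Pending, and its FailedScheduling event reports a reason such as Insufficient memory or Insufficient cpu.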

The practical demo installs Metrics Server, which kubectl top depends on, then uses kubectl top node and kubectl top pod to observe CPU and memory usage. It creates a mem-example Namespace and runs polinux/stress Pods with different memory requests and limits. A Pod whose memory use stays within its limit runs normally. A container that exceeds its memory limit is terminated with reason OOMKilled. A Pod that requests more memory than any node can provide is never scheduled and remains Pending.

YAML Pattern

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]

The important detail is the nesting: resources is a child of the container, not the Pod root. CPU and memory are container resources, so each container in a multi-container Pod can have separate requests and limits.
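To illustrate the per-container nesting, here is a sketch of a two-container Pod in which each container carries its own resources block. The Pod name, container names, images, and values are hypothetical and chosen only for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-container-sketch  # hypothetical name, not from the lesson
  namespace: mem-example
spec:
  containers:
  - name: web                 # hypothetical main container
    image: nginx              # placeholder image
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
  - name: sidecar             # hypothetical helper container
    image: busybox            # placeholder image
    command: ["sleep", "3600"]
    resources:                # separate block, independent of the web container
      requests:
        memory: "20Mi"
      limits:
        memory: "40Mi"
```

For scheduling, the requests of the regular containers are added together, so this Pod needs a node with at least 120Mi of unreserved allocatable memory. Each limit is still enforced on its own container.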

Demo Outcomes

Scenario            Request  Limit   Runtime Demand  Result
Healthy Pod         100Mi    200Mi   150M            Pod runs; kubectl top pod reports usage
Exceeds limit       50Mi     100Mi   250M            Container is killed with OOMKilled
Impossible request  1000Gi   1000Gi  150M            Pod remains Pending with Insufficient memory
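The second and third scenarios can be sketched by changing only the memory values and the stress argument in the YAML pattern shown earlier. The Pod names memory-demo-2 and memory-demo-3 match the describe commands used in the lesson; the container names and the exact contents of mem2.yaml and mem3.yaml are assumptions reconstructed from the table.

```yaml
# Sketch of the "Exceeds limit" scenario (assumed to resemble mem2.yaml)
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-2
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-2-ctr   # assumed container name
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"       # stress asks for 250M, well above this limit
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
---
# Sketch of the "Impossible request" scenario (assumed to resemble mem3.yaml)
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-3
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-3-ctr   # assumed container name
    image: polinux/stress
    resources:
      requests:
        memory: "1000Gi"      # larger than any node in the demo cluster offers
      limits:
        memory: "1000Gi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
```

The first Pod is scheduled and then killed at runtime; the second never leaves Pending because the scheduler rejects it before any container starts.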

Commands Practised

# Install metrics server from course manifest
kubectl apply -f metricserver.yaml

# Inspect system add-on Pod
kubectl get pods -n kube-system

# Create isolated demo namespace
kubectl create namespace mem-example

# Apply stress-test Pods
kubectl apply -f mem-request.yaml
kubectl apply -f mem2.yaml
kubectl apply -f mem3.yaml

# Observe resource usage (needs Metrics Server running and the Pod created)
kubectl top node
kubectl top pod memory-demo -n mem-example

# Debug failed or pending Pods
kubectl get pods -n mem-example
kubectl describe pod memory-demo-2 -n mem-example
kubectl describe pod memory-demo-3 -n mem-example

Key Insight

Requests and limits are both reliability controls, but they operate at different moments. Requests influence the scheduler before the Pod starts. Limits are enforced on the node while the Pod runs. Without limits, a memory-leaking container can consume node memory and harm unrelated workloads. With a memory limit, only the offending container is OOM-killed, which keeps the blast radius smaller and easier to debug.

Ingested on 2026-07-02. Part of the Consumed Videos library.