CKA Day 18 — Kubernetes Health Probes Explained

Day 18 of the 40-day CKA course by Tech Tutorials with Piyush.

Core Synthesis

This lesson introduces the three Kubernetes health probe mechanisms — Liveness, Readiness, and Startup — that keep production workloads healthy by enabling the kubelet to make intelligent decisions about container lifecycle and traffic routing.

Why Probes Matter

Without probes, Kubernetes treats a container as “healthy” the moment its main process starts. In reality, an application may need seconds or minutes to warm up caches, connect to databases, or load configuration. Worse, a running process can enter a degraded state (infinite loop, deadlock, memory leak) without crashing — leaving it alive but useless. Probes solve both problems.

The Three Probe Types

Probe | Purpose | Action on Failure
Liveness | Is the container alive? Should it be restarted? | kubelet kills the container and restarts it according to the Pod's restartPolicy
Readiness | Is the container ready to accept traffic? | Pod is marked not ready and its IP is removed from Service endpoints
Startup | Has a slow-starting container finished initializing? | Liveness and readiness checks are held off until it succeeds; if it exhausts failureThreshold, kubelet kills and restarts the container

Critical distinction: Liveness affects the container (restart it), Readiness affects the Service (stop sending traffic). Confusing the two is a common production incident pattern.

Probe Mechanisms (How to Check)

Kubernetes supports four ways to probe a container:

Mechanism | How It Works | Best For
HTTP GET | Sends an HTTP request to a specific path/port; a 2xx or 3xx status counts as success | Web apps, REST APIs, health-check endpoints (/healthz)
TCP Socket | Attempts to open a TCP connection to a port | Databases, caches, message queues
Exec | Runs a command inside the container; exit code 0 = success | Custom scripts, file-existence checks, complex validation
gRPC | Native gRPC health-checking protocol (stable since Kubernetes 1.27) | gRPC-first microservices
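
As a rough sketch of the three mechanisms other than HTTP GET, the Pod below gives each container a different probe type. The Pod name, container names, the gRPC image, and the port numbers are illustrative assumptions, not part of the lesson; the gRPC container is assumed to implement the standard gRPC health-checking service.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-mechanisms-demo        # hypothetical name
spec:
  containers:
    - name: cache
      image: redis:7
      ports:
        - containerPort: 6379
      livenessProbe:
        tcpSocket:                   # success = TCP connection can be opened
          port: 6379
        periodSeconds: 10
    - name: worker
      image: busybox:1.36
      # Creates the marker file, then idles; deleting the file would fail the probe
      command: ["sh", "-c", "touch /tmp/healthy; sleep 3600"]
      livenessProbe:
        exec:                        # success = command exits with code 0
          command: ["cat", "/tmp/healthy"]
        periodSeconds: 5
    - name: grpc-api
      image: example.com/grpc-api:1.0   # hypothetical image
      ports:
        - containerPort: 50051
      livenessProbe:
        grpc:                        # calls the gRPC health-checking service
          port: 50051
        periodSeconds: 10
```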

Key Parameters

Every probe shares these timing knobs:

Parameter | Default | Purpose
initialDelaySeconds | 0 | Wait this long after container start before first probe
periodSeconds | 10 | How often to run the probe
timeoutSeconds | 1 | How long to wait for a probe response before counting it as failed
successThreshold | 1 | Consecutive successes needed to mark healthy
failureThreshold | 3 | Consecutive failures needed to mark unhealthy and trigger action

Exam Trap: failureThreshold: 3 with periodSeconds: 10 means the kubelet acts only after three consecutive failed probes, which takes roughly 30 seconds (3 × 10s), not 10. Many candidates miss this on timing questions.
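
To make the timing arithmetic concrete, here is a probe fragment with all five knobs written out and the resulting timeline in comments. The values are illustrative, and the timings are approximate because the exact moment depends on where in the probe cycle the container starts failing.

```yaml
# Fragment of a container spec; path and port are example values
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5   # first probe runs about 5s after the container starts
  periodSeconds: 10        # then one probe every 10s
  timeoutSeconds: 1        # no response within 1s counts as a failure
  successThreshold: 1      # one success marks the container healthy again
  failureThreshold: 3      # three consecutive failures trigger a restart
# Time from "container stops responding" to restart:
#   roughly failureThreshold x periodSeconds = 3 x 10s = 30s
# A single successful probe in between resets the failure count to zero.
```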

Liveness Probe in Practice

A liveness probe catches deadlocks and infinite loops that do not crash the process. Example: an HTTP server whose process is still running but which has locked its main thread and no longer responds on /healthz.

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3

When this probe fails 3 times in a row, the kubelet restarts the container. The Pod stays on the same node; only the container is recreated. This is self-healing at the container level.

Readiness Probe in Practice

A readiness probe ensures traffic only hits Pods that are truly ready. Example: an app that must connect to a database before serving requests.

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

When readiness fails, the Pod’s IP is removed from the Service’s Endpoints object. Existing connections are not killed, but new requests stop routing to this Pod. Once readiness succeeds again, the IP is re-added automatically.

Startup Probe in Practice

The startup probe was added in Kubernetes 1.16 (as an alpha feature; it has been stable since 1.20) to protect slow-starting containers from aggressive liveness checks. Without it, a large Java or ML model container might get killed by liveness before it finishes initializing.

startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10

This gives the container 5 minutes (30 × 10s) to start. While the startup probe is running, liveness and readiness probes are disabled. After startup succeeds, the other probes begin.
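
Putting the three probes together, a minimal Pod manifest might look like the sketch below. The Pod name and image are hypothetical, and the app is assumed to serve /healthz and /ready on port 8080 as in the snippets above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-start-app               # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/slow-app:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      startupProbe:                  # runs first; up to 30 x 10s = 300s to start
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      livenessProbe:                 # begins only after the startup probe succeeds
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:                # gates Service traffic, never restarts
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
```

Because the startup probe already covers the slow boot, the liveness probe here needs no large initialDelaySeconds and can stay responsive once the app is up.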

Integration with Services and Deployments

Probes do not exist in isolation. They interact with the broader Kubernetes control loop:

  1. kubelet runs the probes on each node
  2. kubelet reports Pod status to the API Server
  3. EndpointSlice controller watches Pod readiness and updates EndpointSlices
  4. kube-proxy reads EndpointSlices and programs iptables/ipvs rules
  5. Deployment controller counts available replicas using readiness; rolling updates wait for new Pods to become ready before terminating old ones
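
The control loop above can be sketched with a Deployment and a Service. All names, the image, and the strategy values are assumptions chosen for illustration; the point is that a new Pod only counts as available, and only receives Service traffic, once its readiness probe passes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                          # hypothetical name
spec:
  replicas: 3
  minReadySeconds: 5                 # Pod must stay ready 5s to count as available
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0              # never drop below 3 available Pods
      maxSurge: 1                    # add one new Pod at a time
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example.com/web:2.0   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:            # drives both the rollout and the endpoints
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                         # only ready Pods with this label get traffic
  ports:
    - port: 80
      targetPort: 8080
```

With maxUnavailable: 0, an old Pod is terminated only after a replacement has become ready, so a broken /ready endpoint in the new version stalls the rollout instead of taking the Service down.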

Common Production Patterns

Pattern | Description
Separate endpoints | Use /healthz for liveness and /ready for readiness. They often check different things (liveness = “not deadlocked”, readiness = “DB connected”).
Startup + Liveness | Always pair a startup probe with a liveness probe for slow-starting containers.
Exec for legacy apps | When the app has no HTTP server, use an exec probe that checks a PID file or socket.
TCP for stateful services | Redis, PostgreSQL, and Kafka often use TCP socket probes on their native ports.

CKA Exam Patterns

  • kubectl run has no flags for probes, so generate a manifest with --dry-run=client -o yaml and add the probe to the YAML by hand
  • kubectl describe pod <name> shows probe failures under Events
  • If a Pod is Running but not receiving traffic, check readiness first
  • If a Pod restarts repeatedly, check liveness thresholds and initialDelaySeconds
  • kubectl get endpoints <svc> shows whether readiness is filtering Pods out of the Service

Tags: cka kubernetes health-probes liveness readiness startup devops production