Kubernetes Taints and Tolerations

The negative scheduling primitive that repels Pods from unsuitable nodes. Taints are the “Do Not Disturb” signs on nodes; tolerations are the exemptions granted to Pods. Critical for node isolation, maintenance windows, control plane protection, and the CKA exam. Synthesized from CKA Day 14 — Taints and Tolerations in Kubernetes.

What Are Taints and Tolerations?

By default, the Kubernetes scheduler tries to spread Pods evenly across healthy nodes. But not every node is appropriate for every workload. You may want to:

  • Keep user workloads off control plane nodes
  • Reserve GPU nodes for ML training jobs
  • Drain a node for maintenance without manual Pod deletion
  • Isolate dedicated hardware (SSD, high-memory, ARM) for specific applications

Taints solve this by marking a node as undesirable. Tolerations solve the inverse by granting specific Pods permission to ignore the mark.

Key Insight: Taints are a node-level property. Tolerations are a Pod-level property. The scheduler evaluates the pair during its filtering phase. If a node is tainted and the Pod does not tolerate it, the node is discarded from the candidate list. Source: CKA Day 14

Taint Syntax and Effects

A taint is a key-value-effect triple applied to a node:

kubectl taint node <node-name> <key>=<value>:<effect>

| Effect | New Scheduling | Existing Pods | Typical Use Case |
| --- | --- | --- | --- |
| NoSchedule | Blocked | Unaffected | Dedicated hardware, control plane isolation |
| PreferNoSchedule | Avoided (soft) | Unaffected | Best-effort separation, hints to scheduler |
| NoExecute | Blocked | Evicted unless they tolerate the taint | Maintenance evictions, automatic eviction on node failure |

Imperative Examples

# Reserve a node for GPU workloads
kubectl taint node worker-gpu-1 gpu=true:NoSchedule
 
# Mark a node for maintenance — evict everything that doesn't tolerate it
kubectl taint node worker-2 maintenance=true:NoExecute
 
# Remove a taint (append minus sign to the full taint string)
kubectl taint node worker-gpu-1 gpu=true:NoSchedule-
 
# Taint multiple nodes by label
kubectl taint nodes -l tier=frontend critical-only=true:NoSchedule

Viewing Taints

kubectl describe node <node-name> | grep -A 5 Taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
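
Both commands read the same field: taints are stored on the Node object under spec.taints. As a rough illustration, the fragment below shows how the gpu taint from the imperative examples would appear in the output of kubectl get node worker-gpu-1 -o yaml. All other Node fields are omitted, and the exact field order in real output may differ.

```yaml
# Illustrative fragment only; a real Node object has many more fields.
apiVersion: v1
kind: Node
metadata:
  name: worker-gpu-1
spec:
  taints:
  - key: gpu            # left-hand side of gpu=true:NoSchedule
    value: "true"       # taint values are strings, hence the quotes
    effect: NoSchedule
```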

Toleration Syntax

A toleration is declared inside the Pod spec. It must match the taint’s key, value (or use Exists), and effect.

Equal Operator (Exact Match)

apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: trainer
    image: tensorflow/tensorflow:latest-gpu

Exists Operator (Any Value)

spec:
  tolerations:
  - key: "special"
    operator: "Exists"
    effect: "NoSchedule"

With Exists, only the key and effect matter; the value is ignored. This is useful when you want to tolerate any value of a given taint key.
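
Two further variants follow from the same matching rules. This fragment is a sketch that relies on the documented Kubernetes behaviour that an omitted effect matches every effect, and that an omitted key combined with Exists matches every taint. The key name special is reused from the example above. The two entries are alternatives shown side by side, not a recommended combination.

```yaml
spec:
  tolerations:
  # Variant 1: tolerate the "special" taint whatever its value or effect.
  - key: "special"
    operator: "Exists"
  # Variant 2: no key and no effect, so this tolerates every taint on every node.
  # Use sparingly: it also bypasses control plane and maintenance taints.
  - operator: "Exists"
```

Variant 2 is usually reserved for node-level agents that genuinely have to run everywhere.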

Toleration Fields Reference

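| Field | Required? | Purpose |
| --- | --- | --- |
| key | No | The taint key to match. If omitted, operator must be Exists and the toleration matches every taint key |
| operator | No (defaults to Equal) | Equal or Exists |
| value | Only for Equal | The taint value to match; leave unset with Exists |
| effect | No | NoSchedule, PreferNoSchedule, or NoExecute. If omitted, matches any effect |
| tolerationSeconds | No (NoExecute only) | How long the Pod stays bound to the node after a matching NoExecute taint is added. If unset, the Pod tolerates the taint indefinitely |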

Built-In Taints You Must Know

Kubernetes automatically applies certain taints based on node conditions. These are critical for troubleshooting.

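| Taint Key | Effect | Applied By | Meaning |
| --- | --- | --- | --- |
| node-role.kubernetes.io/control-plane | NoSchedule | kubeadm | Control plane node: keep user workloads away |
| node.kubernetes.io/not-ready | NoSchedule and NoExecute | Node Controller | Node's Ready condition is False: don't schedule new Pods, evict existing Pods once their toleration expires |
| node.kubernetes.io/unreachable | NoSchedule and NoExecute | Node Controller | Node's Ready condition is Unknown (the node controller cannot reach it): evict existing Pods once their toleration expires |
| node.kubernetes.io/out-of-disk | NoSchedule | Node Controller | Legacy taint for a node that is out of disk space; current releases no longer apply it and report disk exhaustion as disk-pressure |
| node.kubernetes.io/memory-pressure | NoSchedule | Node Controller | Node is under memory pressure |
| node.kubernetes.io/disk-pressure | NoSchedule | Node Controller | Node is under disk pressure |
| node.kubernetes.io/network-unavailable | NoSchedule | Node Controller (cloud) | Node has no network configured |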

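CKA Exam Trap: A NotReady node carries the node.kubernetes.io/not-ready taint with both the NoSchedule and NoExecute effects. The NoSchedule taint keeps new Pods off the node immediately. Existing Pods are not evicted right away: the DefaultTolerationSeconds admission controller gives every Pod that does not declare its own a 300-second toleration for the not-ready and unreachable NoExecute taints, so they stay bound for about five minutes before eviction.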

The tolerationSeconds Escape Hatch

When a Pod tolerates a NoExecute taint, you can specify how long it is allowed to stay before being evicted:

tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300   # 5 minutes grace period

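This is the mechanism behind taint-based eviction: once a matching NoExecute taint is added, the Pod stays bound for tolerationSeconds and is then evicted. A NoExecute toleration without tolerationSeconds keeps the Pod on the node indefinitely. PodDisruptionBudgets are a separate mechanism that governs voluntary evictions through the Eviction API (for example kubectl drain). For the built-in not-ready and unreachable taints, the DefaultTolerationSeconds admission controller adds a 300-second (5-minute) toleration to any Pod that does not set its own.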
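
To shorten the 300-second window for a workload that should fail over quickly, a Pod can declare its own tolerations for both built-in NoExecute taints. A minimal sketch, assuming a 30-second window suits the workload. The Pod name, container name, and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fast-failover          # hypothetical name
spec:
  tolerations:
  # Explicit entries replace the 300-second defaults for these two keys.
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 30
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 30
  containers:
  - name: app                  # placeholder container
    image: nginx
```

Eviction only removes the Pod. Rescheduling onto a healthy node happens only if a controller such as a Deployment owns it.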

DaemonSets and Taints

DaemonSets are designed to run one Pod per node. But control plane nodes carry the node-role.kubernetes.io/control-plane:NoSchedule taint. If your DaemonSet does not include a toleration for this taint, it will silently skip control plane nodes.

# Example tolerations for a DaemonSet that must also run on control plane nodes
tolerations:
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"

When creating a custom DaemonSet (e.g., for log collection or monitoring), always evaluate whether it needs to run on control plane nodes and add the appropriate tolerations. Source: CKA Day 12
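
For context, here is where those tolerations sit in a complete manifest. This is a sketch of a log-collection DaemonSet, not a production configuration. The name, labels, namespace choice, and image reference are all hypothetical, and a real collector would also need volume mounts and RBAC.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector                  # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector             # must match spec.selector
    spec:
      tolerations:
      # Without this entry the DaemonSet skips control plane nodes.
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: collector
        image: registry.example.com/log-collector:1.0   # placeholder image
```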

Taints + Node Affinity: The Production Pattern

In production, taints/tolerations are rarely used alone. They are combined with nodeSelector or nodeAffinity to create dedicated node pools. See the dedicated Kubernetes Node Affinity page for the full deep-dive with YAML anatomy, operator reference, and troubleshooting.

  1. Label the node pool: kubectl label node -l gpu=true tier=ml
  2. Taint the node pool: kubectl taint node -l gpu=true gpu=true:NoSchedule
  3. Target workloads with affinity: the Pod spec uses nodeAffinity to require GPU nodes (requiredDuringSchedulingIgnoredDuringExecution)
  4. Grant exemption with toleration: the Pod spec tolerates the gpu taint

Together, the two mechanisms ensure that:

  • ML workloads land only on GPU nodes (affinity attracts)
  • Only ML workloads can land on GPU nodes (taints repel everything else)

Why both are needed: Taints alone cannot guarantee that a tolerated Pod lands on a specific node type — they merely grant permission to land on tainted nodes. Without affinity, the scheduler could place the tolerated Pod on any untainted node instead. Affinity actively pulls the Pod toward the desired pool. Source: CKA Day 15
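
The Pod side of the pattern (steps 3 and 4) could look like the following sketch. It assumes the tier=ml label and the gpu=true:NoSchedule taint from steps 1 and 2. The Pod name is hypothetical, and the image is the one used in the Equal operator example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training                    # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Attract: only nodes labelled tier=ml are candidates.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: tier
            operator: In
            values:
            - ml
  tolerations:
  # Permit: allow scheduling onto nodes tainted gpu=true:NoSchedule.
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: trainer
    image: tensorflow/tensorflow:latest-gpu
```

Removing the affinity block leaves the Pod free to land on any untainted node. Removing the toleration leaves it Pending, because every node that satisfies the affinity rule is tainted.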

Troubleshooting Taint Mismatches

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Pod stuck Pending with "0/X nodes are available: X node(s) had taint ..." | Pod lacks toleration for a tainted node | Add matching tolerations to Pod spec |
| DaemonSet Pod missing on control plane nodes | Missing control-plane taint toleration | Add toleration for node-role.kubernetes.io/control-plane |
| Pod evicted unexpectedly | Node acquired NoExecute taint (manual taint, node failure) | Add toleration or use tolerationSeconds to extend grace period |
| Workload scheduled on GPU node unexpectedly | Node is not tainted | Apply NoSchedule taint to reserve the node |

CKA Exam Speed Patterns

# Taint a node imperatively
kubectl taint node worker-1 gpu=true:NoSchedule
 
# Remove a taint
kubectl taint node worker-1 gpu=true:NoSchedule-
 
# Run a Pod with toleration (imperative override)
kubectl run debug --image=busybox --restart=Never \
  --overrides='{"spec":{"tolerations":[{"key":"gpu","operator":"Equal","value":"true","effect":"NoSchedule"}]}}'
 
# Check why a Pod is Pending
kubectl describe pod <name> | grep -A 10 Events

YAML Memory Trick: The toleration struct mirrors the taint exactly: key, value, effect. Only the operator (Equal vs Exists) and the optional tolerationSeconds are extra. On the exam, write the taint string first, then copy the key-value-effect into the toleration block.
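
As a worked instance of that trick, the fragment below starts from a hypothetical taint string, dedicated=batch:NoExecute (not a taint used elsewhere on this page), and copies each part into the toleration. The tolerationSeconds line is optional and only valid here because the effect is NoExecute. The 60-second value is an arbitrary illustration.

```yaml
# Taint string written first:  dedicated=batch:NoExecute
#                              key=value:effect
tolerations:
- key: "dedicated"         # copied from the part before "="
  operator: "Equal"        # extra field: Equal compares the value
  value: "batch"           # copied from the part between "=" and ":"
  effect: "NoExecute"      # copied from the part after ":"
  tolerationSeconds: 60    # extra field, optional, NoExecute only
```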


Tags: kubernetes taints tolerations scheduling node-management cka devops kube-scheduler