Kubernetes Taints and Tolerations
The negative scheduling primitive that repels Pods from unsuitable nodes. Taints are the “Do Not Disturb” signs on nodes; tolerations are the exemptions granted to Pods. Critical for node isolation, maintenance windows, control plane protection, and the CKA exam. Synthesized from CKA Day 14 — Taints and Tolerations in Kubernetes.
What Are Taints and Tolerations?
By default, the Kubernetes scheduler tries to spread Pods evenly across healthy nodes. But not every node is appropriate for every workload. You may want to:
- Keep user workloads off control plane nodes
- Reserve GPU nodes for ML training jobs
- Drain a node for maintenance without manual Pod deletion
- Isolate dedicated hardware (SSD, high-memory, ARM) for specific applications
Taints solve this by marking a node as undesirable. Tolerations solve the inverse by granting specific Pods permission to ignore the mark.
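Key Insight: Taints are a node-level property. Tolerations are a Pod-level property. The scheduler evaluates the pair during its filtering phase. If a node carries a NoSchedule or NoExecute taint that the Pod does not tolerate, the node is discarded from the candidate list. An untolerated PreferNoSchedule taint does not filter the node out; it only makes the node less preferred during scoring. Source: CKA Day 14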
Taint Syntax and Effects
A taint is a key-value-effect triple applied to a node:
kubectl taint node <node-name> <key>=<value>:<effect>
| Effect | New Scheduling | Existing Pods | Typical Use Case |
|---|---|---|---|
| NoSchedule | Blocked | Unaffected | Dedicated hardware, control plane isolation |
| PreferNoSchedule | Avoided (soft) | Unaffected | Best-effort separation, hints to scheduler |
| NoExecute | Blocked | Evicted | Node drain, maintenance, automatic eviction on failure |
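For reference, a taint applied with kubectl taint is stored in the Node object's spec.taints list. The excerpt below is a minimal sketch of what that field might look like for a gpu=true:NoSchedule taint; the node name worker-gpu-1 is only illustrative, and all other Node fields are omitted.

```yaml
# Excerpt of a Node object; only the taint-related fields are shown.
apiVersion: v1
kind: Node
metadata:
  name: worker-gpu-1      # illustrative node name
spec:
  taints:
    - key: gpu            # <key>
      value: "true"       # <value>, quoted so YAML keeps it a string
      effect: NoSchedule  # <effect>
```

Reading the taint in this form makes it easier to see which three fields a toleration has to line up with.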
Imperative Examples
# Reserve a node for GPU workloads
kubectl taint node worker-gpu-1 gpu=true:NoSchedule
# Mark a node for maintenance — evict everything that doesn't tolerate it
kubectl taint node worker-2 maintenance=true:NoExecute
# Remove a taint (append minus sign to the full taint string)
kubectl taint node worker-gpu-1 gpu=true:NoSchedule-
# Taint multiple nodes by label
kubectl taint nodes -l tier=frontend critical-only=true:NoSchedule
Viewing Taints
kubectl describe node <node-name> | grep -A 5 Taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
Toleration Syntax
A toleration is declared inside the Pod spec. It must match the taint’s key, value (or use Exists), and effect.
Equal Operator (Exact Match)
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: trainer
      image: tensorflow/tensorflow:latest-gpu
Exists Operator (Any Value)
spec:
  tolerations:
    - key: "special"
      operator: "Exists"
      effect: "NoSchedule"
With Exists, only the key and effect matter; the value is ignored. This is useful when you want to tolerate any value of a given taint key.
Toleration Fields Reference
| Field | Required? | Purpose |
|---|---|---|
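| key | No | The taint key to match; an empty key with operator Exists matches every taint |
| operator | No | Equal (the default when omitted) or Exists |
| value | Only for Equal | The taint value to match; leave it empty with Exists |
| effect | No | If omitted, matches any effect |
| tolerationSeconds | No (NoExecute only) | How long the Pod stays bound after a matching NoExecute taint appears; if unset, the Pod tolerates the taint indefinitely |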
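The defaults in the table are easier to remember next to concrete entries. The fragment below is a sketch of three toleration variants; the gpu and special keys are reused from the examples on this page, and the fragment belongs under a Pod's spec.

```yaml
# Three toleration variants; field defaults are noted in the comments.
tolerations:
  # operator omitted: defaults to Equal, so key, value, and effect must all match
  - key: "gpu"
    value: "true"
    effect: "NoSchedule"
  # effect omitted: tolerates the "special" key with any value and any effect
  - key: "special"
    operator: "Exists"
  # key omitted with Exists: tolerates every taint on every node (use sparingly)
  - operator: "Exists"
```

The last form is very broad: a Pod carrying it can be scheduled onto control plane nodes and will not be evicted by NoExecute taints, so it is usually reserved for system-level agents.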
Built-In Taints You Must Know
Kubernetes automatically applies certain taints based on node conditions. These are critical for troubleshooting.
| Taint Key | Effect | Applied By | Meaning |
|---|---|---|---|
| node-role.kubernetes.io/control-plane | NoSchedule | kubeadm | Control plane node: keep user workloads away |
| node.kubernetes.io/not-ready | NoSchedule and NoExecute | Node Controller | Node's Ready condition is False: new Pods are not scheduled, and existing Pods are evicted once their toleration expires |
| node.kubernetes.io/unreachable | NoSchedule and NoExecute | Node Controller | Node is unreachable (Ready condition is Unknown): new Pods are not scheduled, and existing Pods are evicted once their toleration expires |
| node.kubernetes.io/out-of-disk | NoSchedule | Node Controller | Node is out of disk space (legacy: current Kubernetes releases no longer report the OutOfDisk condition and use disk-pressure instead) |
| node.kubernetes.io/memory-pressure | NoSchedule | Node Controller | Node is under memory pressure |
| node.kubernetes.io/disk-pressure | NoSchedule | Node Controller | Node is under disk pressure |
| node.kubernetes.io/network-unavailable | NoSchedule | Node Controller (cloud) | Node has no network configured |
CKA Exam Trap: If a node is NotReady, the scheduler will not place new Pods there because of the node.kubernetes.io/not-ready:NoSchedule taint. Existing Pods are not evicted immediately: the node controller also adds a NoExecute taint (not-ready or unreachable, depending on the node's state), and each Pod stays until its tolerationSeconds for that taint runs out.
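Pods that declare no tolerations of their own for these keys normally receive two automatically, added at admission time by the DefaultTolerationSeconds plugin. The fragment below sketches what kubectl get pod -o yaml typically shows on a cluster with default settings; the 300-second value can differ if the API server was started with other defaults.

```yaml
# Tolerations injected into a Pod that did not set its own for these keys.
tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300   # Pod is evicted 5 minutes after the taint appears
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300
```

This is why Pods on a failed node usually linger for about five minutes before they are evicted and recreated elsewhere.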
The tolerationSeconds Escape Hatch
When a Pod tolerates a NoExecute taint, you can specify how long it is allowed to stay before being evicted:
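tolerations:
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300 # 5 minutes grace period
This is how Kubernetes implements taint-based eviction: the Pod stays bound for the grace period and is evicted if the taint is still present when it expires. It is a separate mechanism from PodDisruptionBudgets, which limit voluntary disruptions such as kubectl drain. The default for the built-in not-ready and unreachable NoExecute taints is typically 300 seconds (5 minutes) if not overridden.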
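A workload that should fail over faster than the default can set its own, shorter window. The Deployment below is a sketch under assumptions: the name web-frontend, the nginx:1.27 image, and the 30-second value are all illustrative, not recommendations.

```yaml
# Hypothetical Deployment whose Pods leave a failed node after 30 seconds.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend              # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      tolerations:
        # Explicit entries replace the 300-second defaults for these keys
        - key: "node.kubernetes.io/unreachable"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 30   # illustrative value
        - key: "node.kubernetes.io/not-ready"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 30   # illustrative value
      containers:
        - name: web
          image: nginx:1.27
```

Shorter windows speed up recovery but also make Pods more likely to be rescheduled during brief network blips, so the value is a trade-off rather than a fixed best practice.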
DaemonSets and Taints
DaemonSets are designed to run one Pod per node. But control plane nodes carry the control-plane taint. If your DaemonSet does not include a toleration for this taint, it will silently skip control plane nodes.
# Example tolerations for a node-level agent that must also run on control plane and not-ready nodes
# (kube-proxy as deployed by kubeadm goes further and tolerates every taint with a bare operator: Exists)
tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
When creating a custom DaemonSet (e.g., for log collection or monitoring), always evaluate whether it needs to run on control plane nodes and add the appropriate tolerations. Source: CKA Day 12
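To show where such tolerations sit in a full manifest, here is a minimal sketch of a custom DaemonSet that also runs on control plane nodes. The name log-collector, the monitoring namespace, and the busybox placeholder container are hypothetical; a real agent would use its own image and configuration.

```yaml
# Hypothetical node-level agent that should run on every node,
# including tainted control plane nodes.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector             # hypothetical name
  namespace: monitoring           # assumed to exist already
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      tolerations:
        # Without this entry the DaemonSet skips control plane nodes
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Exists"
          effect: "NoSchedule"
      containers:
        - name: agent
          image: busybox:1.36     # placeholder image for illustration
          command: ["sh", "-c", "while true; do sleep 3600; done"]
```

After applying it, comparing the DaemonSet's desired Pod count with the node count from kubectl get nodes is a quick way to confirm that no node was skipped.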
Taints + Node Affinity: The Production Pattern
In production, taints/tolerations are rarely used alone. They are combined with nodeSelector or nodeAffinity to create dedicated node pools. See the dedicated Kubernetes Node Affinity page for the full deep-dive with YAML anatomy, operator reference, and troubleshooting.
- Label the node pool: kubectl label node -l gpu=true tier=ml
- Taint the node pool: kubectl taint node -l gpu=true gpu=true:NoSchedule
- Target workloads with affinity: Pod spec uses nodeAffinity to require GPU nodes (requiredDuringSchedulingIgnoredDuringExecution)
- Grant exemption with toleration: Pod spec tolerates the gpu taint
This two-part mechanism ensures that:
- ML workloads land only on GPU nodes (affinity attracts)
- Only ML workloads can land on GPU nodes (taints repel everything else)
Why both are needed: Taints alone cannot guarantee that a tolerated Pod lands on a specific node type — they merely grant permission to land on tainted nodes. Without affinity, the scheduler could place the tolerated Pod on any untainted node instead. Affinity actively pulls the Pod toward the desired pool. Source: CKA Day 15
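Put together, a Pod for the dedicated GPU pool carries both pieces. The manifest below is a sketch that assumes the tier=ml label and the gpu=true:NoSchedule taint from the steps above; the Pod name ml-training is hypothetical.

```yaml
# Pod that is both attracted to the GPU pool (affinity)
# and permitted to land there (toleration).
apiVersion: v1
kind: Pod
metadata:
  name: ml-training               # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: tier         # label applied to the node pool
                operator: In
                values: ["ml"]
  tolerations:
    - key: "gpu"                  # taint applied to the node pool
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: trainer
      image: tensorflow/tensorflow:latest-gpu
```

Removing the affinity block would let this Pod run on any untainted node as well; removing the toleration would leave it Pending, because the only nodes that satisfy the affinity are tainted.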
Troubleshooting Taint Mismatches
| Symptom | Likely Cause | Fix |
|---|---|---|
| Pod stuck Pending with 0/X nodes are available: X node(s) had taint ... | Pod lacks toleration for a tainted node | Add matching tolerations to Pod spec |
| DaemonSet Pod missing on control plane nodes | Missing control-plane taint toleration | Add toleration for node-role.kubernetes.io/control-plane |
| Pod evicted unexpectedly | Node acquired NoExecute taint (applied manually or by the node controller after a failure) | Add toleration or use tolerationSeconds to extend grace period |
| Workload scheduled on GPU node unexpectedly | Node is not tainted | Apply NoSchedule taint to reserve the node |
CKA Exam Speed Patterns
# Taint a node imperatively
kubectl taint node worker-1 gpu=true:NoSchedule
# Remove a taint
kubectl taint node worker-1 gpu=true:NoSchedule-
# Run a Pod with toleration (imperative override)
kubectl run debug --image=busybox --restart=Never \
--overrides='{"spec":{"tolerations":[{"key":"gpu","operator":"Equal","value":"true","effect":"NoSchedule"}]}}'
# Check why a Pod is Pending
kubectl describe pod <name> | grep -A 10 Events
YAML Memory Trick: The toleration struct mirrors the taint exactly: key, value, effect. Only the operator (Equal vs Exists) and the optional tolerationSeconds are extra. On the exam, write the taint string first, then copy the key-value-effect into the toleration block.
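As a worked instance of that trick, the fragment below maps the maintenance=true:NoExecute taint from the imperative examples onto a toleration field by field. The 600-second value is an illustrative assumption, not a default.

```yaml
# Taint string:  maintenance=true:NoExecute
# Pattern:       <key>=<value>:<effect>
tolerations:
  - key: "maintenance"       # left of "="
    operator: "Equal"        # extra field: require an exact value match
    value: "true"            # between "=" and ":"
    effect: "NoExecute"      # right of ":"
    tolerationSeconds: 600   # optional extra, valid only with NoExecute
```

Dropping tolerationSeconds would let the Pod stay on the tainted node indefinitely instead of for ten minutes.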
Related Pages
- Kubernetes Manual Scheduling: nodeName, nodeSelector, and nodeAffinity alongside taints
- Kubernetes Node Affinity: the advanced positive scheduling primitive; essential companion to taints for dedicated node pools
- Kubernetes DaemonSet: toleration patterns for node-level agents
- Kubernetes Architecture: kube-scheduler filtering and node controller taint application
- Pod Fundamentals: the object that carries tolerations
- Kubernetes Labels and Selectors: the positive scheduling counterpart
- Deployment, ReplicaSet & Replication Controller: controllers that respect taints during scheduling
- Kubernetes Static Pods: node-local Pod management, bypasses scheduler entirely
- Kubernetes Services: routing is unaffected by taints; only placement is constrained
- Kubernetes Namespaces: taints are cluster-wide node properties, not namespace-scoped
- CKA Certification: exam domains and weightings
- CKA Study Roadmap: Day 14 in the 40-day plan
- Tech Tutorials with Piyush: course source
Tags: kubernetes taints tolerations scheduling node-management cka devops kube-scheduler