Kubernetes Node Affinity and Taints: Controlling Pod Placement
The Kubernetes Scheduler automatically picks a node for each Pod based on available resources. Sometimes you need more control — you want GPU-intensive Pods on nodes that have GPUs, or you want to keep production workloads off shared testing nodes. Node Affinity and Taints give you precise control over where Pods land.
Three Mechanisms for Placement Control
| Mechanism | Direction | Effect |
|---|---|---|
| nodeSelector | Pod → Node (simple) | Pod requires a specific node label |
| Node Affinity | Pod → Node (advanced) | Pod prefers or requires certain node labels |
| Taints and Tolerations | Node → Pod | Node repels Pods unless they tolerate the taint |
nodeSelector: The Simple Approach
nodeSelector is the easiest way to pin Pods to specific nodes. Label a node, then tell your Pod to only run on nodes with that label.
    # Label a node
    kubectl label node gpu-node-1 hardware=gpu

    # In the Pod spec
    spec:
      nodeSelector:
        hardware: gpu
      containers:
      - name: ml-job
        image: tensorflow/tensorflow:latest
The Pod only schedules on nodes with the label hardware=gpu. If no such node exists or all matching nodes are full, the Pod stays Pending.
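A nodeSelector may also list several labels, in which case a node must carry all of them. The manifest below is a minimal sketch of that case; the Pod name and the second label (disk-type=ssd) are assumptions chosen for illustration.

```yaml
# Sketch only: the Pod name and the disk-type label are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: ml-job
spec:
  # Every key/value pair listed here must be present on the node
  nodeSelector:
    hardware: gpu
    disk-type: ssd
  containers:
  - name: ml-job
    image: tensorflow/tensorflow:latest
```

A node labeled only hardware=gpu would not qualify here, so the Pod would stay Pending until a node with both labels has room.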
Node Affinity: More Expressive Rules
Node Affinity extends nodeSelector with operators such as In, NotIn, Exists, DoesNotExist, Gt (greater than), and Lt (less than), and with two modes: required and preferred.
Required Affinity (Hard Rule)
The Pod must schedule on a matching node. If no matching node exists, the Pod waits in Pending status.
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: zone
                operator: In
                values:
                - us-east-1a
                - us-east-1b
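This Pod schedules only on nodes labeled zone=us-east-1a or zone=us-east-1b. The IgnoredDuringExecution part of the field name means the rule is checked at scheduling time only: a Pod that is already running stays where it is if the node's labels change later.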
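The other operators follow the same pattern. The fragment below is a sketch that uses Exists, NotIn, and Gt across two terms; the label keys (gpu-model, environment, cpu-cores) are hypothetical. Expressions inside one matchExpressions list must all match, while separate entries under nodeSelectorTerms are alternatives.

```yaml
# Sketch only: label keys and values are assumptions for illustration.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Term 1: both expressions must hold
        - matchExpressions:
          # Node must carry the label, whatever its value
          - key: gpu-model
            operator: Exists
          # Node must not be labeled environment=testing
          - key: environment
            operator: NotIn
            values:
            - testing
        # Term 2: an alternative; a node matching either term qualifies
        - matchExpressions:
          # Label value, read as an integer, must be greater than 8
          - key: cpu-cores
            operator: Gt
            values:
            - "8"
```

Gt and Lt take exactly one value, written as a string that parses as an integer.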
Preferred Affinity (Soft Rule)
Kubernetes tries to place the Pod on a matching node but schedules it elsewhere if no match is available. You assign a weight (1–100) to indicate preference strength.
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: disk-type
                operator: In
                values:
                - ssd
          - weight: 20
            preference:
              matchExpressions:
              - key: disk-type
                operator: In
                values:
                - hdd
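For each candidate node, the Scheduler adds the weight of every preference the node matches to that node's score. SSD nodes (weight 80) are therefore favored over HDD nodes (weight 20), and nodes with neither label remain eligible as a last resort.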
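Both modes can appear in the same Pod spec: the required rule filters the candidate nodes, and the preferred rule ranks whatever is left. The sketch below reuses the zone and disk-type labels from the examples above; the container name and image are placeholders.

```yaml
# Sketch only: container name and image are hypothetical.
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only nodes in these two zones are considered
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1a
            - us-east-1b
      # Soft rule: among those nodes, favor the ones with SSDs
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
          - key: disk-type
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: my-app
```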
Taints and Tolerations: Node Repels Pods
A taint is applied to a node. It says: "No regular Pods may run here." A toleration is applied to a Pod. It says: "I accept this node's taint — schedule me here anyway."
Think of a taint like a "No Unauthorized Personnel" sign on a server room door. A toleration is the access badge that lets authorized staff (specific Pods) enter.
    # Apply a taint to a node
    kubectl taint node gpu-node-1 dedicated=gpu-only:NoSchedule
    # Effect: NoSchedule means no new Pods without a matching toleration will schedule here
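For orientation, the taint is stored on the Node object itself. The excerpt below approximates what kubectl get node gpu-node-1 -o yaml would show for it; it is trimmed to the relevant fields, and a real node carries many more.

```yaml
# Trimmed excerpt of a Node object, for illustration only.
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1
spec:
  taints:
  - key: dedicated
    value: gpu-only
    effect: NoSchedule
# To remove the taint again, repeat the command with a trailing minus:
#   kubectl taint node gpu-node-1 dedicated=gpu-only:NoSchedule-
```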
Taint Effects
| Effect | Behavior |
|---|---|
| NoSchedule | New Pods without a toleration cannot schedule on this node |
| PreferNoSchedule | Scheduler avoids this node for Pods without toleration but will use it if necessary |
| NoExecute | Existing Pods without toleration are evicted; new ones cannot schedule |
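
With NoExecute, a toleration can also limit how long a Pod may stay once the taint appears, using tolerationSeconds. The fragment below is a sketch; the taint key and value (maintenance=planned) and the container details are hypothetical.

```yaml
# Sketch only: taint key/value, container name, and image are assumptions.
spec:
  tolerations:
  - key: "maintenance"
    operator: "Equal"
    value: "planned"
    effect: "NoExecute"
    # Keep running for up to 300 seconds after the taint is added, then be evicted
    tolerationSeconds: 300
  containers:
  - name: batch-worker
    image: my-batch-app
```

Without tolerationSeconds, a Pod that tolerates a NoExecute taint stays on the node indefinitely.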
Adding a Toleration to a Pod
    spec:
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "gpu-only"
        effect: "NoSchedule"
      containers:
      - name: ml-workload
        image: my-ml-app
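This Pod tolerates the dedicated=gpu-only:NoSchedule taint, so it can schedule on gpu-node-1 even though Pods without the toleration cannot. A toleration only permits placement; it does not steer the Pod toward the tainted node.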
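A toleration can also match on the key alone by using the Exists operator, which is handy when several taint values share one key. A minimal sketch, reusing the names from the example above:

```yaml
spec:
  tolerations:
  # Matches any NoSchedule taint whose key is "dedicated", regardless of its value
  - key: "dedicated"
    operator: "Exists"
    effect: "NoSchedule"
  containers:
  - name: ml-workload
    image: my-ml-app
```

With Exists, the value field is left out. Leaving out effect as well would make the toleration match every effect for that key.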
Combining Taints and Node Affinity
Use taints to repel unwanted Pods from specialized nodes, and use Node Affinity to attract the right Pods to those nodes. Together, they ensure that only the intended workloads run on dedicated hardware.
    GPU Node Setup:
      Taint: dedicated=gpu:NoSchedule         ← Repels regular Pods
      Label: hardware=gpu                     ← Identifies the node

    ML Job Pod Setup:
      Toleration: dedicated=gpu:NoSchedule    ← Can schedule on GPU node
      NodeAffinity: hardware=gpu (required)   ← Must schedule on GPU node
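Written out as a manifest, the Pod side of this setup could look like the sketch below. The Pod name and image are placeholders, and the node is assumed to have been tainted and labeled with the commands shown in the leading comments.

```yaml
# Assumed node preparation (run once per GPU node):
#   kubectl taint node gpu-node-1 dedicated=gpu:NoSchedule
#   kubectl label node gpu-node-1 hardware=gpu
apiVersion: v1
kind: Pod
metadata:
  name: ml-job              # hypothetical Pod name
spec:
  # Lets the Pod past the taint on the GPU node
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  # Forces the Pod onto nodes labeled hardware=gpu
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - gpu
  containers:
  - name: ml-job
    image: my-ml-app        # placeholder image
```

Dropping the toleration would leave the Pod Pending, because the only nodes it is allowed to use would repel it. Dropping the affinity would let it land on any untainted node.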
Pod Affinity and Anti-Affinity
Pod Affinity and Anti-Affinity control placement relative to other Pods — not nodes.
Pod Affinity
Schedule this Pod on the same node (or zone) as Pods with a certain label. Use this when low latency between two services matters:
    affinity:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: cache
          topologyKey: kubernetes.io/hostname
This Pod must run on the same node as Pods labeled app=cache.
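Co-location at the zone level works the same way with a different topologyKey. The sketch below also switches to the preferred (soft) form, which wraps the selector in a podAffinityTerm and adds a weight; it assumes nodes carry the well-known topology.kubernetes.io/zone label.

```yaml
# Sketch only: assumes nodes are labeled with topology.kubernetes.io/zone.
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: cache
        # Same zone as a cache Pod is enough; the node may differ
        topologyKey: topology.kubernetes.io/zone
```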
Pod Anti-Affinity
Spread Pods across different nodes for high availability. If all replicas of your app land on the same node and that node fails, everything goes down. Anti-affinity prevents that:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web-frontend
          topologyKey: kubernetes.io/hostname
No two Pods labeled app=web-frontend run on the same node. Each replica is on a different node, so any single node failure takes down at most one replica.
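A required rule can leave replicas Pending when there are more replicas than eligible nodes. One possible alternative is the preferred form, sketched here inside a full Deployment; the replica count, container name, and image are assumptions for illustration.

```yaml
# Sketch only: replica count, container name, and image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: spread across nodes when possible, share a node if not
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web-frontend
              topologyKey: kubernetes.io/hostname
      containers:
      - name: web
        image: nginx
```

The trade-off is that the soft form no longer guarantees one replica per node; it only makes that outcome more likely.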
Key Points
- nodeSelector is the simplest placement control — a Pod requires specific node labels.
- Node Affinity offers required (hard) and preferred (soft) rules with flexible operators.
- Taints repel Pods from nodes. Tolerations let specific Pods override that repulsion.
- Use taints + node affinity together for dedicated workloads like GPU jobs or database nodes.
- Pod Anti-Affinity spreads replicas across nodes to avoid single-node failure taking down all instances.
