How Kubernetes Architecture Works

Kubernetes has a clear structure. Every piece has a specific job, and they all work together to keep your applications running. Understanding the architecture helps you troubleshoot problems, design systems, and talk confidently with other engineers.

The Two Sides of a Kubernetes Cluster

A Kubernetes cluster has two types of machines: control plane machines and worker nodes. The control plane is the brain. Worker nodes are the muscles. The brain decides what to do, and the muscles do the actual work.

┌──────────────────────────────────────────────────────────┐
│                    KUBERNETES CLUSTER                    │
│                                                          │
│  ┌─────────────────────┐    ┌────────┐  ┌────────┐       │
│  │    CONTROL PLANE    │    │ Node 1 │  │ Node 2 │       │
│  │  ┌───────────────┐  │    │ [Pod]  │  │ [Pod]  │       │
│  │  │  API Server   │  │◄──►│ [Pod]  │  │ [Pod]  │       │
│  │  │  Scheduler    │  │    └────────┘  └────────┘       │
│  │  │  Controller   │  │                                 │
│  │  │  etcd         │  │    ┌────────┐                   │
│  │  └───────────────┘  │    │ Node 3 │                   │
│  └─────────────────────┘    │ [Pod]  │                   │
│                             └────────┘                   │
└──────────────────────────────────────────────────────────┘

The Control Plane Components

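The control plane usually runs on a dedicated set of machines. In most installations these machines are marked (tainted) so that your application workloads are not scheduled on them, and they only manage the cluster. Small single-machine clusters used for learning are the exception: there, one machine does both jobs.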

API Server

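The API Server is the front door of Kubernetes. Every request, whether from you, from other components, or from automated tools, goes through the API Server. When you type a kubectl command, kubectl sends it to the API Server as an HTTPS request. The API Server authenticates the caller, checks that the caller is authorized, validates the object, and only then writes the change to etcd. It is also the only component that talks to etcd directly, so nothing in the cluster changes without passing through it.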
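The gatekeeping role described above can be sketched in a few lines of Python. This is a toy model, not the real API Server: the `ToyApiServer` class, the `permissions` table, and the "a Pod needs an image" rule are all assumptions made for illustration. The status strings only loosely mirror HTTP codes.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    user: str
    verb: str   # e.g. "create" or "get"
    kind: str   # e.g. "Pod"
    name: str
    spec: dict = field(default_factory=dict)


class ToyApiServer:
    """Hypothetical stand-in for the API Server, for illustration only."""

    def __init__(self, permissions):
        # Assumed model: user name -> set of verbs that user may perform.
        self.permissions = permissions
        self.store = {}  # plays the role of etcd

    def handle(self, req):
        # 1. Authentication: is this a known user?
        if req.user not in self.permissions:
            return "401 Unauthorized"
        # 2. Authorization: may this user perform this verb?
        if req.verb not in self.permissions[req.user]:
            return "403 Forbidden"
        # 3. Validation: reject an obviously incomplete object.
        if req.verb == "create" and "image" not in req.spec:
            return "422 Invalid"
        # 4. Persistence: only now is the desired state written.
        if req.verb == "create":
            self.store[(req.kind, req.name)] = req.spec
            return "201 Created"
        if req.verb == "get":
            found = (req.kind, req.name) in self.store
            return "200 OK" if found else "404 Not Found"
        return "405 Method Not Allowed"


server = ToyApiServer({"alice": {"create", "get"}, "bob": {"get"}})
results = [
    server.handle(Request("mallory", "get", "Pod", "web")),
    server.handle(Request("bob", "create", "Pod", "web", {"image": "nginx"})),
    server.handle(Request("alice", "create", "Pod", "web", {})),
    server.handle(Request("alice", "create", "Pod", "web", {"image": "nginx"})),
    server.handle(Request("bob", "get", "Pod", "web")),
]
print(results)
```

The point of the ordering is that the store is touched only in step 4, after every check has passed.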

etcd

etcd is the cluster's memory. It is a distributed key-value store that holds the state of everything in the cluster: which Pods exist and where they run, what configurations exist, what secrets are stored. If etcd is lost without a backup, the cluster loses its memory, so production teams take etcd snapshots on a schedule, just like database backups.

Scheduler

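The Scheduler decides which worker node gets each new Pod. It first filters out nodes that cannot run the Pod, for example because their unreserved CPU or memory is smaller than what the Pod requests, then scores the remaining nodes and picks the best fit. Note that it works from the resources Pods have requested, not from live usage. Think of it as a hotel booking system: it finds the best room for each guest based on availability and preferences.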
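A minimal sketch of this "filter, then score" idea, assuming a made-up scoring rule (prefer the node that keeps the largest share of its capacity free after placement). The real Scheduler uses many more filters and scoring plugins. The `Node` fields and the `pick_node` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_m: int        # total CPU in millicores
    mem_mi: int       # total memory in MiB
    free_cpu_m: int   # CPU not yet requested by other Pods
    free_mem_mi: int  # memory not yet requested by other Pods


def pick_node(nodes, cpu_m, mem_mi):
    """Return the name of the chosen node, or None if no node fits."""
    # Filter: drop nodes that cannot satisfy the Pod's requests.
    feasible = [n for n in nodes
                if n.free_cpu_m >= cpu_m and n.free_mem_mi >= mem_mi]
    if not feasible:
        return None  # in a real cluster the Pod would stay Pending

    # Score (assumed rule): the scarcer of the two resources after placement,
    # expressed as a fraction of the node's capacity. Higher is better.
    def score(n):
        return min((n.free_cpu_m - cpu_m) / n.cpu_m,
                   (n.free_mem_mi - mem_mi) / n.mem_mi)

    return max(feasible, key=score).name


nodes = [
    Node("node-1", 4000, 8192, free_cpu_m=500, free_mem_mi=4096),
    Node("node-2", 4000, 8192, free_cpu_m=2000, free_mem_mi=2048),
    Node("node-3", 4000, 8192, free_cpu_m=3000, free_mem_mi=6144),
]
print(pick_node(nodes, cpu_m=1000, mem_mi=1024))  # node-3
print(pick_node(nodes, cpu_m=5000, mem_mi=1024))  # None
```

Here node-1 is filtered out for lack of CPU, and node-3 beats node-2 because it keeps more headroom.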

Controller Manager

The Controller Manager runs a set of loops that watch the cluster and fix problems. If you ask for 3 replicas of an app and one crashes, the Controller Manager notices the gap and creates a replacement. It compares the desired state (what you want) with the actual state (what is running) and closes the difference.
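The compare-and-correct loop can be illustrated with a small sketch. It models a single pass of one controller over an in-memory list of Pod names. The `reconcile` function and the `web-N` naming scheme are assumptions for the example, not how the real controllers are written.

```python
import itertools


def reconcile(desired_replicas, running_pods, new_name):
    """One pass of a control loop: return the Pod list after closing the gap."""
    pods = list(running_pods)
    # Too few Pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(new_name())
    # Too many Pods: remove the surplus.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods


counter = itertools.count(1)


def make_name():
    # Hypothetical naming scheme: web-1, web-2, ...
    return f"web-{next(counter)}"


pods = reconcile(3, [], make_name)
print(pods)  # ['web-1', 'web-2', 'web-3']

pods.remove("web-2")  # simulate one Pod being lost
pods = reconcile(3, pods, make_name)
print(pods)  # ['web-1', 'web-3', 'web-4']
```

The loop never asks why the count is wrong. It only looks at the difference between desired and actual, which is what makes the same logic work after a crash, a scale-up, or a scale-down.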

Worker Node Components

Worker nodes are the machines that actually run your applications. Each node has three key components.

kubelet

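kubelet is an agent that runs on every worker node. It watches the API Server for Pods assigned to its node and makes sure the containers in those Pods are actually running. If a container crashes, kubelet restarts it according to the Pod's restart policy. A container that keeps crashing is restarted with an increasing delay between attempts, which shows up as the CrashLoopBackOff status.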
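When a container crashes repeatedly, kubelet spaces out the restarts instead of retrying in a tight loop. The sketch below computes such a delay schedule, assuming the commonly documented defaults of a 10-second starting delay that doubles each time and is capped at 300 seconds. Treat these numbers as assumptions, since they can vary by version and configuration. `restart_delays` is a hypothetical helper.

```python
def restart_delays(restarts, base=10, cap=300):
    """Delay in seconds before each successive restart (exponential back-off)."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]


print(restart_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```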

kube-proxy

kube-proxy handles Service networking on each node. It maintains network rules (such as iptables rules) so that traffic sent to a Service reaches one of the Pods behind it, whether the request comes from inside the cluster or from the internet. It is the traffic cop at each intersection.
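Conceptually, the rules kube-proxy maintains amount to a lookup table from a Service address to the Pods behind it. The sketch below models that table in Python and picks a backend at random, which is roughly how the iptables mode behaves. The addresses and the `route` function are made up for illustration, and real kube-proxy programs kernel rules rather than handling packets itself.

```python
import random

# Hypothetical table: Service address -> addresses of the Pods behind it.
services = {
    "10.96.0.10:80": ["10.244.1.5:8080", "10.244.2.7:8080", "10.244.3.2:8080"],
}


def route(dest, table, rng=random):
    """Pick a backend Pod for traffic sent to a Service address."""
    backends = table.get(dest)
    if not backends:
        return None  # unknown Service, or no Pods behind it
    return rng.choice(backends)


print(route("10.96.0.10:80", services))   # one of the three Pod addresses
print(route("10.96.0.99:80", services))   # None
```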

Container Runtime

The container runtime is the software that actually runs containers. Kubernetes supports several runtimes — the most common is containerd. Docker uses containerd under the hood as well. The runtime pulls the container image and starts the process.

Worker Node
┌──────────────────────────────────┐
│  kubelet   ←  API Server         │
│  kube-proxy → routes traffic     │
│  containerd → runs containers    │
│                                  │
│  ┌──────┐  ┌──────┐  ┌──────┐    │
│  │ Pod  │  │ Pod  │  │ Pod  │    │
│  └──────┘  └──────┘  └──────┘    │
└──────────────────────────────────┘

How a Request Flows Through the Architecture

Here is what happens when you deploy a new application:

1. You run kubectl apply -f app.yaml on your laptop.
2. kubectl sends the request to the API Server.
3. The API Server validates your request and saves the desired state in etcd.
4. If app.yaml describes a Deployment, the Controller Manager sees it and creates the Pod objects needed to reach the requested replica count. These Pods start out without a node.
5. The Scheduler notices that a new Pod needs placement and picks a worker node.
6. The kubelet on the chosen node receives the Pod spec and tells the container runtime to pull the image and start the container.
7. Your app is now running. The Controller Manager keeps checking that the right number of Pods stays running.

You → kubectl → API Server → etcd (save state)
                     ↓
                 Scheduler (pick node)
                     ↓
               kubelet on Node (run Pod)
                     ↓
            Container starts, app is live
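The flow above can be traced with a toy simulation. Everything in it is a simplification under stated assumptions: `store` stands in for etcd, the three functions stand in for the API Server, Scheduler, and kubelet, and "least-loaded node" is an invented placement rule. None of these names come from Kubernetes itself.

```python
store = {}   # stands in for etcd
events = []  # human-readable trace of what happened


def api_apply(name, image):
    """kubectl apply -> API Server: validate, then save the desired state."""
    if not image:
        raise ValueError("a Pod needs an image")
    store[name] = {"image": image, "node": None, "phase": "Pending"}
    events.append(f"api-server: stored {name}")


def scheduler_pass(nodes):
    """Scheduler: bind every Pod that has no node to the least-loaded node."""
    for name, pod in store.items():
        if pod["node"] is None:
            load = {n: sum(p["node"] == n for p in store.values()) for n in nodes}
            pod["node"] = min(nodes, key=load.get)
            events.append(f"scheduler: bound {name} to {pod['node']}")


def kubelet_pass(node):
    """kubelet: start every Pending Pod that was bound to this node."""
    for name, pod in store.items():
        if pod["node"] == node and pod["phase"] == "Pending":
            # The real kubelet would ask the container runtime to pull and start.
            pod["phase"] = "Running"
            events.append(f"kubelet[{node}]: started {name}")


api_apply("web", "nginx")
scheduler_pass(["node-1", "node-2"])
kubelet_pass("node-2")  # nothing to do: the Pod was bound to node-1
kubelet_pass("node-1")
print("\n".join(events))
```

Note that the components never call each other. Each one only reads and updates the shared state, which mirrors how the real components coordinate through the API Server.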

Desired State vs. Actual State

Kubernetes works on a principle called desired state management. You tell Kubernetes what you want — "run 3 copies of this app." Kubernetes stores that as the desired state. It constantly watches the actual state. When actual state drifts from desired state, Kubernetes corrects it automatically.

This is like a thermostat. You set it to 22°C. If the room drops to 19°C, the heater turns on. You do not press a button — the system fixes itself.

Single Control Plane vs. High Availability Setup

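In a learning or development environment, one control plane machine is fine. In production, teams typically run three or five control plane machines so that if one fails, the cluster keeps working. The number is odd because etcd needs a majority of its members to agree on every write: three members tolerate one failure, five tolerate two. In the common "stacked" layout, an etcd member runs on each control plane machine and the members stay in sync. This setup is called a high-availability (HA) cluster.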
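The reason three and five are the usual sizes comes down to majority arithmetic, which a short calculation makes concrete. The function names below are hypothetical. The formulas are the standard majority-quorum rule that etcd relies on.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority remains."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"{n} members: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Going from three members to four raises the quorum but not the number of tolerated failures, which is why even-sized clusters add cost without adding resilience.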

Key Points

The control plane is the brain. It manages the cluster and normally does not run your apps.
Worker nodes are the muscles — they run your containers inside Pods.
The API Server is the single entry point for all cluster communication.
etcd stores the entire cluster state — back it up regularly.
Kubernetes constantly compares desired state with actual state and fixes any gaps automatically.
