Kubernetes StatefulSets: Running Databases

Deployments work great for stateless applications — web servers, APIs, and microservices where every Pod is identical and interchangeable. Databases and other stateful workloads need something different. Each instance needs a stable identity, stable storage, and a predictable startup order. StatefulSets provide all of this.

Why Databases Need Different Treatment

Imagine a cluster of three database instances. Each one has a specific role: one is the primary that accepts writes, and two are replicas that serve only reads. If Kubernetes replaced any one of them with a Pod that had a different name and different storage, replication would break: the other members could no longer find it under its old name, and it would come back without its data.

StatefulSet: mysql
Pod 0: mysql-0 → /data/mysql-0  (Primary, stable name)
Pod 1: mysql-1 → /data/mysql-1  (Replica 1, stable name)
Pod 2: mysql-2 → /data/mysql-2  (Replica 2, stable name)

If mysql-1 crashes → restarted as mysql-1 again
                    → mounts /data/mysql-1 again
                    → same identity restored

What StatefulSets Guarantee

A StatefulSet gives each Pod three stable properties that regular Deployments do not:

  • Stable network identity: Each Pod gets a predictable, permanent name built from the StatefulSet name and an ordinal (mysql-0, mysql-1, and so on), plus a matching DNS name through a Headless Service.
  • Stable storage: Each Pod gets its own PersistentVolumeClaim (PVC). Storage is not shared among Pods.
  • Ordered deployment and scaling: Pods start in order (0, 1, 2, ...) and terminate in reverse order (2, 1, 0).

Writing a StatefulSet YAML

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 20Gi

The key difference is volumeClaimTemplates — Kubernetes automatically creates one PVC per Pod. mysql-data-mysql-0, mysql-data-mysql-1, and mysql-data-mysql-2 are three separate disks, one for each Pod.
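
As a rough illustration, the claim generated for the first Pod would look something like the sketch below. It is assembled from the volumeClaimTemplates entry above rather than copied from a live cluster, so metadata that the controller adds on its own is omitted, and the standard storage class is assumed to exist in your cluster.

```yaml
# Sketch of the PVC the StatefulSet controller creates for mysql-0.
# Name pattern: <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-mysql-0
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard   # assumed to exist in the cluster
  resources:
    requests:
      storage: 20Gi
```

Because the claim is a separate object from the Pod, deleting or rescheduling mysql-0 leaves mysql-data-mysql-0 in place, and the replacement Pod binds to it again.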

The Headless Service Requirement

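A StatefulSet relies on a special kind of Service called a Headless Service, referenced by its serviceName field, to give its Pods their DNS names. A regular Service has a single ClusterIP that load-balances traffic across Pods. A Headless Service has no ClusterIP: DNS resolves directly to the Pod IPs, so clients can connect to a specific Pod by its stable DNS name.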

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None       # This makes it headless
  selector:
    app: mysql
  ports:
  - port: 3306

With a Headless Service named mysql in the default namespace, each Pod gets its own DNS record:

mysql-0.mysql.default.svc.cluster.local
mysql-1.mysql.default.svc.cluster.local
mysql-2.mysql.default.svc.cluster.local

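Your application can connect to a specific instance, such as mysql-0 when that Pod is the primary, and to replicas by their stable names, no matter how many times they restart. The names stay the same even though the Pod IPs behind them change. Kubernetes itself does not assign database roles; which Pod acts as primary depends on how replication is configured.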
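
One way to hand these names to an application is through a ConfigMap. The sketch below is only an illustration: the ConfigMap name app-db-config and the keys DB_WRITE_HOST and DB_READ_HOSTS are hypothetical, and it assumes replication has been set up so that mysql-0 is the primary.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-db-config        # hypothetical name
data:
  # Writes go to the Pod acting as primary (assumed to be mysql-0).
  DB_WRITE_HOST: mysql-0.mysql.default.svc.cluster.local
  # Reads can be spread across the replicas by the application.
  DB_READ_HOSTS: mysql-1.mysql.default.svc.cluster.local,mysql-2.mysql.default.svc.cluster.local
```

For clients running in the same namespace, the short form mysql-0.mysql resolves as well.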

Ordered Startup and Shutdown

By default, Kubernetes starts StatefulSet Pods sequentially. It starts Pod 0 and waits for it to be Running and Ready before starting Pod 1, then waits again before starting Pod 2. This ordering matters for databases that require the primary to initialize before replicas join.

mysql-0 starts → Ready
     ↓
mysql-1 starts → Ready
     ↓
mysql-2 starts → Ready

Scale down: mysql-2 deleted first, then mysql-1, then mysql-0
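
Because each step waits on readiness, the ordering is only as meaningful as the Pod's readiness check. The fragment below is a sketch of how this could be expressed in the StatefulSet from earlier. The podManagementPolicy value OrderedReady is the default and is written out only for clarity. The TCP readiness probe and its timing values are assumptions for illustration, not a tuned configuration.

```yaml
# Fragment of the mysql StatefulSet spec (other fields unchanged).
spec:
  podManagementPolicy: OrderedReady   # default; "Parallel" drops the ordering
  template:
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        readinessProbe:
          tcpSocket:
            port: 3306                # Ready once MySQL accepts TCP connections
          initialDelaySeconds: 10     # illustrative values
          periodSeconds: 5
```

A TCP check only shows that the server is listening; it says nothing about replication health, so a real setup may need a stricter probe.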

StatefulSet vs. Deployment: The Key Differences

Feature         Deployment                StatefulSet
Pod identity    Randomly generated name   Stable ordered name (pod-0, pod-1)
Storage         Shared or ephemeral       Dedicated PVC per Pod
Startup order   All at once               Sequential (0 → 1 → 2)
Shutdown order  Any order                 Reverse sequential (2 → 1 → 0)
DNS             Load-balanced Service IP  Per-Pod stable DNS name
Use for         Web apps, APIs, workers   Databases, message queues, ZooKeeper

Common StatefulSet Use Cases

  • Relational databases: MySQL, PostgreSQL with primary-replica setup
  • NoSQL databases: MongoDB replica sets, Cassandra rings
  • Message queues: Kafka brokers, RabbitMQ clusters
  • Coordination services: ZooKeeper, etcd clusters
  • Search engines: Elasticsearch nodes

Updating a StatefulSet

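StatefulSets support rolling updates, but they update Pods in reverse ordinal order: Pod 2 first, then Pod 1, then Pod 0. Each Pod must become Ready before the controller moves on to the next one, so the primary (Pod 0) stays on the old version longest. The rollout does not pause on its own, though. To verify the new version on replicas before it reaches the primary, set a partition on the rolling update strategy so that lower-ordinal Pods are held back.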

kubectl set image statefulset/mysql mysql=mysql:8.1
kubectl rollout status statefulset/mysql
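
If the goal is to check the new version on replicas before it reaches mysql-0, one option is a partitioned rolling update. The fragment below is a sketch, assuming the three-replica mysql StatefulSet above: with partition set to 1, only Pods whose ordinal is 1 or higher receive the new Pod template, and mysql-0 keeps the old one.

```yaml
# Fragment of the mysql StatefulSet spec (other fields unchanged).
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1   # update mysql-2 and mysql-1; hold mysql-0 back
```

Once the replicas look healthy, lowering partition to 0 lets the rollout continue to mysql-0.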

When Not to Use StatefulSets

StatefulSets add complexity. If your application writes data to an external managed database (like Amazon RDS or Google Cloud SQL) and the app itself is stateless, use a regular Deployment. StatefulSets are for running the database engine itself inside Kubernetes.
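
For contrast, a stateless application that talks to a managed database might be deployed as sketched below. The image example.com/my-app:1.0, the DB_HOST variable, and the database hostname are placeholders invented for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                          # hypothetical application
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: example.com/my-app:1.0   # placeholder image
        env:
        - name: DB_HOST                 # hypothetical variable name
          value: db.example.com         # placeholder for the managed database endpoint
```

No volumeClaimTemplates or Headless Service is needed here, because the Pods hold no data of their own.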

Key Points

  • StatefulSets give each Pod a stable name, stable storage, and ordered lifecycle — things Deployments cannot provide.
  • Each Pod in a StatefulSet gets its own dedicated PVC via volumeClaimTemplates.
  • A Headless Service (clusterIP: None) creates per-Pod DNS names for direct connections.
  • Pods start in order (0, 1, 2) and shut down in reverse (2, 1, 0).
  • Use StatefulSets for databases, message queues, and coordination services. Use Deployments for everything stateless.
