Kubernetes Security Contexts and Pod Security Standards

A Pod is a wrapper around one or more containers. By default, a container in Kubernetes runs with fairly permissive settings: it runs as root unless the image specifies another user, it can write to its own root filesystem, and it holds a default set of Linux capabilities that, under certain conditions, can help an attacker escape to the underlying node. Security Contexts and Pod Security Standards are the two mechanisms Kubernetes provides to tighten these defaults and enforce safe, predictable behaviour.

What Is a Security Context?

A Security Context is a set of Linux-level settings applied to a Pod or a specific container at runtime. These settings control things like which user the process runs as, whether the root filesystem is read-only, and which Linux capabilities the process holds.

Think of it like setting the rules before handing a guest the keys to your house: "You can use the living room and kitchen, but you cannot enter the server room or change the locks."

Where Security Contexts Live

┌────────────────────────────────────────────────┐
│                   Pod Spec                     │
│                                                │
│  spec:                                         │
│    securityContext:       ← Pod-level          │
│      runAsUser: 1000                           │
│      fsGroup: 2000                             │
│                                                │
│    containers:                                 │
│    - name: app                                 │
│      securityContext:     ← Container-level    │
│        allowPrivilegeEscalation: false         │
│        readOnlyRootFilesystem: true            │
└────────────────────────────────────────────────┘

Pod-level settings apply to all containers in the Pod. Container-level settings apply only to that specific container and override pod-level settings when both are set for the same field.
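
The override rule can be illustrated with a two-container Pod. This is a minimal sketch: the Pod name, the sidecar container, and both image names are placeholders chosen for illustration.

```yaml
# Sketch only: names and images are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: override-demo
spec:
  securityContext:
    runAsUser: 1000          # pod-level default for every container
  containers:
  - name: app
    image: myapp:1.0         # no container-level value, so it runs as UID 1000
  - name: sidecar
    image: mysidecar:1.0
    securityContext:
      runAsUser: 2000        # container-level value wins for this container only
```

Here the app container runs as UID 1000 and the sidecar as UID 2000. Fields that exist only at the Pod level, such as fsGroup, still apply to both.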

Pod-Level Security Context Fields

runAsUser and runAsGroup

These set the UID and GID the container process runs as. If your container image expects to run as root (UID 0), and you set runAsUser: 1000, the process starts as a non-root user instead.

spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000

Running as a non-root user is one of the most effective security improvements you can make. A vulnerability that gives an attacker code execution inside the container gives them far less power when the process runs as UID 1000 than when it runs as UID 0.

fsGroup

When a Pod mounts a volume that supports ownership management, fsGroup sets the group ownership of the files on that volume. Kubernetes also adds this GID as a supplementary group to every container process in the Pod, which lets multiple containers in the Pod share mounted files with the correct permissions.

spec:
  securityContext:
    fsGroup: 2000

runAsNonRoot

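Setting runAsNonRoot: true makes the kubelet refuse to start a container that would run as UID 0. This acts as a safety check: even if someone forgets to set runAsUser, the container fails to start rather than silently running as root. The kubelet can only verify numeric UIDs, so an image that declares its user by name should be paired with an explicit runAsUser.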

spec:
  securityContext:
    runAsNonRoot: true

seccompProfile

Seccomp (Secure Computing Mode) restricts which Linux system calls a container can make. The RuntimeDefault profile applies the container runtime's recommended syscall filter, blocking a large set of dangerous calls with no application configuration needed.

spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault

Container-Level Security Context Fields

allowPrivilegeEscalation

This setting controls whether a process can gain more privileges than its parent process. Always set it to false. If it is not set to false, a process inside the container could use setuid binaries or similar techniques to escalate to root even though the container started as a non-root user.

containers:
- name: app
  securityContext:
    allowPrivilegeEscalation: false

readOnlyRootFilesystem

Setting this to true mounts the container's root filesystem as read-only. The application cannot write to /etc, /usr, /bin, or any other system directory. If an attacker gains code execution, they cannot drop new binaries or modify system files.

Applications that need to write files should use explicitly mounted volumes (emptyDir or PVCs) rather than the root filesystem.

containers:
- name: app
  securityContext:
    readOnlyRootFilesystem: true
  volumeMounts:
  - mountPath: /tmp
    name: tmp-dir
volumes:
- name: tmp-dir
  emptyDir: {}

capabilities: drop and add

Linux capabilities divide the traditional all-or-nothing root privileges into smaller, individually grantable permissions. A container starts with a default set of capabilities. You should drop all of them and add back only what the application genuinely needs.

containers:
- name: app
  securityContext:
    capabilities:
      drop: ["ALL"]
      add: ["NET_BIND_SERVICE"]  # Only if app binds to port < 1024

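Dropping ALL capabilities removes everything in the container runtime's default set, which typically includes abilities such as opening raw network sockets (NET_RAW), changing file ownership (CHOWN), switching user IDs (SETUID), and bypassing file permission checks (DAC_OVERRIDE). Most web applications, APIs, and workers need zero capabilities when running as a non-root user on ports above 1024.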

privileged

A privileged container has almost all the same capabilities as the host root user. It can see and modify host devices, load kernel modules, and break out of container isolation. Never set privileged: true in production unless you are running a very specific low-level system tool.

A Hardened Container: Full Example

apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    volumeMounts:
    - mountPath: /tmp
      name: tmp-dir
    - mountPath: /app/cache
      name: cache-dir
  volumes:
  - name: tmp-dir
    emptyDir: {}
  - name: cache-dir
    emptyDir: {}

This Pod runs as UID 1000, cannot escalate privileges, has a read-only root filesystem, holds zero Linux capabilities, and uses the default seccomp profile. This is a solid baseline for any production workload.

Pod Security Standards: Cluster-Wide Enforcement

Security Contexts are powerful, but they rely on developers setting them correctly on every Pod. Pod Security Standards (PSS) define three security profiles, and the built-in Pod Security Admission controller lets cluster operators enforce one of them as a minimum baseline across an entire namespace, automatically, without trusting developers to remember.

Pod Security Admission replaced the older PodSecurityPolicy (PSP) feature, which was removed in Kubernetes 1.25.

The Three Security Profiles

┌─────────────────────────────────────────────────────────────┐
│              Pod Security Standard Profiles                 │
│                                                             │
│  PRIVILEGED          BASELINE             RESTRICTED        │
│  ───────────         ────────             ──────────        │
│  No restrictions     Blocks known         Heavily locked    │
│                      escalation vectors   down; enforces    │
│  Use for:            Use for:             best practices    │
│  System-level        General workloads    Use for:          │
│  tools (CNI,         that don't need      Internet-facing   │
│  storage drivers)    root or host access  apps, user data   │
└─────────────────────────────────────────────────────────────┘

Privileged Profile

No restrictions. Any Pod configuration is accepted. Use this only for infrastructure namespaces that run DaemonSets for networking, storage, or logging — components that genuinely need host-level access.
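
As a rough sketch, an infrastructure namespace could be labelled like this. The namespace name is a placeholder; the label key and value are the standard Pod Security Admission ones.

```yaml
# Sketch only: "infra-networking" is a hypothetical namespace name.
apiVersion: v1
kind: Namespace
metadata:
  name: infra-networking
  labels:
    # Accept any Pod configuration in this namespace.
    pod-security.kubernetes.io/enforce: privileged
```

Labelling the namespace explicitly, rather than leaving it unlabelled, documents that the exemption is deliberate.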

Baseline Profile

Blocks the most dangerous settings: privileged containers, host network/PID/IPC namespaces, hostPath volumes, host ports, and capabilities added beyond the default set. Most workloads pass this profile without any changes.
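
To make the difference between profiles concrete, here is a sketch of a Pod with no securityContext at all. The Pod name is hypothetical and the image is the placeholder used elsewhere in this lesson.

```yaml
# Sketch only: a Pod that sets no security fields.
apiVersion: v1
kind: Pod
metadata:
  name: plain-app            # hypothetical name
spec:
  containers:
  - name: app
    image: myapp:1.0
    ports:
    - containerPort: 8080    # a container port, not a hostPort
```

This Pod should pass Baseline, because it requests nothing privileged and touches nothing on the host. It would not pass Restricted, which also requires runAsNonRoot, allowPrivilegeEscalation: false, dropped capabilities, and a seccomp profile.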

Restricted Profile

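Enforces the full set of security best practices on top of Baseline. Pods must run as non-root, set allowPrivilegeEscalation: false, drop all capabilities (only NET_BIND_SERVICE may be added back), and set a seccomp profile of RuntimeDefault or Localhost. They cannot use host namespaces or hostPath volumes, and only a short list of volume types is allowed. This is the profile to target for any public-facing application.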

Applying Pod Security Standards to a Namespace

PSS works via labels on the namespace. You set the profile and the mode (enforce, audit, or warn).

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
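
The value latest means the checks follow whatever Kubernetes version the cluster is running, so an upgrade can tighten the rules. As a sketch, the version can instead be pinned to a specific minor release; v1.29 below is an arbitrary placeholder, not a recommendation.

```yaml
# Sketch only: the pinned version is an arbitrary example.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    # Evaluate Pods against the "restricted" rules as defined in this release.
    pod-security.kubernetes.io/enforce-version: v1.29
```

Pinning trades automatic adoption of newer checks for predictable behaviour across cluster upgrades.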

The Three Modes

enforce — Pods that violate the policy are rejected. They will not start.
audit — Violations are recorded in the audit log but the Pod still starts. Use this to discover violations before enforcing.
warn — A warning message is returned to the user when they apply a violating manifest. The Pod still starts. Useful as a gentle reminder during migration.

A recommended rollout strategy: start with warn and audit for a few weeks. Fix the workloads that trigger warnings. Then switch to enforce when the namespace is clean.

PSS Migration: From No Policy to Restricted

Stage 1: Discover
  Label namespace with audit + warn at "restricted"
  Watch audit logs and kubectl warnings for 1–2 weeks

Stage 2: Fix
  Update Deployments/StatefulSets to pass restricted checks
  Add securityContext fields, switch to non-root users

Stage 3: Enforce
  Add enforce label at "restricted"
  Non-compliant Pods are now blocked at admission
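
The stages above can be sketched as a single Namespace manifest that changes over time. The namespace name matches the earlier example; the commented-out line marks what Stage 3 would add.

```yaml
# Stage 1 (Discover): report violations without blocking anything.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
    # Stage 3 (Enforce): uncomment once the namespace is clean.
    # pod-security.kubernetes.io/enforce: restricted
```

Keeping the audit and warn labels in place after enforcing is harmless and keeps the intent visible in the manifest.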

Host Namespaces: What to Block and Why

Kubernetes Pods can optionally share namespaces with the host node. Each of these is a serious security risk:

hostNetwork: true — The Pod sees and can bind to the host's network interfaces. A compromised Pod can sniff all traffic on the node.
hostPID: true — The Pod can see all processes running on the host, including other Pods and system processes. It can send signals to them.
hostIPC: true — The Pod shares the host's inter-process communication namespace and can communicate with host processes.

The Baseline and Restricted PSS profiles block all three. Only the Privileged profile allows them, and they should appear only in DaemonSets for approved system tools.
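
For reference, all three settings are fields at the Pod spec level. The sketch below shows a Pod that a namespace enforcing Baseline or Restricted would reject; the Pod name is hypothetical.

```yaml
# Sketch only: an example of what NOT to run in application namespaces.
apiVersion: v1
kind: Pod
metadata:
  name: host-access-demo     # hypothetical name
spec:
  hostNetwork: true          # shares the node's network namespace
  hostPID: true              # can see every process on the node
  hostIPC: true              # shares the node's IPC namespace
  containers:
  - name: app
    image: myapp:1.0
```

All three fields default to false, so application manifests normally just leave them out.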

Volume Security: hostPath Is Dangerous

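A hostPath volume mounts a file or directory from the host node directly into the Pod. If the mount is writable and covers a sensitive path such as /etc, a compromised container, especially one running as root, can read or modify host configuration files. Both the Baseline and Restricted profiles forbid hostPath volumes; Restricted additionally limits Pods to a short allowlist of volume types such as configMap, secret, emptyDir, and persistentVolumeClaim.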

Use emptyDir, PersistentVolumeClaims, or ConfigMap/Secret mounts instead of hostPath for application workloads.
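
A minimal sketch of those alternatives follows. It assumes a ConfigMap named myapp-config and a PersistentVolumeClaim named myapp-data already exist; both names, the Pod name, and the mount paths are hypothetical.

```yaml
# Sketch only: ConfigMap and PVC names are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
  - name: app
    image: myapp:1.0
    volumeMounts:
    - name: app-config
      mountPath: /etc/myapp      # configuration comes from the ConfigMap
      readOnly: true
    - name: data
      mountPath: /var/lib/myapp  # persistent data lives on the PVC
  volumes:
  - name: app-config
    configMap:
      name: myapp-config
  - name: data
    persistentVolumeClaim:
      claimName: myapp-data
```

Neither volume exposes the node's filesystem, so a compromised container cannot reach host files through them.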

Checking Compliance Before Enforcing

# Dry-run: which existing Pods would violate "restricted" if it were enforced on this namespace?
kubectl label --dry-run=server --overwrite namespace production \
  pod-security.kubernetes.io/enforce=restricted

# Validate a manifest against the policy already applied to the namespace
kubectl apply --dry-run=server -f my-deployment.yaml \
  --namespace production

Key Points

A Security Context applies Linux runtime settings — user, capabilities, filesystem access — to a Pod or container.
Run containers as a non-root user, set allowPrivilegeEscalation: false, enable readOnlyRootFilesystem, and drop all capabilities for every production workload.
Pod Security Standards enforce a minimum security profile at the namespace level via labels.
The three PSS profiles are Privileged (no restrictions), Baseline (blocks known privilege escalations), and Restricted (enforces best practices).
Use warn and audit modes to discover violations before switching to enforce.
Avoid hostNetwork, hostPID, hostIPC, and hostPath in application workloads.
