Docker Image Layers: How Caching Works

Every time you build a Docker image, Docker looks for shortcuts. If it ran an instruction before and nothing has changed, it reuses the result from last time instead of running it again. This is called the build cache. Understanding it can turn a 3-minute build into a 5-second one.

What a Layer Actually Is

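Instructions that change the filesystem, mainly RUN, COPY, and ADD, each create a new layer on top of the layers that FROM brings in from the base image. WORKDIR can also add a near-empty layer when it has to create the directory, while instructions such as CMD or ENV only record metadata. A layer is a compressed archive of what changed on the filesystem at that step.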

Dockerfile Instructions → Image Layers

FROM python:3.11-slim        → Layer 0 (base, ~125 MB)
WORKDIR /app                 → Layer 1 (directory created, ~0 KB)
COPY requirements.txt .      → Layer 2 (one file added, ~0.5 KB)
RUN pip install -r req...    → Layer 3 (packages installed, ~20 MB)
COPY . .                     → Layer 4 (your app files, ~50 KB)

Total image = sum of all layers

Layers are stacked like pancakes. Each layer only stores the difference from the layer below it — not the entire filesystem.
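To make the "difference from the layer below" idea concrete, here is a minimal Python sketch. It is a toy model, not how Docker stores data: the flatten function, the file paths, and the flask dependency are all made up for illustration.

```python
# Toy model: each layer is a dict of {path: content} holding only what changed.
# None marks a deletion (real images record deletions with "whiteout" entries).

def flatten(layers):
    """Apply layers bottom to top and return the merged filesystem view."""
    fs = {}
    for diff in layers:
        for path, content in diff.items():
            if content is None:
                fs.pop(path, None)   # file removed in this layer
            else:
                fs[path] = content   # file added or overwritten
    return fs

# Hypothetical layer contents, loosely following the Dockerfile above.
layers = [
    {"/usr/bin/python3": "interpreter"},           # base image
    {"/app/requirements.txt": "flask"},            # COPY requirements.txt .
    {"/usr/lib/site-packages/flask": "package"},   # RUN pip install
    {"/app/app.py": "print('hi')"},                # COPY . .
]

merged = flatten(layers)
print(sorted(merged))
```

Each dict holds only the files that step touched, yet the merged view contains all of them, which is the same trick a container uses to see one complete filesystem.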

How the Build Cache Works

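Docker derives a cache key for each step from the layer beneath it, the instruction text, and the instruction's inputs. For COPY and ADD, the inputs are the contents of the files being copied; for RUN, only the command string is compared, not what the command would download or produce. Before running any instruction, Docker checks: "Have I seen this exact instruction with these exact inputs before?" If yes, it uses the cached layer. If no, it runs the instruction and caches the result.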
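The lookup can be sketched in a few lines of Python. This is a simplified model and not Docker's actual key computation: cache_key, build_keys, and steps_for are hypothetical helpers, and the file contents are invented. The point is only that each key depends on the key before it.

```python
import hashlib

def cache_key(parent_key, instruction, input_files=()):
    """Hash the parent key, the instruction text, and the contents of copied files."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    for name, content in sorted(input_files):
        h.update(name.encode())
        h.update(content)
    return h.hexdigest()

def build_keys(steps):
    """Return one key per step; each key depends on every step before it."""
    keys, parent = [], ""
    for instruction, files in steps:
        parent = cache_key(parent, instruction, files)
        keys.append(parent)
    return keys

def steps_for(app_source):
    """Hypothetical five-step build; only app.py varies between calls."""
    reqs = [("requirements.txt", b"flask\n")]
    return [
        ("FROM python:3.11-slim", []),
        ("WORKDIR /app", []),
        ("COPY requirements.txt .", reqs),
        ("RUN pip install -r requirements.txt", []),
        ("COPY . .", reqs + [("app.py", app_source)]),
    ]

first = build_keys(steps_for(b"print('v1')\n"))
second = build_keys(steps_for(b"print('v2')\n"))
for step, (old, new) in enumerate(zip(first, second), start=1):
    print(step, "CACHED" if old == new else "RUN")
```

With only app.py edited, the first four keys match and the fifth differs, so only the final step would run.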

First build (no cache):
Step 1/5: FROM python:3.11-slim        [DOWNLOAD] 12 seconds
Step 2/5: WORKDIR /app                 [RUN]       0.1 seconds
Step 3/5: COPY requirements.txt .      [RUN]       0.1 seconds
Step 4/5: RUN pip install -r req...    [RUN]       45 seconds
Step 5/5: COPY . .                     [RUN]       0.1 seconds
Total: ~57 seconds

Second build (you only changed app.py, nothing else):
Step 1/5: FROM python:3.11-slim        [CACHED]    0.0 seconds
Step 2/5: WORKDIR /app                 [CACHED]    0.0 seconds
Step 3/5: COPY requirements.txt .      [CACHED]    0.0 seconds ← req unchanged
Step 4/5: RUN pip install -r req...    [CACHED]    0.0 seconds ← input unchanged
Step 5/5: COPY . .                     [RUN]       0.1 seconds ← app.py changed
Total: ~0.1 seconds

Cache Invalidation — When Cache Breaks

The cache breaks at the first instruction whose input changes. Every instruction after that point also reruns — even if those later instructions are unchanged.

BAD Dockerfile (cache-unfriendly order):

FROM python:3.11-slim
WORKDIR /app
COPY . .                           ← copies EVERYTHING including app.py
RUN pip install -r requirements.txt ← always reruns when any file changes!

You change app.py → COPY . . invalidates → pip install reruns → 45 seconds wasted

GOOD Dockerfile (cache-friendly order):

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .            ← only copies one file
RUN pip install -r requirements.txt ← reruns ONLY if requirements.txt changes
COPY . .                           ← copies everything else after pip install

You change app.py → only COPY . . reruns → 0.1 seconds
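The difference between the two orderings can be estimated with a small simulation. This is a rough sketch under stated assumptions: rebuild_seconds is a hypothetical helper, the step timings are the illustrative figures used earlier, and real builds vary.

```python
def rebuild_seconds(steps, changed_files):
    """Sum the cost of every step from the first cache miss to the end.

    steps: list of (instruction, files_it_reads, seconds) tuples.
    changed_files: set of file names edited since the last build.
    """
    total, cache_broken = 0.0, False
    for instruction, reads, seconds in steps:
        if not cache_broken and changed_files & reads:
            cache_broken = True          # first miss: everything below reruns
        if cache_broken:
            total += seconds
    return total

ALL_FILES = {"app.py", "requirements.txt"}

bad_order = [
    ("COPY . .", ALL_FILES, 0.1),
    ("RUN pip install -r requirements.txt", set(), 45.0),
]
good_order = [
    ("COPY requirements.txt .", {"requirements.txt"}, 0.1),
    ("RUN pip install -r requirements.txt", set(), 45.0),
    ("COPY . .", ALL_FILES, 0.1),
]

print("bad order: ", rebuild_seconds(bad_order, {"app.py"}))
print("good order:", rebuild_seconds(good_order, {"app.py"}))
```

Passing {"requirements.txt"} instead shows that the good ordering still pays the full install cost when dependencies really do change, which is the behaviour you want.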

The Golden Rule of Dockerfile Layer Order

┌─────────────────────────────────────────────────────┐
│ Put instructions that CHANGE RARELY at the TOP      │
│ Put instructions that CHANGE OFTEN at the BOTTOM    │
└─────────────────────────────────────────────────────┘

Changes rarely:
  - Base image (FROM)
  - System package installs (apt-get install)
  - Dependency file copies + installs (pip install, npm install)

Changes often:
  - Your application source code
  - Configuration files you edit frequently

Viewing Image Layers

Inspect the layers of any image:

docker history my-python-app:1.0

IMAGE          CREATED        CREATED BY                          SIZE
a1b2c3d4e5f6   2 minutes ago  CMD ["python" "app.py"]             0B
<missing>      2 minutes ago  COPY . . # buildkit                 4.2kB
<missing>      3 minutes ago  RUN pip install -r requirements…    21.3MB
<missing>      3 minutes ago  COPY requirements.txt .             312B
<missing>      3 minutes ago  WORKDIR /app                        0B
<missing>      7 days ago     /bin/sh -c #(nop) CMD ["python3"]   0B
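To check that the listed layers add up, the SIZE column can be totalled with a short Python sketch. The parse_size helper is hypothetical, and it assumes the decimal units (kB, MB, GB) that appear in the sample output. The sizes are typed in by hand here; running docker history with --format "{{.Size}}" should print just that column if you want to feed in real data.

```python
UNITS = {"B": 1, "kB": 1000, "MB": 1000**2, "GB": 1000**3}

def parse_size(text):
    """Convert a human-readable size such as '21.3MB' into bytes."""
    for suffix in ("kB", "MB", "GB", "B"):   # multi-letter suffixes before plain 'B'
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognised size: {text!r}")

# SIZE column copied from the sample output above, newest layer first.
sizes = ["0B", "4.2kB", "21.3MB", "312B", "0B", "0B"]
total_bytes = sum(parse_size(s) for s in sizes)
print(f"{total_bytes / 1000**2:.1f} MB")
```

The total covers only the rows shown, so it reflects what this Dockerfile added on top of the base image rather than the full image size.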

Using .dockerignore to Protect Cache

When Docker runs COPY . ., it checksums every file being copied to decide whether the cached layer is still valid. If any of those files changed, even a .log file or a __pycache__ folder, the cache breaks for that instruction and for every instruction below it. A .dockerignore file keeps such files out of the build context, so they are never copied and cannot trigger unnecessary cache misses.

.dockerignore file:
__pycache__
*.pyc
.git
.env
*.log
tests/
README.md
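The effect of these rules on the cache can be sketched in Python. This only approximates Docker's matching: is_ignored and context_digest are hypothetical helpers, patterns are compared one path segment at a time, and features such as ** wildcards and ! exceptions are left out.

```python
import fnmatch
import hashlib

PATTERNS = ["__pycache__", "*.pyc", ".git", ".env", "*.log", "tests/", "README.md"]

def is_ignored(path, patterns):
    """True if a pattern matches the path or one of its parent directories."""
    path_parts = path.split("/")
    for pattern in patterns:
        pat_parts = pattern.strip("/").split("/")
        if len(pat_parts) > len(path_parts):
            continue
        # Compare segment by segment so "*" never crosses a "/".
        if all(fnmatch.fnmatchcase(seg, pat) for seg, pat in zip(path_parts, pat_parts)):
            return True
    return False

def context_digest(files, patterns):
    """Hash only the files that survive the ignore rules."""
    h = hashlib.sha256()
    for path in sorted(files):
        if not is_ignored(path, patterns):
            h.update(path.encode())
            h.update(files[path])
    return h.hexdigest()

# Hypothetical project: only the log file differs between the two snapshots.
before = {"app.py": b"print('hi')\n", "debug.log": b"old\n", "tests/test_app.py": b"pass\n"}
after = {**before, "debug.log": b"old\nnew line\n"}

print(context_digest(before, PATTERNS) == context_digest(after, PATTERNS))  # ignored: same digest
print(context_digest(before, []) == context_digest(after, []))              # not ignored: digest differs
```

With the ignore rules in place, the growing log file no longer changes what COPY . . sees, so the cached layer stays valid.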

Sharing Layers Between Images

When multiple images share the same base layers, Docker stores those layers once on disk and shares them. Ten images all based on python:3.11-slim do not each store 125 MB; they all point to the same shared layer data. This is why the first docker pull of a new image is slow but subsequent pulls of related images are much faster: layers that already exist locally are not downloaded again.
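The saving is easy to see with some quick arithmetic. Below is a minimal sketch, assuming two hypothetical services built on the same base; the layer names and the 20 MB and 35 MB dependency sizes are invented for illustration.

```python
def disk_usage(images, layer_sizes):
    """Bytes on disk when each distinct layer is stored once."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(layer_sizes[layer] for layer in unique_layers)

MB = 1000**2
layer_sizes = {"base": 125 * MB, "deps-a": 20 * MB, "deps-b": 35 * MB}
images = {
    "service-a": ["base", "deps-a"],
    "service-b": ["base", "deps-b"],
}

shared = disk_usage(images, layer_sizes)
naive = sum(layer_sizes[name] for layers in images.values() for name in layers)
print(f"shared storage: {shared // MB} MB, without sharing: {naive // MB} MB")
```

The 125 MB base is counted once in the shared figure and twice in the naive one, and the gap widens with every additional image built on the same base.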
