Kafka Retention Policies

One of Kafka's most powerful features is that it stores data even after consumers have read it. But data cannot be stored forever — disk space is finite. Retention policies control how long Kafka keeps messages and how it cleans up old data. Understanding retention lets you design topics that match your data requirements — holding data long enough for replay, recovery, and downstream processing without wasting storage resources.

Why Kafka Keeps Data After Consumption

Traditional message queues typically delete a message once it has been consumed and acknowledged. Kafka deliberately keeps messages after consumption, for three reasons.

Multiple consumer groups can read the same topic independently. If Kafka deleted messages when one consumer read them, other consumers would never see those messages. Retention allows every consumer group to read at its own pace without interfering with others.

Consumers can replay historical data when needed — after fixing a bug, after adding a new consumer, or after a downstream system failure. Replay is only possible if the data still exists in Kafka. The longer you retain data, the longer your replay window.

Kafka functions as a buffer between producers and consumers. When a consumer goes down temporarily, messages accumulate in Kafka. When the consumer comes back up, it reads the accumulated messages from where it left off. This buffering requires data to persist during the consumer's downtime.

The Two Primary Retention Mechanisms

Kafka provides two main retention policies that work at the partition level. You can use one or both simultaneously — when both are configured, Kafka enforces whichever limit triggers first.

Time-Based Retention

Time-based retention keeps messages for a specified duration. Kafka enforces it on whole segments, not on individual messages: a segment becomes eligible for deletion only when its newest message is older than the retention threshold, which means every message in it has expired. Eligible segments are removed at the next log cleanup cycle.

The relevant configuration properties:

log.retention.hours (broker-level default): How many hours to keep messages. Default is 168 hours (7 days).

log.retention.minutes: Retention in minutes. If set, it takes precedence over log.retention.hours.

log.retention.ms: Retention in milliseconds. If set, it takes precedence over both log.retention.minutes and log.retention.hours.

retention.ms (topic-level override): Sets retention for a specific topic, overriding the broker default. Set to -1 for infinite retention.
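The precedence among these properties can be sketched as a small resolver. This is an illustrative model only: the function name and parameters are hypothetical, and it does not read any real broker configuration.

```python
from typing import Optional

def effective_retention_ms(hours: int = 168,
                           minutes: Optional[int] = None,
                           ms: Optional[int] = None,
                           topic_ms: Optional[int] = None) -> int:
    """Resolve the retention period in milliseconds.

    Order of precedence (highest first): topic-level retention.ms,
    log.retention.ms, log.retention.minutes, log.retention.hours.
    A result of -1 means infinite retention.
    """
    if topic_ms is not None:
        return topic_ms
    if ms is not None:
        return ms
    if minutes is not None:
        return minutes * 60 * 1000
    return hours * 60 * 60 * 1000

print(effective_retention_ms())                     # 604800000 (7 days)
print(effective_retention_ms(minutes=90))           # 5400000
print(effective_retention_ms(minutes=90, ms=1000))  # 1000
print(effective_retention_ms(topic_ms=-1))          # -1 (infinite)
```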

TIME-BASED RETENTION EXAMPLE:

Topic: sensor-readings  retention.ms = 86,400,000 (24 hours)

At 9:00 AM on Day 3:

Partition 0 segments:
  Segment A: messages from Day 1, 8:00 AM – Day 1, 11:00 PM  ← ALL > 24h old → DELETE
  Segment B: messages from Day 1, 11:00 PM – Day 2, 8:00 AM  ← ALL > 24h old → DELETE
  Segment C: messages from Day 2, 8:00 AM – Day 3, 8:00 AM   ← SOME within 24h → KEEP
  Segment D: messages from Day 3, 8:00 AM – present          ← ACTIVE segment  → KEEP

Segments A and B are deleted. Segments C and D remain.
Even if some messages in Segment C are older than 24h,
Kafka waits until the ENTIRE segment is expired before deleting.
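The segment-level rule can be sketched in a few lines of Python. This is a simplified model, not Kafka's implementation: the Segment class, its field names, and the timestamps are made up for illustration, and the active segment is treated as never deleted.

```python
from dataclasses import dataclass
from typing import List

HOUR_MS = 60 * 60 * 1000

@dataclass
class Segment:
    name: str
    newest_ts_ms: int      # timestamp of the newest message in the segment
    active: bool = False   # the segment currently being written to

def expired_segments(segments: List[Segment], now_ms: int, retention_ms: int) -> List[str]:
    """Return names of closed segments whose newest message is past retention."""
    if retention_ms < 0:   # -1 means infinite retention
        return []
    return [s.name for s in segments
            if not s.active and now_ms - s.newest_ts_ms > retention_ms]

now = 57 * HOUR_MS  # hypothetical "current time"
segments = [
    Segment("s1", newest_ts_ms=20 * HOUR_MS),               # newest message is 37h old
    Segment("s2", newest_ts_ms=30 * HOUR_MS),               # newest message is 27h old
    Segment("s3", newest_ts_ms=50 * HOUR_MS),               # newest message is 7h old
    Segment("s4", newest_ts_ms=57 * HOUR_MS, active=True),  # active segment
]
print(expired_segments(segments, now, retention_ms=24 * HOUR_MS))  # ['s1', 's2']
```

Segment s3 survives even if its oldest messages are well past 24 hours, because only the newest message's age decides the segment's fate.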

Size-Based Retention

Size-based retention limits the disk space each partition of a topic can occupy. When a partition grows past the limit, Kafka deletes whole segments, oldest first, but only while the excess over the limit is at least as large as the segment being removed. In other words, Kafka never deletes a segment if doing so would leave less than retention.bytes of data, so a partition can temporarily hold up to roughly the limit plus one segment.

log.retention.bytes (broker-level): Maximum bytes to retain per partition. Default is -1 (unlimited).

retention.bytes (topic-level): Per-partition size limit for this specific topic. Set to -1 for unlimited.

SIZE-BASED RETENTION EXAMPLE:

Topic: raw-logs  retention.bytes = 10,737,418,240 (10 GB per partition)

Partition 0 current segments:
  Segment 1: 2.1 GB  (oldest)
  Segment 2: 2.3 GB
  Segment 3: 2.5 GB
  Segment 4: 2.4 GB
  Segment 5: 2.1 GB (newest)
  Total: 11.4 GB ← 1.4 GB OVER 10 GB LIMIT

Action: none yet. Deleting Segment 1 (2.1 GB) would leave 9.3 GB,
  which is below the 10 GB limit, so Kafka keeps it.

After new messages add another 1 GB, the total is 12.4 GB (2.4 GB over):
  Delete Segment 1 (2.1 GB) → total 10.3 GB
  Segment 2 (2.3 GB) is larger than the remaining 0.3 GB excess → KEEP
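A minimal sketch of size-based cleanup, assuming the rule that the oldest segment is removed only while the excess over retention.bytes is at least that segment's size (my reading of the broker's behavior, not a copy of its code). The function name is hypothetical, and sizes are plain integers in MB for readability.

```python
from typing import List

def size_breached_segments(segment_sizes: List[int], retention_bytes: int) -> List[int]:
    """Return indices of segments to delete; segment_sizes is ordered oldest first.

    The last entry is treated as the active segment and is never deleted.
    """
    if retention_bytes < 0:              # -1 means no size limit
        return []
    excess = sum(segment_sizes) - retention_bytes
    deleted = []
    for index, size in enumerate(segment_sizes[:-1]):
        if excess - size < 0:            # deleting would drop below the limit
            break
        excess -= size
        deleted.append(index)
    return deleted

limit_mb = 10_000
print(size_breached_segments([2100, 2300, 2500, 2400, 2100], limit_mb))  # []
print(size_breached_segments([2100, 2300, 2500, 2400, 3100], limit_mb))  # [0]
```

In the first call the partition is 1,400 MB over the limit, but the oldest segment is 2,100 MB, so nothing is removed. In the second call the excess has grown to 2,400 MB and the oldest segment goes.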

Using Both Retention Policies Together

When both time-based and size-based retention are configured, Kafka applies whichever one triggers first. This is useful for controlling both disk growth and data freshness simultaneously.

COMBINED RETENTION POLICY:

Topic: app-logs
  retention.ms    = 259,200,000 (3 days)
  retention.bytes = 5,368,709,120 (5 GB per partition)

Scenario A: Logs flow slowly.
  After 3 days, total size is only 1 GB.
  Time-based retention triggers first → old segments deleted at 3-day mark.
  Size-based never triggers (never reaches 5 GB).

Scenario B: Logs spike heavily.
  After 1 day, partition reaches 5 GB.
  Size-based retention triggers first → oldest segments deleted.
  3-day time limit hasn't been reached yet, but size forced cleanup.

Whichever triggers first → cleanup happens.
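The two scenarios can be modeled with one function that deletes a segment when either rule applies. This is a hedged sketch: the tuple layout and function name are hypothetical, sizes are in MB, and the size rule assumes a segment is removed only when the excess over the limit covers its full size.

```python
from typing import List, Tuple

DAY_MS = 24 * 60 * 60 * 1000
HOUR_MS = 60 * 60 * 1000

def segments_to_delete(segments: List[Tuple[int, int]],
                       retention_ms: int,
                       retention_bytes: int) -> List[int]:
    """segments: (age of newest message in ms, size), oldest first; last is active.

    Walk from the oldest segment and stop at the first one that neither
    rule marks for deletion.
    """
    total = sum(size for _, size in segments)
    deleted = []
    for index, (age_ms, size) in enumerate(segments[:-1]):
        too_old = retention_ms >= 0 and age_ms > retention_ms
        too_big = retention_bytes >= 0 and total - retention_bytes >= size
        if not (too_old or too_big):
            break
        deleted.append(index)
        total -= size
    return deleted

# Scenario A: slow traffic, time limit triggers (3 days, 5000 MB)
slow = [(4 * DAY_MS, 300), (2 * DAY_MS, 400), (0, 300)]
print(segments_to_delete(slow, 3 * DAY_MS, 5000))   # [0]

# Scenario B: traffic spike, size limit triggers within a day
spike = [(20 * HOUR_MS, 2000), (12 * HOUR_MS, 2000), (6 * HOUR_MS, 2000), (0, 1500)]
print(segments_to_delete(spike, 3 * DAY_MS, 5000))  # [0]
```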

Log Compaction: An Alternative to Deletion

The two retention policies above both involve deleting old segments. Kafka offers a fundamentally different approach for certain use cases: log compaction. Instead of deleting by time or size, compaction retains only the most recent message for each unique key, no matter how old it is.

Log compaction is designed for topics where you care about current state rather than complete history. Examples include user profile data (keep current profile), device configuration (keep current settings), and entity state (keep current order status).

How Log Compaction Works

Kafka runs a background log cleaner thread that scans the log and identifies messages whose key appears again in a later message. These older messages are redundant because the newer message supersedes them. The cleaner removes the redundant messages and rewrites the affected segments.

LOG COMPACTION PROCESS:

BEFORE COMPACTION:
offset: 0  key: user1  value: {name: Alice, status: inactive}
offset: 1  key: user2  value: {name: Bob, status: active}
offset: 2  key: user3  value: {name: Carol, status: active}
offset: 3  key: user1  value: {name: Alice, status: active}   ← newer user1
offset: 4  key: user2  value: {name: Bob, status: suspended}  ← newer user2
offset: 5  key: user1  value: {name: Alice, status: premium}  ← newest user1

AFTER COMPACTION:
offset: 2  key: user3  value: {name: Carol, status: active}   ← kept (only version)
offset: 4  key: user2  value: {name: Bob, status: suspended}  ← kept (latest version)
offset: 5  key: user1  value: {name: Alice, status: premium}  ← kept (latest version)

Old offsets 0, 1, 3 removed (superseded by newer messages for same key).
Offsets 2, 4, 5 remain. Offset numbers don't change — gaps appear in offset sequence.

Note that compaction is not immediate. The active (newest) log segment is never compacted; compaction only applies to older, closed segments. The broker-level log.cleaner.min.cleanable.ratio property (topic-level: min.cleanable.dirty.ratio, default 0.5) controls how dirty a log must be before compaction runs, where the dirty ratio is the share of the log's bytes that have not yet been compacted.
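The effect of compaction on the example log can be reproduced with a short sketch. It models only the outcome (latest record per key, original offsets preserved), not the cleaner's segment handling; the compact function is a hypothetical helper.

```python
from typing import Any, List, Tuple

Record = Tuple[int, str, Any]  # (offset, key, value)

def compact(log: List[Record]) -> List[Record]:
    """Keep only the latest record for each key, preserving offsets and order."""
    latest_offset = {}
    for offset, key, _ in log:
        latest_offset[key] = offset   # later records overwrite earlier ones
    return [rec for rec in log if latest_offset[rec[1]] == rec[0]]

log = [
    (0, "user1", {"name": "Alice", "status": "inactive"}),
    (1, "user2", {"name": "Bob", "status": "active"}),
    (2, "user3", {"name": "Carol", "status": "active"}),
    (3, "user1", {"name": "Alice", "status": "active"}),
    (4, "user2", {"name": "Bob", "status": "suspended"}),
    (5, "user1", {"name": "Alice", "status": "premium"}),
]
print([offset for offset, _, _ in compact(log)])  # [2, 4, 5]
```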

Tombstones in Compacted Topics

To delete a key from a compacted topic permanently, producers send a message with that key and a null value. This null-value message is called a tombstone. After compaction runs and the tombstone's retention period expires, the key and its tombstone are both removed.

DELETING A KEY WITH TOMBSTONE:

offset: 5  key: user1  value: {name: Alice, status: premium}  ← current
offset: 6  key: user1  value: null  ← TOMBSTONE: delete user1

After compaction, the tombstone is the only record for user1.
After tombstone retention expires (log.cleaner.delete.retention.ms), 
the tombstone itself is removed. user1 is completely gone.
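Tombstone handling can be added to the same kind of sketch. As a simplification, this version measures a tombstone's age from its own timestamp, which is an assumption made for illustration; the function name and record layout are hypothetical.

```python
from typing import Any, List, Tuple

Record = Tuple[int, str, Any, int]  # (offset, key, value, timestamp_ms)

def compact_with_tombstones(log: List[Record], now_ms: int,
                            delete_retention_ms: int) -> List[Record]:
    """Keep the latest record per key; drop tombstones older than delete retention."""
    latest_offset = {}
    for offset, key, _, _ in log:
        latest_offset[key] = offset
    kept = []
    for offset, key, value, ts in log:
        if latest_offset[key] != offset:
            continue                      # superseded by a newer record
        if value is None and now_ms - ts > delete_retention_ms:
            continue                      # expired tombstone: key disappears
        kept.append((offset, key, value, ts))
    return kept

DAY_MS = 24 * 60 * 60 * 1000
log = [
    (5, "user1", {"name": "Alice", "status": "premium"}, 1_000),
    (6, "user1", None, 2_000),            # tombstone for user1
    (7, "user2", {"name": "Bob"}, 3_000),
]
print([r[0] for r in compact_with_tombstones(log, 3_000, DAY_MS)])           # [6, 7]
print([r[0] for r in compact_with_tombstones(log, 2_000 + DAY_MS + 1, DAY_MS)])  # [7]
```

The window between the two calls is what gives slow consumers a chance to see the tombstone before the key vanishes.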

Configuring Retention at the Topic Level

Broker-level retention settings are defaults that apply to all topics. Individual topics can override these defaults with topic-level configuration. This allows different topics to have different retention policies based on their data characteristics.

# Create a topic with custom retention:
bin/kafka-topics.sh \
  --create \
  --topic audit-logs \
  --partitions 6 \
  --replication-factor 3 \
  --config retention.ms=31536000000 \
  --config retention.bytes=-1 \
  --bootstrap-server localhost:9092
  
# 31536000000 ms = 365 days = 1 year retention, unlimited size

# Modify retention on existing topic:
bin/kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --entity-type topics \
  --entity-name sensor-readings \
  --alter \
  --add-config retention.ms=3600000
  
# 3600000 ms = 1 hour retention (for high-volume, short-lived sensor data)

# Check current topic configuration:
bin/kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --entity-type topics \
  --entity-name sensor-readings \
  --describe
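Millisecond values like the ones above are easy to mistype. A small helper, shown here as a convenience sketch with a hypothetical name, converts a human-readable duration into the value to pass to retention.ms.

```python
from datetime import timedelta

def to_retention_ms(**duration) -> int:
    """Convert timedelta keyword arguments (days=, hours=, ...) to milliseconds."""
    return timedelta(**duration) // timedelta(milliseconds=1)

print(to_retention_ms(days=365))      # 31536000000
print(to_retention_ms(hours=1))       # 3600000
print(to_retention_ms(days=3))        # 259200000
print(f"--config retention.ms={to_retention_ms(days=365)}")
```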

Retention Policy Recommendations by Use Case

Event Sourcing and Audit Logs

These topics need long or infinite retention because they are the system of record. Every event matters and must be replayable indefinitely.

Topic: financial-transactions
  retention.ms = -1  (infinite: keep forever)
  retention.bytes = -1  (unlimited size, so size never forces deletion)
  cleanup.policy = delete  (no compaction, every event is preserved)
  
Topic: audit-trail
  retention.ms = 94608000000  (3 years)
  retention.bytes = -1  (unlimited size)

Real-Time Analytics

Analytics topics often process streaming data and don't need long history. A few hours to a few days is sufficient.

Topic: page-views
  retention.ms = 86400000  (24 hours)
  retention.bytes = 10737418240  (10 GB per partition)
  
Topic: user-sessions
  retention.ms = 7200000  (2 hours — sessions don't last longer)

Database Change Events (CDC)

Change Data Capture topics work well with compaction — each record's key is the database row's primary key, and only the latest version of each row matters.

Topic: db.customers (CDC from customer table)
  cleanup.policy = compact  (keep latest per customer ID)
  min.cleanable.dirty.ratio = 0.1  (compact aggressively)
  delete.retention.ms = 86400000  (keep tombstones 24h)

Log Aggregation

Application logs are high volume and only useful for a short diagnostic window. Short retention plus size limits prevent disk exhaustion.

Topic: app-logs
  retention.ms = 259200000   (3 days)
  retention.bytes = 5368709120  (5 GB per partition)
  segment.bytes = 536870912   (512 MB segments for faster cleanup)

The Log Cleanup Process

Kafka runs a background cleanup thread that periodically scans partition segments and removes those that exceed retention limits. The cleanup does not happen instantly when a retention limit is breached — it runs on a schedule controlled by log.retention.check.interval.ms (default 5 minutes).

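This means you might briefly exceed your retention limits between cleanup runs. This is normal and expected. In addition, size-based cleanup only removes the oldest segment once the excess over retention.bytes is at least that segment's size, so a partition can sit above the limit by up to about one segment. Budget your disk accordingly: leave headroom above your retention.bytes limit for both the segment granularity and the cleanup lag.

CLEANUP CYCLE TIMELINE:

t=0:   Partition size = 10.9 GB (retention.bytes = 10 GB, oldest segment = 1.2 GB)
       Excess 0.9 GB < 1.2 GB → nothing eligible yet
t=1m:  New messages arrive → size = 11.3 GB  ← excess 1.3 GB, oldest segment now eligible
t=5m:  Log cleanup thread runs → deletes oldest segment (1.2 GB)
       Size returns to 10.1 GB

Peak overshoot: about 1.3 GB above the limit, lasting up to about 5 minutes after the segment became eligible.
Disk planning: provision retention.bytes plus at least one full segment per partition, plus margin for data that arrives between cleanup runs.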
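For capacity planning, a rough upper-bound estimate can be computed per partition. This is a back-of-the-envelope sketch, not an official formula: it assumes a partition may exceed retention.bytes by up to one full segment before that segment becomes deletable, plus whatever arrives during one cleanup interval. The function and parameter names are hypothetical.

```python
def peak_partition_bytes(retention_bytes: int,
                         segment_bytes: int,
                         ingest_bytes_per_sec: int,
                         check_interval_ms: int = 300_000) -> int:
    """Estimate the worst-case on-disk size of one partition replica."""
    arrived_during_lag = ingest_bytes_per_sec * check_interval_ms // 1000
    return retention_bytes + segment_bytes + arrived_during_lag

GIB = 1024 ** 3
MIB = 1024 ** 2
peak = peak_partition_bytes(retention_bytes=10 * GIB,
                            segment_bytes=1 * GIB,
                            ingest_bytes_per_sec=1 * MIB)
print(peak)                      # 12125732864
print(round(peak / GIB, 2))      # 11.29
```

Multiply the result by the number of partition replicas hosted on a broker to size its disks, and add margin on top.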

Segment Configuration and Its Impact on Retention

Segment size affects how quickly Kafka can apply retention. Because Kafka deletes whole segments, a topic with a 1 GB maximum segment size can only free data in chunks of up to 1 GB per partition. If you want finer-grained cleanup, reduce segment size.

log.segment.bytes (topic-level: segment.bytes): Maximum size of one log segment (default 1 GB). Smaller values mean finer cleanup granularity but more files on disk.

log.roll.ms / log.roll.hours (topic-level: segment.ms): Maximum time before Kafka rolls a new segment (default 168 hours, i.e. 7 days). Even if the segment isn't full, a new segment starts after this time. Shorter values make time-based retention more precise.

SEGMENT SIZE IMPACT ON RETENTION SPEED:

Topic: events  retention.ms = 3600000 (1 hour)

With segment.ms = 3600000 (1 hour segments):
  Each segment contains at most 1 hour of data.
  Once the segment is closed and its newest message is 1 hour old, the whole segment ages out cleanly.
  Retention is fairly precise: data disappears roughly 1 to 2 hours after arrival.

With segment.ms = 604800000 (7-day segments, default):
  On a low-volume topic, each segment can span up to 7 days of data.
  A segment can only be deleted when ALL its messages are >1 hour old.
  With 7-day segments, the oldest messages can live for up to ~7 days + 1 hour.
  
For short retention periods, reduce segment.ms (broker default: log.roll.ms) to match.
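The interaction between segment.ms and retention.ms can be turned into a rough worst-case estimate. This sketch assumes segments roll on time rather than on size, and that a segment is deleted at the first cleanup check after its newest message expires; the function name is hypothetical.

```python
def max_message_lifetime_ms(retention_ms: int,
                            segment_ms: int,
                            check_interval_ms: int = 300_000) -> int:
    """Upper bound on how long the first message in a segment can stay on disk.

    It waits up to segment_ms for the segment to roll, then retention_ms
    for the segment's newest message to expire, then up to one check interval.
    """
    return segment_ms + retention_ms + check_interval_ms

HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS

print(max_message_lifetime_ms(HOUR_MS, HOUR_MS) / HOUR_MS)     # about 2.08 hours
print(max_message_lifetime_ms(HOUR_MS, 7 * DAY_MS) / DAY_MS)   # about 7.05 days
```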

Key Points

  • Kafka retains messages after consumption so multiple consumers, replay, and buffering are all possible.
  • Time-based retention (retention.ms) deletes segments older than the configured duration. Default is 7 days.
  • Size-based retention (retention.bytes) deletes oldest segments when a partition exceeds the configured size. Default is unlimited.
  • Both retention types can be active simultaneously — the first limit reached triggers cleanup.
  • Log compaction keeps only the latest message per key instead of deleting by age or size. Used for current-state topics like CDC and user preferences.
  • Topic-level configuration overrides broker defaults, allowing different topics to have tailored retention policies.
  • Reduce segment.ms (broker-level: log.roll.ms) to match your retention period for time-sensitive cleanup. Large segments delay retention effectiveness.
