Kafka Internals: Log Segments, Offsets & The Commit Protocol

A tour of how Kafka persists records — segment files, index files, high watermark, and the acks=all contract.

August 11, 2025 · 9 min read · Kafka · Architecture

Kafka is often described as a distributed commit log, but the phrase hides a lot of detail. Under the hood, every partition is a directory of append-only segment files paired with two index files. Understanding that layout is the fastest route to reasoning about throughput, retention, and failure modes.

1. Segments on disk

Each partition log is split into segments capped at a configured size (log.segment.bytes, default 1 GiB). The active segment is rolled, and becomes closed, once it hits that size limit or once the time limit log.roll.ms elapses (if unset, log.roll.hours applies, default 168). Closed segments are immutable, which is what lets Kafka push sequential I/O so hard.

/var/kafka/data/orders-0/
├── 00000000000000000000.log
├── 00000000000000000000.index
├── 00000000000000000000.timeindex
├── 00000000000004823104.log
├── 00000000000004823104.index
└── 00000000000004823104.timeindex

The filename is the base offset. The .index maps offset → physical byte position; the .timeindex maps timestamp → offset. Both are sparse (one entry every log.index.interval.bytes, default 4 KiB) — a consumer seeking a specific offset does a binary search in the index, then a short linear scan of the log.
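The lookup path described above can be sketched in a few lines of Python. This is a toy model, not Kafka's implementation: the function names, the fixed 1 KiB record size, and the use of absolute offsets in the index are all assumptions made for illustration.

```python
import bisect

def segment_base_offset(filename):
    """Parse the base offset from a segment file name like '00000000000004823104.log'."""
    return int(filename.split(".", 1)[0])

def find_segment(base_offsets, target):
    """Return the base offset of the segment holding `target` (list must be sorted)."""
    i = bisect.bisect_right(base_offsets, target) - 1
    if i < 0:
        raise ValueError(f"offset {target} is below the earliest segment")
    return base_offsets[i]

def lookup_position(index, target):
    """Binary-search a sparse (offset, byte_position) index for the last entry <= target."""
    offsets = [offset for offset, _ in index]
    i = bisect.bisect_right(offsets, target) - 1
    return index[i][1] if i >= 0 else 0

def scan_log(records, start_position, target):
    """Linear scan from `start_position` until the first record at or past `target`."""
    for offset, position in records:
        if position >= start_position and offset >= target:
            return offset, position
    return None

# Hypothetical segment: 12 records of 1 KiB each, one index entry every 4 KiB.
BASE = segment_base_offset("00000000000004823104.log")
RECORD_BYTES = 1024
records = [(BASE + n, n * RECORD_BYTES) for n in range(12)]
index = [(offset, pos) for offset, pos in records if pos % 4096 == 0]

start = lookup_position(index, 4823110)   # nearest indexed position at or before the target
found = scan_log(records, start, 4823110)  # short scan forward from there
```

With these toy numbers the index lookup lands on byte position 4096 and the scan reads only three records before reaching offset 4823110, which is the point of keeping the index sparse: a small index file, at the cost of a bounded scan.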

2. LEO, HW, and the ISR

Every replica tracks a Log End Offset (LEO). The leader tracks the minimum LEO across the in-sync replicas (ISR) and publishes that as the High Watermark (HW). Consumers can only read records below the HW. Because every ISR member already holds those records, anything returned to a consumer survives a leader failure, provided the new leader is elected from the ISR (that is, unclean leader election stays disabled).

Leader      LEO=120  HW=118
Follower A  LEO=120
Follower B  LEO=118  ← gates the HW
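The same numbers can be reproduced with a minimal sketch. The replica names and the high_watermark helper are hypothetical; the only rule encoded is the one stated above, HW = minimum LEO across the ISR.

```python
def high_watermark(leo_by_replica, isr):
    """HW is the smallest Log End Offset among the in-sync replicas."""
    return min(leo_by_replica[replica] for replica in isr)

leos = {"leader": 120, "follower-a": 120, "follower-b": 118}
isr = {"leader", "follower-a", "follower-b"}

hw = high_watermark(leos, isr)                                # 118: follower-b gates it
hw_after_shrink = high_watermark(leos, isr - {"follower-b"})  # 120 once follower-b leaves the ISR
```

The second call shows why a slow follower is eventually dropped from the ISR: once it is out, it no longer holds the HW back, at the price of one fewer replica guaranteeing the data.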

3. The acks contract

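acks=0: fire and forget. The producer does not wait for any response from the broker; a record counts as sent as soon as it is handed to the socket buffer.
acks=1: the leader has appended the record to its local log. You lose data if the leader dies before followers replicate it.
acks=all: the leader waits until every replica currently in the ISR has appended the record. Combined with min.insync.replicas=2 (typically on a topic with replication factor 3), this is the usual setting for durable pipelines.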
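To make the interaction between acks and min.insync.replicas concrete, here is a small decision function. It is a simplified model, not broker code: leader_response and its return strings are made up for illustration, and isr_size counts the leader itself.

```python
def leader_response(acks, isr_size, min_insync_replicas):
    """Model what the producer gets back for one write under each acks setting."""
    if acks == "0":
        return "no response expected"
    if acks == "1":
        return "ack after leader append"
    if acks == "all":
        # min.insync.replicas is only enforced for acks=all.
        if isr_size < min_insync_replicas:
            return "rejected: not enough in-sync replicas"
        return "ack after all ISR replicas append"
    raise ValueError(f"unknown acks value: {acks!r}")

# ISR has shrunk to the leader alone.
lonely_default = leader_response("all", isr_size=1, min_insync_replicas=1)
lonely_strict = leader_response("all", isr_size=1, min_insync_replicas=2)
```

With min.insync.replicas=1, acks=all still succeeds when the leader is the only in-sync replica, so a single copy is acknowledged as durable. Raising it to 2 turns that situation into a rejected write instead of a silent durability gap.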

4. Retention deletes whole segments, not records

When retention expires, Kafka deletes whole closed segments. That's why setting log.retention.ms to a small value but log.roll.ms (segment.ms at the topic level) to a huge one will appear to "not work": the segment simply hasn't rolled yet.
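The effect is easy to reproduce in a simplified model of the behaviour described above. The Segment class, should_roll, and deletable are hypothetical helpers, and timestamps are plain milliseconds; the real broker's bookkeeping is more involved.

```python
from dataclasses import dataclass

HOUR_MS = 3_600_000

@dataclass
class Segment:
    base_offset: int
    created_ms: int
    last_append_ms: int
    closed: bool = False

def should_roll(active, now_ms, size_bytes, segment_bytes, roll_ms):
    """Roll when the active segment hits the size cap or outlives the time limit."""
    return size_bytes >= segment_bytes or now_ms - active.created_ms >= roll_ms

def deletable(segments, now_ms, retention_ms):
    """Only closed segments whose newest record is past retention are candidates."""
    return [s for s in segments if s.closed and now_ms - s.last_append_ms > retention_ms]

# Low-traffic topic: 1 MB written, last append 10 minutes in, now 3 hours later.
seg = Segment(base_offset=0, created_ms=0, last_append_ms=10 * 60_000)
now = 3 * HOUR_MS

# One-hour retention, but a 7-day roll interval: nothing is eligible.
rolled_weekly = should_roll(seg, now, 1_000_000, 1 << 30, roll_ms=7 * 24 * HOUR_MS)
eligible_before = deletable([seg], now, retention_ms=HOUR_MS)

# Same topic with a 30-minute roll interval: the segment closes and can go.
rolled_half_hourly = should_roll(seg, now, 1_000_000, 1 << 30, roll_ms=30 * 60_000)
if rolled_half_hourly:
    seg.closed = True
eligible_after = deletable([seg], now, retention_ms=HOUR_MS)
```

In the first configuration the data is two hours past retention yet stays on disk, because the only segment is still active. Shortening the roll interval is what makes the one-hour retention observable.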

Rules of thumb

Keep segments small on low-traffic topics so retention kicks in on time.
On throughput-critical topics, keep segments large to minimise file-handle churn.
Always pair acks=all with min.insync.replicas. One without the other is a footgun.
