Exactly-Once Semantics in Kafka: Idempotence & Transactions
How producer IDs, sequence numbers, and the transaction coordinator combine to give you exactly-once — and when they don't.
"Exactly-once" in Kafka isn't magic — it's a careful stack of three mechanisms: idempotent producers, transactions, and read-committed consumers. Miss any one of them and you fall back to at-least-once.
1. Idempotent producer
Enabled with enable.idempotence=true. The broker assigns a producer ID (PID), and the client attaches a monotonically increasing sequence number to what it sends, tracked per partition. On retry, the broker recognises the duplicate (PID, seq), acknowledges it, and does not append it to the log a second time.
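import java.util.Properties;
Properties p = new Properties();
p.put("bootstrap.servers", "broker:9092");
p.put("acks", "all");                                 // idempotence requires acks=all
p.put("enable.idempotence", "true");
p.put("max.in.flight.requests.per.connection", "5");  // must be 5 or lower with idempotence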
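Idempotence alone gives you exactly-once per partition per producer session. It does not cover a crash-and-restart, because the restarted producer is assigned a new PID and the broker can no longer match its retries to earlier writes. It also does not make writes that span partitions atomic.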
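To make the de-duplication rule concrete, here is a toy model of a single partition that tracks the last sequence number it accepted per PID. It is a sketch, not Kafka's broker code: the class name, the Result enum, and the one-sequence-per-record simplification are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of one partition with (PID, seq) de-duplication. Hypothetical, not Kafka's code. */
final class IdempotentPartitionSketch {
    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    private final List<String> log = new ArrayList<>();
    // Last sequence number accepted from each producer ID on this partition.
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    Result append(long pid, int seq, String value) {
        int last = lastSeqByPid.getOrDefault(pid, -1);
        if (seq <= last) {
            return Result.DUPLICATE;     // a retry of something already written: do not append
        }
        if (seq != last + 1) {
            return Result.OUT_OF_ORDER;  // a gap means an earlier write is missing
        }
        log.add(value);
        lastSeqByPid.put(pid, seq);
        return Result.APPENDED;
    }

    List<String> contents() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        IdempotentPartitionSketch partition = new IdempotentPartitionSketch();
        System.out.println(partition.append(7L, 0, "a")); // APPENDED
        System.out.println(partition.append(7L, 0, "a")); // DUPLICATE (retry, same PID)
        System.out.println(partition.append(8L, 0, "a")); // APPENDED (restart: new PID)
        System.out.println(partition.contents());         // [a, a]
    }
}
```

The last two calls in main show the restart gap: the same payload sent under a new PID is not recognised as a duplicate, which is why idempotence on its own stops at the producer session.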
2. Transactions
To span partitions (or to survive a restart), wrap writes in a transaction. This requires a stable transactional.id — that ID is how the transaction coordinator "recognises" you after a crash and fences off zombies.
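import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;
producer.initTransactions(); // once per producer instance, before the first transaction
try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("orders", key, order));
    producer.send(new ProducerRecord<>("audit", key, audit));
    // Assumes `consumer` is the KafkaConsumer that read the input; the String group-id overload is deprecated.
    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    producer.close();            // fatal: this producer cannot continue, so do not try to abort
} catch (KafkaException e) {
    producer.abortTransaction(); // recoverable: none of the writes above become visible
}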
The sendOffsetsToTransaction call is the piece that makes consume → transform → produce pipelines atomic — consumer offsets are committed inside the same transaction as the output records.
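The transactional snippet above presupposes a producer configured with a stable transactional.id. A minimal sketch of such a configuration follows, assuming String keys and values and a hypothetical naming scheme of application name plus instance index. The config keys are real Kafka producer settings; the class name, method name, and ID format are illustrative only.

```java
import java.util.Properties;

/** Sketch of a transactional producer config. The ID scheme is an assumption, not a Kafka rule. */
final class TransactionalProducerConfigSketch {
    static Properties forInstance(String appName, int instanceIndex) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker:9092");
        // Serializers are an assumption; use whatever matches your key and value types.
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("acks", "all");
        p.put("enable.idempotence", "true"); // transactions build on idempotence
        // Derived from a stable identity so the same logical instance
        // gets the same ID again after a crash and restart.
        p.put("transactional.id", appName + "-" + instanceIndex);
        return p;
    }

    public static void main(String[] args) {
        System.out.println(forInstance("orders-processor", 0).getProperty("transactional.id"));
        // orders-processor-0
    }
}
```

The point of deriving the ID from the instance's identity rather than from a random value is that two runs of the same logical instance map to the same ID, while two different instances never share one.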
3. Read-committed consumer
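Producer-side guarantees are wasted if consumers read aborted messages. The default isolation level is read_uncommitted, so set isolation.level=read_committed. Consumers then skip records from aborted transactions and never read past the Last Stable Offset (LSO), the offset of the first record that belongs to a still-open transaction.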
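The effect of the LSO can be sketched with a toy log in which each record is either non-transactional or tagged with a transaction whose state is open, committed, or aborted. This is a simplified model under stated assumptions, not the broker's or the consumer's implementation; every name below is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of read_committed visibility. Hypothetical, not Kafka's code. */
final class ReadCommittedSketch {
    enum TxnState { OPEN, COMMITTED, ABORTED }

    static final class Rec {
        final String value;
        final String txn; // null for a non-transactional write
        Rec(String value, String txn) { this.value = value; this.txn = txn; }
    }

    /** Offset of the first record in a still-open transaction, or the log end if there is none. */
    static int lastStableOffset(List<Rec> log, Map<String, TxnState> txns) {
        for (int offset = 0; offset < log.size(); offset++) {
            String txn = log.get(offset).txn;
            if (txn != null && txns.get(txn) == TxnState.OPEN) {
                return offset;
            }
        }
        return log.size();
    }

    /** Values a read_committed consumer would see: below the LSO, minus aborted records. */
    static List<String> readCommitted(List<Rec> log, Map<String, TxnState> txns) {
        List<String> visible = new ArrayList<>();
        int lso = lastStableOffset(log, txns);
        for (int offset = 0; offset < lso; offset++) {
            Rec r = log.get(offset);
            if (r.txn != null && txns.get(r.txn) == TxnState.ABORTED) {
                continue; // skip records from aborted transactions
            }
            visible.add(r.value);
        }
        return visible;
    }
}
```

With a log of a (non-transactional), b (aborted t1), c (open t2), d (non-transactional), the sketch returns only a: the open transaction t2 holds back everything after it, including the non-transactional d, until t2 commits or aborts.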
Where EOS silently breaks
- Writing to an external system (a database, an HTTP API) — the transaction does not cover that. Use the outbox pattern.
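- Mixing transactional and non-transactional writes to the same topic. The non-transactional records carry no atomicity guarantee, and read_committed consumers cannot see them until every earlier open transaction completes, so what consumers observe depends on timing.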
- Using a fresh transactional.id on every restart. You lose zombie fencing and lose EOS.
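The last failure mode is easier to see with a toy model of fencing. The sketch below assumes the coordinator keeps one epoch per transactional.id, bumps it each time a producer registers with that ID, and rejects any producer that still holds an older epoch. It is a simplification for illustration, not Kafka's coordinator; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of epoch-based zombie fencing. Hypothetical, not Kafka's coordinator. */
final class FencingSketch {
    private final Map<String, Integer> epochById = new HashMap<>();

    /** Models producer start-up: registering bumps the epoch for that transactional.id. */
    int register(String transactionalId) {
        // First registration stores epoch 0; every later one increments it.
        return epochById.merge(transactionalId, 0, (old, unused) -> old + 1);
    }

    /** A write is accepted only from the producer holding the current epoch. */
    boolean accepts(String transactionalId, int epoch) {
        Integer current = epochById.get(transactionalId);
        return current != null && current == epoch;
    }

    public static void main(String[] args) {
        FencingSketch coordinator = new FencingSketch();
        int zombie = coordinator.register("orders-processor-0");
        int restarted = coordinator.register("orders-processor-0");
        System.out.println(coordinator.accepts("orders-processor-0", zombie));    // false
        System.out.println(coordinator.accepts("orders-processor-0", restarted)); // true
    }
}
```

If the restarted instance registers under a new ID instead, both it and the zombie hold a current epoch for their own IDs, and nothing stops the zombie from continuing to write.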
Rule of thumb
If your pipeline is Kafka → Kafka, EOS is cheap and real. If it's Kafka → anything else, design for idempotent consumers and stop calling it exactly-once.
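One way to design an idempotent consumer is to have the external sink remember the highest offset it has applied per partition and ignore anything at or below it. The sketch below uses in-memory maps as a stand-in for an external table. The names are hypothetical, and it assumes records within a partition arrive in offset order. In a real database, the row write and the offset update would need to share one database transaction.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy idempotent sink keyed by (partition, offset). Hypothetical stand-in for an external store. */
final class IdempotentSinkSketch {
    private final Map<String, String> rows = new HashMap<>();          // the "external table"
    private final Map<Integer, Long> appliedOffset = new HashMap<>();  // highest offset applied per partition

    /** Returns true if the record changed the store, false if it was a redelivery. */
    boolean apply(int partition, long offset, String key, String value) {
        long last = appliedOffset.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // already applied: at-least-once redelivery becomes a no-op
        }
        rows.put(key, value);
        appliedOffset.put(partition, offset);
        return true;
    }

    String get(String key) {
        return rows.get(key);
    }

    public static void main(String[] args) {
        IdempotentSinkSketch sink = new IdempotentSinkSketch();
        System.out.println(sink.apply(0, 10L, "order-1", "paid")); // true
        System.out.println(sink.apply(0, 10L, "order-1", "paid")); // false (redelivered)
    }
}
```

Delivery from Kafka stays at-least-once here; it is the sink's bookkeeping that makes the overall effect happen once.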