The Saga Pattern: Distributed Transactions Without Two-Phase Commit
When a business operation spans several services, you can't wrap it in one ACID transaction. Sagas trade atomicity for compensations — here's how to get them right.
The moment a single business operation touches two databases owned by two services, the comfortable world of a single ACID transaction is gone. You can't BEGIN on one Postgres and COMMIT across another service's MySQL. Two-phase commit (2PC) promises to fix that, but in practice it couples availability to your slowest participant and holds locks from the prepare phase until the coordinator's decision arrives, or until the coordinator recovers if it crashes mid-protocol. The saga pattern is the answer most teams actually reach for.
1. A saga is a sequence of local transactions
Instead of one global transaction, a saga is a series of local transactions, each confined to one service. Every step has a matching compensating action that semantically undoes it. If step 4 fails, you run the compensations for the completed steps in reverse order: 3, then 2, then 1.
Order saga:
1. createOrder() ⟲ cancelOrder()
2. reserveInventory() ⟲ releaseInventory()
3. chargePayment() ⟲ refundPayment()
4. scheduleShipping() ⟲ cancelShipping()
Note "semantically undoes" — you can't roll back a charge, you refund it. Compensations are forward actions, not magic rewinds.
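To make the reverse-order rule concrete, here is a minimal in-process sketch of the order saga above. The `Step` and `run_saga` names are hypothetical, the lambdas stand in for real service calls, and it assumes a failed step leaves no committed change behind (its local transaction rolled back on its own).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]      # the local transaction
    compensate: Callable[[], None]  # forward action that semantically undoes it


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # Assumption: the failed step committed nothing, so only the
            # steps that finished before it need compensating.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_shipping() -> None:
    raise RuntimeError("no carrier available")


order_saga = [
    Step("createOrder", lambda: log.append("createOrder"), lambda: log.append("cancelOrder")),
    Step("reserveInventory", lambda: log.append("reserveInventory"), lambda: log.append("releaseInventory")),
    Step("chargePayment", lambda: log.append("chargePayment"), lambda: log.append("refundPayment")),
    Step("scheduleShipping", fail_shipping, lambda: log.append("cancelShipping")),
]

ok = run_saga(order_saga)
# log now holds the three forward steps, then refundPayment,
# releaseInventory, cancelOrder (compensations for steps 3, 2, 1).
```

Note that cancelShipping() never runs: step 4 did not complete, so there is nothing of its own to undo.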
2. Orchestration vs choreography
There are two ways to wire the steps together.
- Choreography — each service listens for an event and emits the next one. No central brain. Simple for 2–3 steps, but the flow is implicit and hard to follow once it grows.
- Orchestration — a coordinator (the saga orchestrator) tells each service what to do and tracks state. The flow lives in one place you can read, test, and visualise. Preferred once the saga has branches or more than a few steps.
// Orchestrator state machine
ORDER_CREATED → INVENTORY_RESERVED → PAYMENT_CHARGED → DONE
    ↓ fail            ↓ fail
(compensate order)  (release inventory, cancel order)
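One way to keep that flow "in one place you can read" is a plain transition table. The sketch below mirrors the diagram above; the `FAILED` state, the `TRANSITIONS` table, and the `send` callback (which would dispatch a command to a service and report success) are illustrative assumptions, not a particular framework's API.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class State(Enum):
    ORDER_CREATED = "ORDER_CREATED"
    INVENTORY_RESERVED = "INVENTORY_RESERVED"
    PAYMENT_CHARGED = "PAYMENT_CHARGED"
    DONE = "DONE"
    FAILED = "FAILED"  # assumed terminal state after compensating


# For each state: (next command, state on success, compensations on failure).
TRANSITIONS: Dict[State, Tuple[str, State, List[str]]] = {
    State.ORDER_CREATED: ("reserveInventory", State.INVENTORY_RESERVED, ["cancelOrder"]),
    State.INVENTORY_RESERVED: (
        "chargePayment",
        State.PAYMENT_CHARGED,
        ["releaseInventory", "cancelOrder"],
    ),
}


def run_order_saga(send: Callable[[str], bool]) -> State:
    state = State.ORDER_CREATED
    while state in TRANSITIONS:
        command, next_state, compensations = TRANSITIONS[state]
        if send(command):
            state = next_state
        else:
            for compensation in compensations:
                send(compensation)
            return State.FAILED
    # PAYMENT_CHARGED has no outgoing command in the diagram, so we are done.
    return State.DONE


sent: List[str] = []


def fake_send(command: str) -> bool:
    """Stand-in transport: records the command and fails the payment step."""
    sent.append(command)
    return command != "chargePayment"


result = run_order_saga(fake_send)
```

Because the table is data, a unit test can walk every failure branch without starting a single service.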
3. Compensations must be idempotent
The orchestrator retries on timeouts, and a timeout doesn't say whether the first call went through, so refundPayment() may be called twice. Key every compensation by saga ID + step so the second call is a no-op. This is the same discipline as idempotency keys for APIs.
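A minimal sketch of that keying, assuming an in-memory set of applied keys. `PaymentService` and its fields are hypothetical; a real service would store the key in the same local transaction as the refund, so the two cannot drift apart.

```python
from typing import Set, Tuple


class PaymentService:
    def __init__(self) -> None:
        self.refunded_total = 0
        # Stand-in for a table with a unique constraint on (saga_id, step).
        self._applied: Set[Tuple[str, str]] = set()

    def refund_payment(self, saga_id: str, step: str, amount: int) -> bool:
        """Apply the refund once per (saga_id, step); return False on a repeat."""
        key = (saga_id, step)
        if key in self._applied:
            return False  # duplicate delivery: already refunded, do nothing
        self._applied.add(key)
        self.refunded_total += amount
        return True


payments = PaymentService()
first = payments.refund_payment("saga-42", "chargePayment", 500)
second = payments.refund_payment("saga-42", "chargePayment", 500)  # orchestrator retry
```

The retry returns without touching the balance, so the orchestrator can resend as often as it needs to.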
4. The pivot transaction
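Order your steps so the irreversible ones come last, after a pivot: the step after which the saga is guaranteed to run to completion. Steps before the pivot must be compensatable; steps after it must be retriable until they succeed, because there is no rolling back past that point. Reservations and validations (cheap to compensate) go before the pivot; sending an email or shipping (hard to undo) goes after, once success is certain.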
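The split can be sketched as three groups of steps: compensatable ones, the pivot, and steps that are retried until they succeed. Treating chargePayment() as the pivot is an assumption made for illustration, as are the `run_with_pivot` name and the `max_attempts` cap; real code would also back off between attempts.

```python
from typing import Callable, Dict, List, Tuple

Action = Callable[[], None]


def run_with_pivot(
    compensatable: List[Tuple[Action, Action]],  # (action, compensation) pairs
    pivot: Action,
    retriable: List[Action],
    max_attempts: int = 5,
) -> bool:
    undo: List[Action] = []
    try:
        for action, compensate in compensatable:
            action()
            undo.append(compensate)
        pivot()  # go/no-go point: if this commits, the saga must finish
    except Exception:
        for compensate in reversed(undo):
            compensate()
        return False

    # Past the pivot there is no rollback: retry each step until it succeeds.
    for action in retriable:
        for _ in range(max_attempts):
            try:
                action()
                break
            except Exception:
                continue
        else:
            raise RuntimeError("retries exhausted after pivot; needs operator attention")
    return True


log: List[str] = []
attempts: Dict[str, int] = {"ship": 0}


def flaky_shipping() -> None:
    attempts["ship"] += 1
    if attempts["ship"] < 3:
        raise TimeoutError("carrier API timed out")
    log.append("scheduleShipping")


ok = run_with_pivot(
    compensatable=[(lambda: log.append("reserveInventory"), lambda: log.append("releaseInventory"))],
    pivot=lambda: log.append("chargePayment"),
    retriable=[flaky_shipping],
)
```

Shipping fails twice here and the saga still completes, which is the whole point of placing it after the pivot.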
5. What sagas don't give you
Sagas are not isolated. Between steps, other transactions can see intermediate state — an order exists before payment clears. You handle this with semantic locks (a PENDING status), commutative updates, or by re-reading and validating in each step. If you need true isolation, you need a different design, not a saga.
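A semantic lock can be as small as a status check. In this sketch the `Order` class, the `CONFIRMED` status, and `change_address` are hypothetical: any transaction outside the saga re-reads the order and refuses to act while it is still PENDING.

```python
from dataclasses import dataclass


class OrderLockedError(Exception):
    """Raised when another transaction touches an order that is mid-saga."""


@dataclass
class Order:
    order_id: str
    address: str
    status: str = "PENDING"  # semantic lock, set when the saga creates the order


def complete_saga(order: Order) -> None:
    order.status = "CONFIRMED"  # saga finished: release the lock


def change_address(order: Order, new_address: str) -> None:
    # Re-read and validate: refuse to act on intermediate saga state.
    if order.status == "PENDING":
        raise OrderLockedError(f"order {order.order_id} is still being processed")
    order.address = new_address


order = Order("o-1", "1 Old Street")
try:
    change_address(order, "2 New Street")
    blocked = False
except OrderLockedError:
    blocked = True  # rejected while the saga is in flight

complete_saga(order)
change_address(order, "2 New Street")  # allowed once the lock is released
```

The caller decides what a locked order means: fail fast as here, or queue the change and retry after the saga ends.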
Rules of thumb
- Reach for orchestration once you have branching or 4+ steps. Choreography rots into spaghetti.
- Every step before the pivot needs a compensation, and every compensation must be idempotent.
- Put irreversible steps after the pivot, never before.
- Persist saga state in a real table — an in-memory orchestrator loses every in-flight saga on restart.
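As a sketch of the last rule, here is saga state persisted with Python's built-in sqlite3 module. The `saga_state` table and both helper functions are assumptions for illustration, and the in-memory database is only there to keep the example self-contained; a real orchestrator would point at a durable file or database server.

```python
import sqlite3
from typing import List, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS saga_state (
    saga_id    TEXT PRIMARY KEY,
    state      TEXT NOT NULL,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def save_state(conn: sqlite3.Connection, saga_id: str, state: str) -> None:
    """Record the saga's current state before acting on it."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT OR REPLACE INTO saga_state (saga_id, state) VALUES (?, ?)",
            (saga_id, state),
        )


def in_flight(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Sagas to resume after a restart: everything not in a terminal state."""
    return conn.execute(
        "SELECT saga_id, state FROM saga_state "
        "WHERE state NOT IN ('DONE', 'FAILED') ORDER BY saga_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # demo only; use durable storage in practice
conn.execute(SCHEMA)

save_state(conn, "saga-1", "ORDER_CREATED")
save_state(conn, "saga-1", "DONE")
save_state(conn, "saga-2", "INVENTORY_RESERVED")

pending = in_flight(conn)  # what a restarted orchestrator would pick up
```

On startup the orchestrator would call in_flight() and continue each saga from its recorded state, relying on idempotent steps to make a repeated command harmless.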