ML
Database

Write Skew: The Anomaly Snapshot Isolation Quietly Allows

Two requests each removed a different admin from a project. Each one first checked that at least one other admin would remain, and each check passed, because at the moment it ran another admin really did still exist. Both committed. The project ended up with zero admins, and our Repeatable Read transactions never noticed a thing.

June 26, 20269 min readDatabasePostgres

We had a rule that every project must keep at least one admin. The code enforced it the obvious way: before removing an admin, count the remaining admins, and refuse if the count would drop to zero. It had a test. It worked in every manual check. Then one day a project ended up with no admins at all, locked so that nobody could even re-add one. Two people had clicked "remove admin" on two different accounts at almost the same instant, and both removals went through.

The transactions were running at Postgres's "Repeatable Read" isolation, which a lot of us reach for when we want a transaction to see a stable snapshot. It did exactly what it promised, and the bug happened anyway. The name for this is write skew, and it is the anomaly that snapshot isolation is specifically unable to stop.

The shape of the bug

Both transactions follow the same three steps: read, decide, write.

-- Transaction A (removing admin #101)       Transaction B (removing admin #102)
BEGIN;                                       BEGIN;
SELECT count(*) FROM members                 SELECT count(*) FROM members
  WHERE project_id = 7 AND role = 'admin';     WHERE project_id = 7 AND role = 'admin';
-- sees 2, so "safe to remove one"           -- also sees 2, so "safe to remove one"
UPDATE members SET role = 'member'           UPDATE members SET role = 'member'
  WHERE id = 101;                              WHERE id = 102;
COMMIT;                                       COMMIT;

Each transaction read the same count of 2, each concluded that removing a single admin was fine, and each removed a different admin. Neither one is wrong on its own. Run them one after the other and the second transaction would read a count of 1 and refuse. Run them concurrently under snapshot isolation and they both read 2, because each sees a snapshot taken before the other committed. The invariant, at least one admin, was true for each transaction in isolation and false once both landed.

Why the isolation level let it through

It helps to be precise about what each level actually prevents. It also helps to know that the SQL standard's level names and what Postgres does under them do not line up perfectly.

  • Read Committed (the Postgres default): each statement sees rows committed before that statement began. Prevents dirty reads. Does nothing for our case, since the two transactions never see each other's uncommitted data anyway.
  • Repeatable Read (Postgres implements this as snapshot isolation): the whole transaction sees one consistent snapshot taken at its start. It prevents dirty reads, non-repeatable reads, and phantoms, and it aborts a transaction that tries to update a row another transaction already updated, the "lost update" case. That last protection is the trap: it only fires on a write-write conflict on the same row.
  • Serializable: the result must be equivalent to running the transactions one at a time in some order. This is the only level that stops write skew.

Our two transactions wrote to different rows, id 101 and id 102. There was no write-write conflict for snapshot isolation to detect, so nothing aborted. The conflict was not in the writes, it was between one transaction's read (the count) and the other's write (removing an admin that count was counting). Snapshot isolation does not track that read-write dependency. Serializable isolation does.

What write skew really is

Write skew is the pattern where two transactions read an overlapping set of rows, each makes a decision based on what it read, and each writes in a way that invalidates the other's decision, without ever writing the same row. The reads overlap; the writes do not. Because the writes do not collide, snapshot isolation sees two independent transactions and lets both commit. The textbook version is two on-call doctors each checking "is at least one other doctor on call?", both seeing yes, and both signing off, leaving nobody on call. Same shape: a SELECT establishes a premise, a concurrent write to a different row destroys it.

Once you have the shape in your head you start seeing it everywhere: booking the last seat, spending against a shared budget, enforcing "no two overlapping reservations", keeping a value unique with a check-then-insert. Any invariant that spans multiple rows and is enforced by reading and then writing is a write skew waiting to happen.

Fixing it

1. Use Serializable isolation. Postgres's Serializable Snapshot Isolation tracks the read-write dependencies between transactions and, when it detects a dangerous cycle, aborts one of them with a serialization failure. The cost is real: you have to wrap the transaction in retry logic, because a perfectly correct transaction can now fail and need to run again.

BEGIN ISOLATION LEVEL SERIALIZABLE;
-- ... the read and the write ...
COMMIT;  -- may raise SQLSTATE 40001 (serialization_failure); catch it and retry the whole tx

2. Materialize the conflict with a lock. If you do not want global serializable, force the two transactions to actually collide on a row. Lock the rows you read so the premise cannot change underneath you:

SELECT id FROM members
  WHERE project_id = 7 AND role = 'admin'
  FOR UPDATE;            -- both transactions now contend on these rows

Now the second transaction blocks until the first commits, then re-reads and sees the real count of 1. Locking the rows you base a decision on, or a single parent row such as the projects row used as a gate, turns an invisible read-write hazard into ordinary, visible write-write contention that the database already handles.

3. Push the invariant into the schema where you can. Some write-skew cases collapse if you let a constraint enforce them. "No two overlapping bookings" is an exclusion constraint (EXCLUDE USING gist). "At most one active row" is a partial unique index. A constraint is checked by the database atomically and cannot be skewed. It will not cover "at least one admin", since you cannot express a minimum-count constraint cheaply, but it covers more cases than people expect, and it is the most robust fix when it applies.

Spotting it before it ships

In review, the smell is a transaction that reads to decide, then writes, where the thing it wrote is part of the set it read. Count the admins then change an admin. Check the balance then insert a charge. Look for overlap then insert a booking. If two copies of that transaction running at the same time could each pass the check and then together break it, you have write skew, and Read Committed or Repeatable Read will not save you. Decide deliberately: serializable plus retries, an explicit FOR UPDATE on the read set, or a real constraint.

Rules of thumb

  • Snapshot isolation, which Postgres calls Repeatable Read, prevents dirty reads, non-repeatable reads, and same-row lost updates. It does not prevent write skew.
  • Write skew happens when transactions read overlapping rows but write different rows, so there is no write-write conflict to detect.
  • The tell is read-to-decide-then-write on a multi-row invariant: "at least one", "no overlap", "sum within budget".
  • Serializable isolation is the general fix, but you must handle the serialization failure and retry.
  • If you stay on a weaker level, lock your read set with SELECT ... FOR UPDATE, or lock a single gate row, so the premise cannot change.
  • When the invariant fits a unique index or exclusion constraint, let the schema enforce it. A constraint cannot be skewed.
SharePostLinkedIn

Reader Discussion

8 replies// weighed in

TopNewestAuthor
Add to the thread
Disagree, agree harder, or share your own experience…
Email instead →markdown okbe kind
  1. Highlighted by author
    Victor Petrov· Senior BackendAgrees

    "removing indexes is the third-year move" — saving this. first year you add, second year you tune, third year you realise half of them have been dead weight slowing every write since that one feature got cut.

    Jun 28, 2026·2 days later
  2. Linh Phạm· Java DeveloperAgrees

    @Version optimistic locking is the 80/20 of CRUD apps. genuinely think JPA should make it default-on with a @NoVersion opt-out. the number of last-write-wins races I've debugged in spring boot apps that didn't set it is too damn high

    Jul 01, 2026·5 days later
  3. Mateus Silva· Backend DevAsks

    Q — SELECT FOR UPDATE SKIP LOCKED as a job queue: still a good fit in 2026 or do you reach for a real broker (sqs, rabbit, etc) past a certain QPS? we run ~400 jobs/sec on PG and it's been chill but I'm nervous

    Jul 02, 2026·6 days later
    • Aya Fujimoto· Database Engineer

      We do 11k jobs/sec on PG with SKIP LOCKED + LISTEN/NOTIFY. The thing that breaks first is your job table bloat — set up partman + auto-vacuum tuning before you scale.

      Jul 03, 2026
    • ML
      Minh LeAuthor

      Plus one. The cliff isn't the QPS — it's the visibility-of-dead-rows pattern. SKIP LOCKED holds up if you're disciplined about cleanup.

      Jul 04, 2026
  4. Greta Schäfer· Senior EngineerAgrees

    isolation levels are the part juniors skip and seniors keep underestimating. read committed feels safe; it's not — phantom reads still exist. the moment money is involved, default to repeatable read and benchmark.

    Jul 03, 2026·1 week later
  5. Aya Fujimoto· Database EngineerPushback

    small note on Postgres covering indexes — INCLUDE columns don't get the same treatment as key columns for HOT updates and dedup. people assume index-only scan == free, but the heap visibility map can still send you back to disk. measure first.

    Jun 30, 2026·4 days later
  6. Rachel Gold· Staff SREAgrees

    the on-call framing throughout this piece is what makes it land. too many infra articles assume you never get paged. those are written by people who never got paged.

    Jun 29, 2026·3 days later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.

Comments seeded · live discussion via email