HOT Updates and fillfactor: The Postgres Knob That Halved My Write Amplification
An UPDATE-heavy table that's barely touched by readers was somehow generating index bloat and burning IO. The fix wasn't more indexes or a faster disk. It was understanding HOT updates and one storage parameter almost nobody sets: fillfactor.
I had a table that did one thing: a row per active session, with a last_seen column we bumped on every request. A few hundred writes a second, almost no reads, six columns, three indexes. It should have been boring. Instead it was the noisiest thing in the database: constant autovacuum, indexes growing faster than the table, and write IO that made no sense for how little data was actually changing. The column we updated wasn't even indexed. So why were the indexes bloating?
The answer is how Postgres does an UPDATE, and a storage knob called fillfactor that I'd ignored for years.
An UPDATE is secretly an INSERT plus a delete
Because of MVCC, Postgres never modifies a row in place. An UPDATE writes a brand new version of the row (a new tuple) and marks the old one dead. The new tuple needs a physical location on a page somewhere. And here's the part that bites: every index on the table points at the physical location of a row version. A new tuple in a new location means every index has to get a new entry pointing at it, even indexes on columns you didn't touch.
That was my bug. I updated one unindexed column, and all three indexes got a new pointer per update, because the row physically moved. That's the write amplification: one logical change, four physical writes (heap plus three indexes), plus three dead index entries to vacuum later.
HOT: when the new version stays on the same page
Postgres has an optimisation for exactly this: Heap-Only Tuples. The rules for a HOT update are simple and worth memorising:
- You did not change any indexed column, and
- the new row version fits on the same page as the old one.
If both hold, Postgres skips the index entirely. The old tuple gets a pointer to the new tuple within the page (a HOT chain), and index scans follow that pointer at read time. No new index entries, no index bloat, far less IO. My update qualified on the first rule (the column wasn't indexed) but kept failing the second: the page was full, so the new version landed on a different page, and HOT was off the table.
fillfactor: leave room for the new versions
By default Postgres packs data pages to 100% full (fillfactor=100) for tables. Great for a table you only ever INSERT into and read. Terrible for an update-heavy one, because there's no free space on the page for the new tuple to land, so HOT can almost never trigger.
The fix is to tell Postgres to leave slack on each page:
ALTER TABLE sessions SET (fillfactor = 80);
-- fillfactor only applies to pages written AFTER the change,
-- so rewrite the existing table to apply it everywhere:
VACUUM FULL sessions; -- or pg_repack to avoid the exclusive lock
With fillfactor=80, each page is filled to 80% on insert and keeps 20% free. Now when a row is updated, the new version usually fits on the same page, HOT kicks in, and the indexes are left alone. On my session table this took the HOT-update ratio from near zero to about 95% and roughly halved write IO. The indexes stopped growing.
How to see if it's actually working
pg_stat_user_tables tracks it directly. Compare total updates against HOT updates:
SELECT relname,
n_tup_upd,
n_tup_hot_upd,
round(100.0 * n_tup_hot_upd / nullif(n_tup_upd,0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE relname = 'sessions';
If hot_pct is low on a table you update a lot, you're paying the full index-write tax on every change. That's your signal to look at fillfactor, or at whether you're updating an indexed column you didn't need to index.
The cost, because there's always one
Lower fillfactor means the table takes more disk and more pages, so sequential scans read more, and a read-heavy table loses a little cache efficiency. That's why the default is 100. Don't lower it on append-mostly tables; you'd just waste space for nothing. The win is specific to tables with a high update rate where the updates don't touch indexed columns. Match the knob to the workload.
The other half: stop indexing the column you update
fillfactor only helps if the update qualifies for HOT in the first place, and the first rule is "no indexed column changed". So the cheapest fix of all is sometimes to drop the index on the hot column. I had a half-useless index on last_seen that one rarely-run query used. Dropping it meant every last_seen bump became HOT-eligible, and that query got a sequential scan it could afford. One fewer index, one fewer thing bloating, and HOT could finally fire.
Rules of thumb
- An
UPDATEwrites a new tuple and, unless it's HOT, adds an entry to every index, even indexes on columns you didn't change. That's where the surprise bloat comes from. - HOT fires only when no indexed column changed and the new version fits on the same page. Both rules, every time.
- Default
fillfactor=100leaves no room for the new version, so update-heavy tables almost never get HOT. Set it to 70–90 on those tables. - fillfactor only affects pages written after you set it; rewrite the table (
VACUUM FULLorpg_repack) to apply it to existing data. - Watch
n_tup_hot_updvsn_tup_updinpg_stat_user_tables. Low HOT ratio on a hot table = wasted IO. - Don't index a frequently-updated column unless you truly need the index. Each such index is one more thing blocking HOT.