ML
Database

Database Indexing: B-Tree, Hash, and When Each Actually Helps

Why a single missing index can turn your 10ms endpoint into a 10-second endpoint — and why adding the wrong one makes writes slower.

July 14, 20259 min readDatabasePerformance

An index is a sorted (or hashed) copy of one or more columns, pointing back to the row. It makes reads faster and writes slower. Every index is a trade-off you sign up for.

B-Tree — the workhorse

The default for every major database. Supports equality (=), range (<, >, BETWEEN), prefix LIKE ('abc%'), and ordered scans. The leaves form a linked list so range scans are cheap.

CREATE INDEX idx_orders_customer_date ON orders (customer_id, created_at);

-- Uses the index:
SELECT * FROM orders WHERE customer_id = 42 AND created_at > NOW() - INTERVAL '7 days';
-- Also uses it (prefix match on first column):
SELECT * FROM orders WHERE customer_id = 42;
-- Does NOT use it:
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '7 days';

Rule: a composite index can be used for any prefix of its columns, never a suffix alone. Order matters. Put the most-filtered column first.

Hash — equality only, blazing fast

Unordered, no range support. In Postgres, Hash indexes only make sense for =-heavy columns you never range-scan. Most teams never need them — B-Tree on the same column is already ~O(log n) with better versatility.

Covering index (INCLUDE)

An index that contains every column the query needs can return results without touching the table at all. This is the single biggest optimisation for hot read paths.

CREATE INDEX idx_orders_covering
ON orders (customer_id, created_at) INCLUDE (total, status);

Partial index

An index over a subset of rows. Useful when most rows share one value and the "interesting" minority are queried a lot.

CREATE INDEX idx_pending_orders
ON orders (created_at) WHERE status = 'pending';

Costs

  • Every INSERT, UPDATE (on indexed columns), DELETE updates every relevant index.
  • Indexes consume disk and buffer-pool memory.
  • Bloated or unused indexes hurt write throughput and waste RAM.

How to actually decide

  1. Run EXPLAIN ANALYZE on the slow query.
  2. If it's a sequential scan over a large table, you probably need an index.
  3. If the query filters on multiple columns, look at selectivity — the most selective column goes first.
  4. Check pg_stat_user_indexes (or equivalent) periodically and drop indexes with idx_scan = 0.

Adding indexes is the first-year move. Removing indexes is the third-year move.

SharePostLinkedIn

Reader Discussion

9 replies// weighed in

TopNewestAuthor
Add to the thread
Disagree, agree harder, or share your own experience…
Email instead →markdown okbe kind
  1. Highlighted by author
    Victor Petrov· Senior BackendAgrees

    "removing indexes is the third-year move" — saving this. first year you add, second year you tune, third year you realise half of them have been dead weight slowing every write since that one feature got cut.

    Jul 16, 2025·2 days later
  2. Mateus Silva· Backend DevAsks

    Q — SELECT FOR UPDATE SKIP LOCKED as a job queue: still a good fit in 2026 or do you reach for a real broker (sqs, rabbit, etc) past a certain QPS? we run ~400 jobs/sec on PG and it's been chill but I'm nervous

    Jul 20, 2025·6 days later
    • Aya Fujimoto· Database Engineer

      We do 11k jobs/sec on PG with SKIP LOCKED + LISTEN/NOTIFY. The thing that breaks first is your job table bloat — set up partman + auto-vacuum tuning before you scale.

      Jul 21, 2025
    • ML
      Minh LeAuthor

      Plus one. The cliff isn't the QPS — it's the visibility-of-dead-rows pattern. SKIP LOCKED holds up if you're disciplined about cleanup.

      Jul 22, 2025
  3. Greta Schäfer· Senior EngineerAgrees

    isolation levels are the part juniors skip and seniors keep underestimating. read committed feels safe; it's not — phantom reads still exist. the moment money is involved, default to repeatable read and benchmark.

    Jul 21, 2025·1 week later
  4. Aya Fujimoto· Database EngineerPushback

    small note on Postgres covering indexes — INCLUDE columns don't get the same treatment as key columns for HOT updates and dedup. people assume index-only scan == free, but the heap visibility map can still send you back to disk. measure first.

    Jul 18, 2025·4 days later
  5. Dan O'Connor· Eng ManagerStory

    N+1 cost us a P1 incident last Black Friday. ONE endpoint was firing 3,400 queries per cart load. Looked fine in dev (3 carts), looked fine in staging (50 carts), production hit 9k QPS to the DB and cardiac arrest. Add load tests.

    Jul 17, 2025·3 days later
  6. Léa Dubois· SREAsks

    any chance you'd publish these as a PDF collection? would love to print and read offline on flights. screen-fatigue is real.

    Jul 20, 2025·6 days later
  7. Ahmed Rahman· Full StackKind words

    concise + opinionated = my favourite kind of engineering post. so many blogs hedge every claim into mush. give me the spicy take with the receipts. more please.

    Jul 15, 2025·1 day later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.

Comments seeded · live discussion via email