ML
PostgreSQL

The Query That Quietly Stopped Using Its Index: An Implicit Cast Story

One endpoint went from 4 ms to 1.2 seconds with no code change and no schema change. The index was still there, the query still looked identical, and EXPLAIN insisted on a sequential scan over 40 million rows. The cause was a single implicit type cast that Postgres could not push through the index, introduced by a client library that started sending a parameter as text.

June 29, 20268 min readPostgresPerformance

A lookup endpoint that had been answering in about 4 ms for years started taking 1.2 seconds. No deploy of ours touched that path, the table had the index it always had, and the SQL in our logs read exactly the same as it always did: SELECT * FROM events WHERE account_id = $1 LIMIT 50. The events table had roughly 40 million rows and a perfectly good B-tree on account_id. Everything looked normal except the latency, which had jumped by a factor of three hundred overnight.

EXPLAIN refused to use the index

The first useful thing was to run the real plan, not the query. EXPLAIN (ANALYZE, BUFFERS) showed a Seq Scan on events reading all 40 million rows and filtering them down to a handful. The index on account_id was sitting right there, unused. Postgres is not stupid about this. If it chose a sequential scan over an index it had, then from its point of view the index could not answer the query. The question was why a B-tree on the exact column in the WHERE clause had become useless.

The tell was in the filter line of the plan. It did not read account_id = $1. It read (account_id)::text = $1. Postgres was casting every row's account_id to text before comparing. An index on account_id is an index on the bigint values, not on their text representations, so once the comparison happens in text space the index cannot be used. The planner had no choice but to scan the table, cast each row, and compare.

-- what we wrote
SELECT * FROM events WHERE account_id = $1 LIMIT 50;

-- what the plan actually executed
Seq Scan on events
  Filter: ((account_id)::text = $1)
  Rows Removed by Filter: 39,999,950

Where the cast came from

account_id is a bigint in the schema. The parameter $1 arrived as text. When Postgres has a bigint column and a text parameter, it cannot compare them directly, so it has to make the types match. The direction it picks matters enormously. It chose to cast the column to text rather than the parameter to bigint, and casting the column is exactly the thing that defeats the index.

Why did the parameter suddenly become text? A dependency bump. The database driver had changed how it bound integer parameters that came in as JavaScript strings. Our API received account_id from a JSON body where it had always been a string, and the old driver coerced it to a numeric bind, while the new version passed it straight through as a text bind to be safe about precision. Nothing in our code changed. The wire-level type of one parameter changed, and that was enough to turn a 4 ms index lookup into a full scan.

The knobs that looked like fixes

Several tempting changes do nothing here. Bumping work_mem, adding more shared buffers, or running VACUUM ANALYZE to refresh statistics all leave the plan unchanged, because the problem is not a bad cost estimate. The planner is making the correct decision given a text comparison. You cannot tune your way out of a query that genuinely cannot use the index. Rebuilding or reindexing the B-tree does not help either, because the index is fine. It indexes bigint values, and the query is asking a text question.

The clever-looking fix is a functional index on (account_id::text). That would make the text comparison indexable, and the latency would drop. It is the wrong fix. You would be carrying a second index purely to paper over a type mismatch that should not exist, doubling the write cost on a hot table and hiding the real defect. The right move is to make the comparison happen in bigint space, where the existing index already lives.

The real fix

We made the parameter a bigint before it reached the comparison. The cheapest version is to cast the parameter explicitly in the query, WHERE account_id = $1::bigint, which flips the cast onto the parameter side and leaves the column untouched. The cleaner version is to fix the binding at the source so the driver sends a numeric parameter, which we did by coercing account_id to a number in the application before it ever hit the query layer. Either way the plan went back to an Index Scan using events_account_id_idx, reading a few buffers instead of forty million rows, and latency dropped back to 4 ms.

-- cast the parameter, not the column
SELECT * FROM events WHERE account_id = $1::bigint LIMIT 50;

Index Scan using events_account_id_idx on events
  Index Cond: (account_id = ($1)::bigint)
  Buffers: shared hit=5

The general principle: when a column and a parameter have different types, you want the cast on the side that is not indexed. Casting the parameter is free and keeps the index usable. Casting the column is what poisons it.

Why it hid

This bug class is invisible in every place you normally look. The SQL text is unchanged, so a diff of your queries shows nothing. The schema is unchanged, so a migration review shows nothing. The index still exists, so a "do we have the right indexes" audit passes. The change lives entirely in the wire-level type of one bound parameter, which is not in your code and not in your schema. The only place it shows up is the Filter line of an actual EXPLAIN, and only if you read it closely enough to notice a ::text wrapped around your indexed column. We found it because the latency was loud. A 20% slowdown on a colder path could have hidden for months.

Rules of thumb

  • If an index that obviously matches the WHERE clause is being ignored, read the Filter or Index Cond line of EXPLAIN for a cast wrapped around your column, like (account_id)::text. A cast on the column means the index cannot be used.
  • Postgres resolves a type mismatch by casting one side. When it casts the indexed column, the index dies. You want the cast on the parameter side instead.
  • Force the parameter's type explicitly, $1::bigint, or fix the binding so the driver sends the right wire type. Both keep the comparison in the column's own type space.
  • A functional index on the casted expression "fixes" the symptom while doubling write cost and hiding the real type mismatch. Prefer correcting the type over indexing around it.
  • Driver and ORM upgrades can silently change how parameters are bound. A query whose text never changed can still get a brand new plan after a dependency bump.
  • VACUUM, ANALYZE, work_mem, and reindexing do nothing for a comparison that is structurally unindexable. Fix the types, not the knobs.
SharePostLinkedIn

Reader Discussion

1 replies// weighed in

TopNewestAuthor
Add to the thread
Disagree, agree harder, or share your own experience…
Email instead →markdown okbe kind
  1. Rachel Gold· Staff SREAgrees

    the on-call framing throughout this piece is what makes it land. too many infra articles assume you never get paged. those are written by people who never got paged.

    Jul 02, 2026·3 days later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.

Comments seeded · live discussion via email