Cache Stampede: The Four Patterns That Actually Work

Everyone has been paged by a cache miss storm. Here are the four battle-tested patterns for preventing one.

September 01, 2025 · 8 min read · Redis · Performance

Stampede: a hot key expires, a thousand requests all miss at once, they all hit the database, and the database falls over. The cache eventually fills up again, and the cycle repeats at the next expiry. This is a recurring, very expensive outage pattern. Pick one of the four patterns below before it happens.

1. Mutex (single-flight)

On a miss, acquire a short-lived lock on a sibling key and have exactly one request recompute. Others either wait or serve a stale value.

SET lock:user:42 myuuid NX PX 3000
# if OK -> I am the recomputer; others retry in 50ms
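
A minimal sketch of the single-flight flow, assuming an in-memory stand-in for the lock instead of a real Redis client. `FakeLockStore`, `recomputeOnce`, and the `busy` status are hypothetical names for illustration. The point is the shape: take the lock with a unique token, recompute, then release only if the token still matches, so a slow request cannot delete a lock that has since expired and been taken by someone else.

```typescript
// In-memory stand-in for the two lock operations (hypothetical helper).
class FakeLockStore {
  private locks = new Map<string, { token: string; expiresAtMs: number }>();

  // Mirrors `SET key token NX PX ttlMs`: succeeds only if no live lock exists.
  setNxPx(key: string, token: string, ttlMs: number, nowMs: number): boolean {
    const held = this.locks.get(key);
    if (held !== undefined && held.expiresAtMs > nowMs) return false;
    this.locks.set(key, { token, expiresAtMs: nowMs + ttlMs });
    return true;
  }

  // Compare-and-delete: only the request holding the token may release.
  // Against real Redis this check and delete must run atomically.
  releaseIfOwner(key: string, token: string): boolean {
    const held = this.locks.get(key);
    if (held === undefined || held.token !== token) return false;
    this.locks.delete(key);
    return true;
  }
}

type Outcome<T> = { status: "recomputed"; value: T } | { status: "busy" };

// One attempt at filling the cache. "busy" means another request holds the
// lock; the caller retries after a short delay or serves a stale value.
function recomputeOnce<T>(
  locks: FakeLockStore,
  cache: Map<string, T>,
  key: string,
  token: string,
  nowMs: number,
  compute: () => T,
): Outcome<T> {
  const lockKey = `lock:${key}`;
  if (!locks.setNxPx(lockKey, token, 3000, nowMs)) {
    return { status: "busy" };
  }
  try {
    const value = compute();
    cache.set(key, value);
    return { status: "recomputed", value };
  } finally {
    // Release even if compute() throws, so waiters are not stuck for 3s.
    locks.releaseIfOwner(lockKey, token);
  }
}
```

The 3000 ms lock TTL matches the command above; it should comfortably exceed the worst-case recompute time, otherwise a second request can take the lock while the first is still working.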

2. Probabilistic early expiration (XFetch)

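Each request compares the remaining TTL with the recompute cost and, as expiry approaches, refreshes with increasing probability. Across many requests it is very likely that one of them refreshes shortly before expiry while everyone else keeps hitting the cache. That outcome is probabilistic, not guaranteed: occasionally two requests refresh, or none do before the key expires.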

// now, deltaEstimate (duration of the last recompute), and expiresAt share one time unit.
const shouldRefresh =
  now - deltaEstimate * Math.log(Math.random()) >= expiresAt;

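This is the "XFetch" algorithm from Vattani, Chierichetti, and Lowenstein's paper "Optimal Probabilistic Cache Stampede Prevention" (VLDB 2015). It needs no lock and no coordination between clients; the only extra state is the recompute time stored next to the cached value.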
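
The check above slots into a read path roughly like this. This is a sketch, not a reference implementation: `CachedEntry`, `shouldRefreshEarly`, and `xfetch` are hypothetical names, the cache is a plain `Map`, and `beta` is a tuning knob where 1 reproduces the expression shown above and larger values refresh earlier. The clock and random source are injectable only so the behaviour can be checked deterministically.

```typescript
interface CachedEntry<T> {
  value: T;
  deltaMs: number; // how long the last recompute took
  expiresAtMs: number; // absolute expiry time
}

function shouldRefreshEarly(
  entry: { deltaMs: number; expiresAtMs: number },
  nowMs: number,
  beta: number = 1,
  random: () => number = Math.random,
): boolean {
  // An expired entry always refreshes, whatever the random draw says.
  if (nowMs >= entry.expiresAtMs) return true;
  // Math.log(u) <= 0 for u in [0, 1), so this is a random non-negative
  // head start that scales with the recompute cost.
  const headStartMs = -entry.deltaMs * beta * Math.log(random());
  return nowMs + headStartMs >= entry.expiresAtMs;
}

function xfetch<T>(
  cache: Map<string, CachedEntry<T>>,
  key: string,
  ttlMs: number,
  recompute: () => T,
  clock: () => number = () => Date.now(),
  random: () => number = Math.random,
): T {
  const startMs = clock();
  const entry = cache.get(key);
  if (entry !== undefined && !shouldRefreshEarly(entry, startMs, 1, random)) {
    return entry.value; // the common case: serve from cache
  }
  // Miss or early refresh: recompute and remember how long it took.
  const value = recompute();
  const endMs = clock();
  cache.set(key, { value, deltaMs: endMs - startMs, expiresAtMs: endMs + ttlMs });
  return value;
}
```

Because the head start grows with `deltaMs`, expensive values start refreshing earlier than cheap ones, which is the behaviour you want from a stampede guard.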

3. Stale-while-revalidate

Store two TTLs: a soft TTL (serve stale past this point) and a hard TTL (actually evict). On a soft miss, serve the stale value and kick off a background refresh.
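
One way to sketch the two-TTL read path, assuming an in-process cache. `SwrCache` and its `schedule` callback are hypothetical; in a real service `schedule` might be something like `(task) => setTimeout(task, 0)` or a job queue. The `refreshing` set is an addition of mine so that a burst of soft misses triggers one background refresh instead of many.

```typescript
interface SwrEntry<T> {
  value: T;
  softExpiresAtMs: number; // past this: serve stale and refresh in background
  hardExpiresAtMs: number; // past this: the value is unusable
}

type SwrState = "fresh" | "stale" | "miss";

class SwrCache<T> {
  private entries = new Map<string, SwrEntry<T>>();
  private refreshing = new Set<string>();
  private softTtlMs: number;
  private hardTtlMs: number;
  private clock: () => number;
  private schedule: (task: () => void) => void;

  constructor(
    softTtlMs: number,
    hardTtlMs: number,
    clock: () => number,
    schedule: (task: () => void) => void,
  ) {
    this.softTtlMs = softTtlMs;
    this.hardTtlMs = hardTtlMs;
    this.clock = clock;
    this.schedule = schedule;
  }

  get(key: string, recompute: () => T): { value: T; state: SwrState } {
    const now = this.clock();
    const entry = this.entries.get(key);
    if (entry === undefined || now >= entry.hardExpiresAtMs) {
      // Hard miss: nothing usable, so the caller pays for the recompute.
      return { value: this.store(key, recompute()), state: "miss" };
    }
    if (now < entry.softExpiresAtMs) {
      return { value: entry.value, state: "fresh" };
    }
    // Soft miss: answer immediately with the stale value and refresh once.
    if (!this.refreshing.has(key)) {
      this.refreshing.add(key);
      this.schedule(() => {
        try {
          this.store(key, recompute());
        } finally {
          // On failure the stale value stays in place until the hard TTL.
          this.refreshing.delete(key);
        }
      });
    }
    return { value: entry.value, state: "stale" };
  }

  private store(key: string, value: T): T {
    const now = this.clock();
    this.entries.set(key, {
      value,
      softExpiresAtMs: now + this.softTtlMs,
      hardExpiresAtMs: now + this.hardTtlMs,
    });
    return value;
  }
}
```

The hard-miss branch still recomputes inline, so a cold or long-idle key can stampede; that branch is where a mutex from pattern 1 would fit.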

4. Pre-warmed hot keys

For a handful of known-hot keys, don't let them expire at all. A cron job overwrites them on a schedule. The data can be as stale as the refresh interval, but reads never miss, which is a reasonable trade for the top 50 keys.
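
The body of such a job can be very small. The sketch below assumes a plain `Map` as the cache and a hypothetical `load` function that fetches the fresh value for a key; `warmHotKeys` is an illustrative name. Values are written without a TTL, and a failed load leaves the previous value in place rather than creating a miss.

```typescript
interface WarmReport {
  refreshed: string[];
  failed: string[];
}

// Overwrite every hot key in place. Intended to be called by a scheduler
// (cron, a Kubernetes CronJob, ...) at a fixed interval.
function warmHotKeys<T>(
  cache: Map<string, T>,
  hotKeys: readonly string[],
  load: (key: string) => T,
): WarmReport {
  const report: WarmReport = { refreshed: [], failed: [] };
  for (const key of hotKeys) {
    try {
      const value = load(key);
      cache.set(key, value); // no TTL: the key never expires on its own
      report.refreshed.push(key);
    } catch {
      // Keep serving the old value; surface the failure to the caller.
      report.failed.push(key);
    }
  }
  return report;
}
```

Returning the failed keys gives the scheduler something to alert on, since a silently failing warmup looks healthy until the data is badly out of date.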

Rule of thumb

  • Expensive compute, many readers → mutex.
  • Cheap compute, very many readers → XFetch.
  • You can tolerate stale briefly → stale-while-revalidate.
  • You know the hot set in advance → pre-warm.

Combine patterns — they're not exclusive. The worst option is none.


Reader Discussion

8 replies

  1. Highlighted by author
    Elena Ricci · Platform Eng · Booking infra · From experience

    XFetch quietly killed our daily cache stampede. 6h TTL on a product catalog, three-instance API, used to brown-out for 90 seconds every refresh. Shipped XFetch on a Friday afternoon and forgot it existed. That's the highest praise I can give a fix.

    Sep 03, 2025·2 days later
  2. Luca Bianchi · Tech Lead · Pushback

    fwiw, overused hash tags are a real footgun. Our cluster once had a single slot eating 41% of traffic because someone thought using {tenant} as a key prefix was a good idea. Cluster slowlog went from 12ms to 800ms overnight. A cluster rebalance couldn't save us because it was all in the same slot.

    Sep 07, 2025·6 days later·edited
  3. Huyền Lê · Software Engineer · Agrees

    wrote a postmortem titled 'WAIT did not wait', then a week later ran into this passage in the post. laughed until I cried. the part about WAIT not being a consensus primitive needs to be highlighted in red in the official docs.

    Sep 04, 2025·3 days later
  4. Amir Shah · Infra · Asks

    Q: pre-warm hot keys — internal cron inside the app vs external scheduler (k8s cronjob etc)? We've shipped both. Internal is simpler but you fight clock skew across replicas; external is reliable but adds a moving piece.

    Sep 05, 2025·4 days later
    • Minh Le · Author

      External, every time. The number of "why is the warmup not running" tickets I've seen with internal crons is not funny anymore. Make it boring infra.

      Sep 06, 2025
    • Carla Pérez · Backend

      we do external + a redis lock so only one instance actually runs the warmup. simple and observable.

      Sep 07, 2025
  5. Mark Vandermeer · Infra Engineer · Pushback

    RDB + AOF on the same instance is not a 'belt and suspenders' move btw — fsync-on-rewrite collisions can make latency vibrate. Pick one and tune it.

    Sep 09, 2025·1 week later
  6. Léa Dubois · SRE · Asks

    any chance you'd publish these as a PDF collection? would love to print and read offline on flights. screen-fatigue is real.

    Sep 07, 2025·6 days later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.
