Vishal.dev
Backend

Redis Caching That Doesn't Rot Your System From the Inside

April 2, 2026 3 min read
Redis, Caching, Performance

Caching is the easiest performance win and the easiest way to introduce bugs that manifest days later in production. Invalidation is the hard part. This post covers the strategies that keep Redis from becoming a liability.

TTL Is Not a Strategy

Setting a blanket 5-minute TTL on every cache key is how most systems start. It works until you have a cache hit serving stale data for 4 minutes and 59 seconds while your users see outdated pricing, broken state, or incorrect counts.

The problem isn't TTL itself — it's treating TTL as the only invalidation mechanism. TTL should be the last line of defense, not the primary invalidation strategy.

Namespaced Key Design

The single most impactful change you can make to a Redis cache is switching from flat keys to namespaced, versioned keys:

// Instead of:
user:42.profile

// Use:
user:v2:42.profile

// Or for collections:
org:abc:v7:team.list

When the profile data model changes, bump the version. Old keys expire naturally via TTL while new requests populate fresh keys. This eliminates the need to flush entire databases during deployments.
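Centralizing key construction keeps the version bump to a one-line change. The helper below is a minimal sketch of that idea; `SCHEMA_VERSIONS` and `cacheKey` are hypothetical names, and the version numbers are placeholders.

```javascript
// Hypothetical schema versions, one per entity type. Bump a number when that
// entity's cached shape changes.
const SCHEMA_VERSIONS = { user: 2, article: 1 };

// Builds keys in the `entity:vN:id.field` layout shown above.
function cacheKey(entity, id, field) {
  const version = SCHEMA_VERSIONS[entity];
  if (version === undefined) {
    throw new Error(`No cache schema version registered for "${entity}"`);
  }
  return `${entity}:v${version}:${id}.${field}`;
}
```

With this in place, `cacheKey('user', 42, 'profile')` yields `user:v2:42.profile`, and no call site needs to know the current version.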

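Version namespacing also makes debugging easier: you can inspect user:v2:* on its own to see what's in cache, instead of user:*, which mixes versions. On a production instance, iterate with SCAN and a MATCH pattern rather than KEYS, because KEYS blocks the server while it walks the entire keyspace.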
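A sketch of that inspection done incrementally with SCAN. It assumes an ioredis-style client whose `scan()` resolves to `[nextCursor, keys]`; other clients shape this call differently, and `scanKeys` is a hypothetical helper name.

```javascript
// Collects keys matching a pattern with cursor-based SCAN.
// Assumption: `redis.scan(cursor, 'MATCH', pattern, 'COUNT', n)` resolves to
// [nextCursor, keys], as in ioredis.
async function scanKeys(redis, pattern, count = 100) {
  const found = new Set(); // SCAN may return the same key more than once
  let cursor = '0';
  do {
    const [next, keys] = await redis.scan(cursor, 'MATCH', pattern, 'COUNT', count);
    keys.forEach((key) => found.add(key));
    cursor = next;
  } while (cursor !== '0'); // a returned cursor of '0' ends the iteration
  return [...found];
}
```

This is meant for ad hoc debugging. On a large keyspace it still touches every key, just in small batches.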

Targeted Invalidation Patterns

Write-Through with Tagged Invalidation

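When a record is updated, invalidate or update the cache in the same code path, immediately after the database write commits. Redis and your database do not share a transaction, so invalidating before the commit leaves a window where a concurrent read can repopulate the cache with the old row. Tag each cache entry with related entity types so you can invalidate broadly without guessing: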

// On article update:
await redis.del([
  `article:${id}.detail`,
  `article:${id}.comments`,
  `feed:${authorId}.articles`
]);

This is predictable, debuggable, and doesn't require background jobs or delayed expiration.
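The explicit key list above works when the writer knows every dependent key. One possible way to implement the tagging is to keep a Redis set per tag that records which keys carry it. The sketch below assumes an ioredis-style client (`set`, `sadd`, `smembers`, variadic `del`); the `tag:` prefix and both function names are illustrative.

```javascript
// Writes a cache entry and registers it under each tag.
// Assumption: ioredis-style `set(key, value, 'EX', seconds)`.
async function setWithTags(redis, key, value, ttlSeconds, tags) {
  await redis.set(key, JSON.stringify(value), 'EX', ttlSeconds);
  for (const tag of tags) {
    await redis.sadd(`tag:${tag}`, key); // remember which keys carry this tag
  }
}

// Deletes every entry registered under a tag, plus the tag set itself.
async function invalidateTag(redis, tag) {
  const tagKey = `tag:${tag}`;
  const keys = await redis.smembers(tagKey);
  await redis.del(...keys, tagKey); // DEL ignores keys that already expired
  return keys.length;
}
```

On an article update this becomes `await invalidateTag(redis, 'article:' + id)`. Tag sets are not given a TTL here, so they can hold names of entries that have already expired; deleting those is harmless.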

Cache-aside with Stale-While-Revalidate

For high-traffic endpoints where cache misses hurt, serve stale data while asynchronously refreshing:

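// refreshCache(key, fetchFn, ttlSeconds) is assumed to call fetchFn, store
// { value, expiresAt } under key with a TTL of ttlSeconds, and resolve to value.
async function getCached(key, fetchFn, ttlSeconds) {
  const data = await redis.get(key);
  if (data) {
    // expiresAt is an absolute timestamp in milliseconds since the epoch
    const parsed = JSON.parse(data);
    // Trigger a background refresh once less than 20% of the TTL remains
    if (parsed.expiresAt - Date.now() < ttlSeconds * 1000 * 0.2) {
      // Not awaited; log failures so they don't become unhandled rejections
      refreshCache(key, fetchFn, ttlSeconds).catch((err) => console.error(err));
    }
    return parsed.value;
  }
  // Cache miss: fetch, store, and return the fresh value
  return refreshCache(key, fetchFn, ttlSeconds);
}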

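This pattern absorbs traffic spikes without thundering herd problems, provided refreshCache deduplicates concurrent refreshes for the same key (for example with an in-flight promise map or a short-lived SET NX lock). Then one request triggers the refresh while others continue reading the stale value. Without that deduplication, every request that arrives inside the refresh window starts its own fetch.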
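The refresh function is not shown above, so here is one possible shape for it that shares a single refresh per key. It is a sketch under assumptions: an ioredis-style `set(key, value, 'EX', seconds)`, an entry stored as `{ value, expiresAt }` (an assumed field name), and a hypothetical function name `refreshOnce`. It only deduplicates within one Node.js process.

```javascript
// Refreshes currently running in this process, keyed by cache key.
const inFlight = new Map();

// Runs fetchFn at most once per key at a time; concurrent callers share
// the same promise. Across several processes a short-lived SET NX lock
// would be needed instead of this in-memory map.
function refreshOnce(redis, key, fetchFn, ttlSeconds) {
  if (inFlight.has(key)) return inFlight.get(key);

  const promise = (async () => {
    const value = await fetchFn();
    const entry = { value, expiresAt: Date.now() + ttlSeconds * 1000 };
    await redis.set(key, JSON.stringify(entry), 'EX', ttlSeconds);
    return value;
  })().finally(() => inFlight.delete(key)); // allow the next refresh once settled

  inFlight.set(key, promise);
  return promise;
}
```

Because cleanup is attached with `finally`, a failed fetch also clears the map entry, so the next request can retry instead of reusing a rejected promise.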

What Not to Cache

Some data should never go through a cache layer:

  • Locks and semaphores: Redis SET NX for distributed locks is fine, but don't cache the lock state — you need atomic reads.
  • Rate limiter counters: Use INCR with EXPIRE directly, not cached snapshots.
  • WebSocket session state: In-memory staleness causes ghost connections and missed messages.
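For the rate limiter case, counting directly in Redis can look like the fixed-window sketch below. It assumes an ioredis-style client exposing `incr()` and `expire()`; the key layout, default limits, and the name `allowRequest` are illustrative.

```javascript
// Fixed-window limiter that counts in Redis itself, with no cached snapshot.
async function allowRequest(redis, clientId, limit = 100, windowSeconds = 60) {
  const key = `ratelimit:${clientId}`;
  const count = await redis.incr(key); // INCR is atomic on the server
  if (count === 1) {
    // First request in this window: start the expiry clock.
    await redis.expire(key, windowSeconds);
  }
  return count <= limit;
}
```

INCR and EXPIRE are two separate round trips here. If the process dies between them the counter never expires, so a stricter version would wrap both in a Lua script or a MULTI block.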

Monitoring Cache Health

Track these metrics to know when your cache is working versus hiding problems:

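  1. Hit rate per key pattern. If user:* has a 95% hit rate but search:* has 20%, the search cache is wasting memory. Redis itself only reports global keyspace_hits and keyspace_misses (in INFO stats), so per-pattern rates have to be counted in the application.
  2. Eviction rate. Watch evicted_keys in INFO stats. Sudden spikes mean your working set exceeds maxmemory. Add memory or change the eviction policy.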
  3. Stale serve count. Track how often stale-while-revalidate serves stale data. If it's above 5% of total requests, your TTL is too short or your refresh is too slow.
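Per-pattern hit rate and stale serve count can be tracked with a small in-process counter before being exported to whatever metrics system is in use. The sketch below is one possible shape; `CacheStats` is a hypothetical class, and grouping keys by the text before the first `:` is an assumption about the key layout.

```javascript
const OUTCOMES = ['hit', 'stale', 'miss'];

class CacheStats {
  constructor() {
    this.byPattern = new Map();
  }

  // Records one lookup. A 'stale' outcome is a hit that was served past
  // the refresh threshold.
  record(key, outcome) {
    if (!OUTCOMES.includes(outcome)) throw new Error(`Unknown outcome: ${outcome}`);
    const pattern = `${key.split(':')[0]}:*`;
    const counts = this.byPattern.get(pattern) || { hit: 0, stale: 0, miss: 0 };
    counts[outcome] += 1;
    this.byPattern.set(pattern, counts);
  }

  // Returns { hitRate, staleRate } for a pattern, or null if nothing was recorded.
  rates(pattern) {
    const c = this.byPattern.get(pattern);
    if (!c) return null;
    const total = c.hit + c.stale + c.miss;
    return { hitRate: (c.hit + c.stale) / total, staleRate: c.stale / total };
  }
}
```

Calling `stats.record(key, 'stale')` at the point where a stale value is returned gives the number needed to check the 5% threshold mentioned above.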
A cache that nobody monitors is just technical debt with a fast response time. Metrics, alerts, and clean invalidation logic separate caching from cargo-culting.