Caching Strategies
A cache stores recent or popular data closer to where it is used. Pick the right read and write strategy and you turn a slow database into a fast service. Pick the wrong one and you serve stale data, or lose writes.
Why we cache
Memory access is roughly 100x faster than SSD, which is roughly 100x faster than disk. A network call to your database might take 5-10ms. A cache lookup is sub-millisecond. If you can serve a request from cache instead of the database, the request is 10x to 100x faster, and your database does 10x to 100x less work.
Caching is the single highest-leverage performance technique in system design. The interview answer "we'd cache that" is right surprisingly often. The follow-up question is always "how exactly?". That's what this topic covers.
Where caches live
- Browser cache. The client itself stores responses.
- CDN cache. Covered separately. Edge-of-network.
- Reverse proxy cache. NGINX, Varnish.
- Application-level cache. Redis, Memcached. The "cache layer" you usually mean.
- In-process cache. A hash map in your app. Fastest, smallest, lost on restart.
- Database cache. The DB's own buffer pool. Mostly invisible to you.
The five read/write patterns
1. Cache-aside (lazy loading)
The application checks the cache first. If miss, it reads from the database, then populates the cache. Writes go directly to the database; the cache is invalidated.
```python
def get_user(key):
    data = cache.get(key)
    if data is None:                    # miss
        data = db.query(key)            # fall back to the database
        cache.set(key, data, ttl=300)   # populate for the next reader
    return data
```
Most popular pattern. Simple. Resilient (cache failure does not break reads, just makes them slower). Risk: cache and database can drift if invalidation is missed.
2. Read-through
The cache layer itself reads from the database on a miss. The application talks only to the cache. Simpler app code, but the cache library has to support it (DynamoDB DAX, Hazelcast).
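A minimal sketch of what a read-through layer does internally, assuming an in-memory dict standing in for the cache store and a `loader` callback standing in for the database query (both names are illustrative):

```python
# Read-through sketch: the cache object owns the miss path, so
# application code only ever talks to the cache.
class ReadThroughCache:
    def __init__(self, loader):
        self._store = {}          # stand-in for the cache store
        self._loader = loader     # called on a miss to fetch from the DB

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # cache fills itself
        return self._store[key]

db = {"user:1": {"name": "Ada"}}
cache = ReadThroughCache(loader=lambda k: db.get(k))

cache.get("user:1")   # miss: the cache loads from db, then stores it
cache.get("user:1")   # hit: served from the cache, db untouched
```

The application-facing difference from cache-aside is only who owns the miss path: here the cache does, so the calling code shrinks to a single `get`.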
3. Write-through
Every write goes to the cache and the database synchronously. The cache is always fresh. Cost: writes are slower (two systems must succeed).
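A write-through sketch, again with dicts standing in for the cache and the database. The point is that both writes happen on the same synchronous path, so a read from the cache is never stale:

```python
# Write-through sketch: every write updates both stores before the
# call returns, so the cache is always fresh.
class WriteThroughCache:
    def __init__(self):
        self.store = {}   # stand-in for the cache
        self.db = {}      # stand-in for the database

    def set(self, key, value):
        self.db[key] = value     # both writes must succeed
        self.store[key] = value  # before the caller continues

    def get(self, key):
        return self.store.get(key)

c = WriteThroughCache()
c.set("price:42", 19.99)
c.get("price:42")   # fresh: the cache was updated on the write path
```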
4. Write-behind (write-back)
Writes go to the cache. The cache asynchronously flushes to the database. Fast writes. Risk: cache crash between write and flush loses data.
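A write-behind sketch. A real implementation flushes from a timer or worker thread; here the flush is an explicit call so the lag (and the data-loss window) is visible:

```python
import collections

# Write-behind sketch: writes land in the cache immediately and are
# queued; a background job (simulated by flush()) drains the queue to
# the database later. A crash before flush() loses the queued writes.
class WriteBehindCache:
    def __init__(self):
        self.store = {}
        self.db = {}
        self._dirty = collections.deque()

    def set(self, key, value):
        self.store[key] = value     # fast: only the cache is touched
        self._dirty.append(key)

    def flush(self):                # in production: timer or worker thread
        while self._dirty:
            key = self._dirty.popleft()
            self.db[key] = self.store[key]

c = WriteBehindCache()
c.set("counter", 7)
assert "counter" not in c.db   # the DB lags the cache...
c.flush()
assert c.db["counter"] == 7    # ...until the async flush runs
```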
5. Write-around
Writes go straight to the database, skipping the cache. Cache is populated only on read. Useful when written data is rarely read soon after (event logs).
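A write-around sketch, paired with a cache-aside read path, since write-around only describes the write side:

```python
# Write-around sketch: writes bypass the cache entirely; the cache is
# populated lazily by the read path.
cache, db = {}, {}

def write(key, value):
    db[key] = value            # cache is deliberately skipped
    cache.pop(key, None)       # drop any stale copy that exists

def read(key):
    if key not in cache:
        cache[key] = db.get(key)   # populate on first read
    return cache[key]

write("log:1", "started")
assert "log:1" not in cache    # nothing cached until someone reads it
read("log:1")                  # now it is
```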
Pitfalls every cache designer hits
Cache stampede (thundering herd)
A popular cache key expires. Suddenly 10,000 requests miss simultaneously and all hit the database. The DB falls over. Mitigations: lock around cache fills (only one request rebuilds), early refresh (refresh before expiry while still serving stale), or probabilistic expiry.
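The lock-around-fill mitigation can be sketched as below. The double-check after acquiring the lock is the crucial line: waiters that queued up behind the first rebuilder find the key already filled and skip the database entirely. (Dicts stand in for the cache and database; the query counter is there to make the effect measurable.)

```python
import threading

db_queries = 0
cache, db = {}, {"hot": "value"}
fill_lock = threading.Lock()

def get(key):
    global db_queries
    if key in cache:
        return cache[key]
    with fill_lock:                 # only one request rebuilds the key
        if key not in cache:        # double-check: it may be filled now
            db_queries += 1
            cache[key] = db[key]    # the single expensive DB read
    return cache[key]

# Simulate the herd: 100 concurrent requests for one expired key.
threads = [threading.Thread(target=get, args=("hot",)) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(db_queries)   # 1 — the herd never reached the database
```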
Cache penetration
Attackers hit your system with random keys that don't exist. Every request misses cache and hits the database. Mitigation: cache the "not found" answer too, with short TTL.
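Negative caching can be sketched with a sentinel object marking "we looked, it isn't there" (the sentinel and counter are illustrative; a real cache would store the negative entry with a short TTL):

```python
_MISSING = object()          # sentinel: key confirmed absent
cache, db = {}, {"real": 1}
db_queries = 0

def get(key):
    global db_queries
    if key in cache:
        value = cache[key]
        return None if value is _MISSING else value
    db_queries += 1
    value = db.get(key, _MISSING)   # cache the miss too (short TTL)
    cache[key] = value
    return None if value is _MISSING else value

get("no-such-key")
get("no-such-key")
print(db_queries)   # 1 — the second probe hit the cached "not found"
```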
Cache breakdown
A single hot key expires, and the flood of requests for that one key hammers the DB until the entry is refilled. It is the stampede problem concentrated on one key. Mitigations: never expire hot keys, refresh them in the background, or use a write-through pattern for them.
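One way the "never expire, refresh in background" mitigation might look (an illustrative sketch, not a specific library's API): the hot entry has no TTL, and a periodic job recomputes it in place while readers keep being served.

```python
# Background-refresh sketch for a hot key: no TTL, so readers never
# miss; a worker/cron job replaces the value in place, and the DB is
# never hit on the request path.
cache = {"hot": 41}                 # seeded once, never expires

def refresh(key, load):             # run from a worker, not a request
    cache[key] = load()             # swap in place; readers keep hitting

def get(key):
    return cache[key]               # always a hit for hot keys

refresh("hot", lambda: 42)          # the periodic recompute
print(get("hot"))                   # 42
```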
Stale data
The DB has new data; the cache has old. User sees the old. The classic invalidation problem. Two approaches: short TTLs (eventual consistency, simpler), or explicit invalidation on write (consistent, more complex).
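The explicit-invalidation approach pairs naturally with cache-aside: the write path deletes the cache entry after the database write, so the next read repopulates from the fresh row. A minimal sketch with dicts standing in for both stores:

```python
cache, db = {"user:1": "old"}, {"user:1": "old"}

def update(key, value):
    db[key] = value         # commit to the database first
    cache.pop(key, None)    # then invalidate the cached copy

def read(key):              # plain cache-aside read path
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

update("user:1", "new")
print(read("user:1"))   # "new" — the stale copy was dropped on write
```

The ordering matters: invalidate after the DB write commits, or a concurrent read can repopulate the cache with the old value.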
What to actually cache
Look for queries that:
- Are read often, written rarely.
- Are expensive to compute (joins, aggregations).
- Are the same for many users.
Don't cache things that are different per request, change every second, or are cheap to compute. Caching everything is a real anti-pattern.
Reasonable defaults
| Data type | TTL | Strategy |
|---|---|---|
| User profile | 10 min | Cache-aside, invalidate on update |
| Product catalog | 1 hour | Cache-aside |
| Session | 30 min sliding | Write-through |
| Authentication token | 5 min | Cache-aside, never longer than the token's own lifetime |
| Computed feed | 1 min | Cache-aside, eventual consistency |
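The "30 min sliding" session row means every access pushes the expiry forward, so active sessions stay alive while idle ones age out. A sketch of the mechanism (times in seconds for the demo; with Redis you would reset the TTL with `EXPIRE` on each hit):

```python
import time

TTL = 30.0
sessions = {}   # key -> (value, expires_at)

def put(key, value):
    sessions[key] = (value, time.monotonic() + TTL)

def get(key):
    entry = sessions.get(key)
    if entry is None or entry[1] < time.monotonic():
        sessions.pop(key, None)      # expired: treat as a miss
        return None
    put(key, entry[0])               # slide the window on every hit
    return entry[0]

put("sess:abc", {"user": 1})
get("sess:abc")   # hit; expiry pushed another TTL into the future
```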
Caching is one of those topics where the basic ideas are easy but the operational nuance is everything. Start with cache-aside. Watch for stampedes. Plan invalidation explicitly. Most outage stories involving "the cache" are actually stories of weak invalidation strategies.