Caching Strategies
A cache stores recent or popular data closer to where it is used. Pick the right read and write strategy and you turn a slow database into a fast service. Pick the wrong one and you serve stale data, or lose writes.
Why we cache
Memory access is roughly 100x faster than SSD, which is roughly 100x faster than disk. A network call to your database might take 5-10ms. A cache lookup is sub-millisecond. If you can serve a request from cache instead of the database, the request is 10x to 100x faster, and your database does 10x to 100x less work.
Caching is the single highest-leverage performance technique in system design. The interview answer "we'd cache that" is right surprisingly often. The follow-up question is always "how exactly?". That's what this topic covers.
Where caches live
- Browser cache. The client itself stores responses.
- CDN cache. Covered separately. Edge-of-network.
- Reverse proxy cache. NGINX, Varnish.
- Application-level cache. Redis, Memcached. The "cache layer" you usually mean.
- In-process cache. A hash map in your app. Fastest, smallest, lost on restart.
- Database cache. The DB's own buffer pool. Mostly invisible to you.
The five read/write patterns
1. Cache-aside (lazy loading)
The application checks the cache first. If miss, it reads from the database, then populates the cache. Writes go directly to the database; the cache is invalidated.
```python
def get_user(key):
    data = cache.get(key)
    if data is None:                    # miss
        data = db.query(key)            # fall back to the database
        cache.set(key, data, ttl=300)   # populate for the next reader
    return data
```
Most popular pattern. Simple. Resilient (cache failure does not break reads, just makes them slower). Risk: cache and database can drift if invalidation is missed.
2. Read-through
The cache layer itself reads from the database on a miss. The application talks only to the cache. Simpler app code, but the cache library has to support it (DynamoDB DAX, Hazelcast).
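A minimal sketch of what a read-through layer does internally, assuming an in-memory dict standing in for the cache store and a `loader` callback standing in for the database query (both names are illustrative):

```python
# Read-through sketch: the cache object owns the miss path, so
# application code only ever talks to the cache.
class ReadThroughCache:
    def __init__(self, loader):
        self._store = {}          # stand-in for the cache store
        self._loader = loader     # called on a miss to fetch from the DB

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # cache fills itself
        return self._store[key]

db = {"user:1": {"name": "Ada"}}
cache = ReadThroughCache(loader=lambda k: db.get(k))

cache.get("user:1")   # miss: the cache loads from db, then stores it
cache.get("user:1")   # hit: served from the cache, db untouched
```

The application-facing difference from cache-aside is only who owns the miss path: here the cache does, so the calling code shrinks to a single `get`.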
3. Write-through
Every write goes to the cache and the database synchronously. The cache is always fresh. Cost: writes are slower (two systems must succeed).
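A write-through sketch, again with dicts standing in for the cache and the database. The point is that both writes happen on the same synchronous path, so a read from the cache is never stale:

```python
# Write-through sketch: every write updates both stores before the
# call returns, so the cache is always fresh.
class WriteThroughCache:
    def __init__(self):
        self.store = {}   # stand-in for the cache
        self.db = {}      # stand-in for the database

    def set(self, key, value):
        self.db[key] = value     # both writes must succeed
        self.store[key] = value  # before the caller continues

    def get(self, key):
        return self.store.get(key)

c = WriteThroughCache()
c.set("price:42", 19.99)
c.get("price:42")   # fresh: the cache was updated on the write path
```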
4. Write-behind (write-back)
Writes go to the cache. The cache asynchronously flushes to the database. Fast writes. Risk: cache crash between write and flush loses data.
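A write-behind sketch. A real implementation flushes from a timer or worker thread; here the flush is an explicit call so the lag (and the data-loss window) is visible:

```python
import collections

# Write-behind sketch: writes land in the cache immediately and are
# queued; a background job (simulated by flush()) drains the queue to
# the database later. A crash before flush() loses the queued writes.
class WriteBehindCache:
    def __init__(self):
        self.store = {}
        self.db = {}
        self._dirty = collections.deque()

    def set(self, key, value):
        self.store[key] = value     # fast: only the cache is touched
        self._dirty.append(key)

    def flush(self):                # in production: timer or worker thread
        while self._dirty:
            key = self._dirty.popleft()
            self.db[key] = self.store[key]

c = WriteBehindCache()
c.set("counter", 7)
assert "counter" not in c.db   # the DB lags the cache...
c.flush()
assert c.db["counter"] == 7    # ...until the async flush runs
```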
5. Write-around
Writes go straight to the database, skipping the cache. Cache is populated only on read. Useful when written data is rarely read soon after (event logs).
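A write-around sketch, paired with a cache-aside read path, since write-around only describes the write side:

```python
# Write-around sketch: writes bypass the cache entirely; the cache is
# populated lazily by the read path.
cache, db = {}, {}

def write(key, value):
    db[key] = value            # cache is deliberately skipped
    cache.pop(key, None)       # drop any stale copy that exists

def read(key):
    if key not in cache:
        cache[key] = db.get(key)   # populate on first read
    return cache[key]

write("log:1", "started")
assert "log:1" not in cache    # nothing cached until someone reads it
read("log:1")                  # now it is
```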
Pitfalls every cache designer hits
Cache stampede (thundering herd)
A popular cache key expires. Suddenly 10,000 requests miss simultaneously and all hit the database. The DB falls over. Mitigations: lock around cache fills (only one request rebuilds), early refresh (refresh before expiry while still serving stale), or probabilistic expiry.
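The lock-around-fill mitigation can be sketched as below. The double-check after acquiring the lock is the crucial line: waiters that queued up behind the first rebuilder find the key already filled and skip the database entirely. (Dicts stand in for the cache and database; the query counter is there to make the effect measurable.)

```python
import threading

db_queries = 0
cache, db = {}, {"hot": "value"}
fill_lock = threading.Lock()

def get(key):
    global db_queries
    if key in cache:
        return cache[key]
    with fill_lock:                 # only one request rebuilds the key
        if key not in cache:        # double-check: it may be filled now
            db_queries += 1
            cache[key] = db[key]    # the single expensive DB read
    return cache[key]

# Simulate the herd: 100 concurrent requests for one expired key.
threads = [threading.Thread(target=get, args=("hot",)) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(db_queries)   # 1 — the herd never reached the database
```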
Cache penetration
Attackers hit your system with random keys that don't exist. Every request misses cache and hits the database. Mitigation: cache the "not found" answer too, with short TTL.
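Negative caching can be sketched with a sentinel object marking "we looked, it isn't there" (the sentinel and counter are illustrative; a real cache would store the negative entry with a short TTL):

```python
_MISSING = object()          # sentinel: key confirmed absent
cache, db = {}, {"real": 1}
db_queries = 0

def get(key):
    global db_queries
    if key in cache:
        value = cache[key]
        return None if value is _MISSING else value
    db_queries += 1
    value = db.get(key, _MISSING)   # cache the miss too (short TTL)
    cache[key] = value
    return None if value is _MISSING else value

get("no-such-key")
get("no-such-key")
print(db_queries)   # 1 — the second probe hit the cached "not found"
```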
Cache breakdown
A single hot key expires, and the flood of requests for that one key hammers the DB until the entry is refilled. It is the stampede problem concentrated on one key. Mitigations: never expire hot keys, refresh them in the background, or use a write-through pattern for them.
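One way the "never expire, refresh in background" mitigation might look (an illustrative sketch, not a specific library's API): the hot entry has no TTL, and a periodic job recomputes it in place while readers keep being served.

```python
# Background-refresh sketch for a hot key: no TTL, so readers never
# miss; a worker/cron job replaces the value in place, and the DB is
# never hit on the request path.
cache = {"hot": 41}                 # seeded once, never expires

def refresh(key, load):             # run from a worker, not a request
    cache[key] = load()             # swap in place; readers keep hitting

def get(key):
    return cache[key]               # always a hit for hot keys

refresh("hot", lambda: 42)          # the periodic recompute
print(get("hot"))                   # 42
```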
Stale data
The DB has new data; the cache has old. User sees the old. The classic invalidation problem. Two approaches: short TTLs (eventual consistency, simpler), or explicit invalidation on write (consistent, more complex).
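The explicit-invalidation approach pairs naturally with cache-aside: the write path deletes the cache entry after the database write, so the next read repopulates from the fresh row. A minimal sketch with dicts standing in for both stores:

```python
cache, db = {"user:1": "old"}, {"user:1": "old"}

def update(key, value):
    db[key] = value         # commit to the database first
    cache.pop(key, None)    # then invalidate the cached copy

def read(key):              # plain cache-aside read path
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

update("user:1", "new")
print(read("user:1"))   # "new" — the stale copy was dropped on write
```

The ordering matters: invalidate after the DB write commits, or a concurrent read can repopulate the cache with the old value.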
What to actually cache
Look for queries that:
- Are read often, written rarely.
- Are expensive to compute (joins, aggregations).
- Are the same for many users.
Don't cache things that are different per request, change every second, or are cheap to compute. Caching everything is a real anti-pattern.
Reasonable defaults
| Data type | TTL | Strategy |
|---|---|---|
| User profile | 10 min | Cache-aside, invalidate on update |
| Product catalog | 1 hour | Cache-aside |
| Session | 30 min sliding | Write-through |
| Authentication token | 5 min | Cache-aside, never longer than the token's own lifetime |
| Computed feed | 1 min | Cache-aside, eventual consistency |
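The "30 min sliding" session row means every access pushes the expiry forward, so active sessions stay alive while idle ones age out. A sketch of the mechanism (times in seconds for the demo; with Redis you would reset the TTL with `EXPIRE` on each hit):

```python
import time

TTL = 30.0
sessions = {}   # key -> (value, expires_at)

def put(key, value):
    sessions[key] = (value, time.monotonic() + TTL)

def get(key):
    entry = sessions.get(key)
    if entry is None or entry[1] < time.monotonic():
        sessions.pop(key, None)      # expired: treat as a miss
        return None
    put(key, entry[0])               # slide the window on every hit
    return entry[0]

put("sess:abc", {"user": 1})
get("sess:abc")   # hit; expiry pushed another TTL into the future
```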
Caching is one of those topics where the basic ideas are easy but the operational nuance is everything. Start with cache-aside. Watch for stampedes. Plan invalidation explicitly. Most outage stories involving "the cache" are actually stories of weak invalidation strategies.