Design a Twitter-like News Feed
Design a Twitter-style news feed. The fanout-on-write vs fanout-on-read trade-off sits at the heart of most social feed systems. Get this right and you understand how feeds at scale actually work.
The problem
Users post tweets. Users follow other users. When a user opens the app, show them a reverse-chronological feed of tweets from people they follow. Stretch: ranked feed, retweets, media, real-time updates.
Numbers to reason about
Assume 300 million daily active users who each follow 200 accounts on average, 100 million tweets per day, and five feed reads per user per day. That is 1.5 billion feed reads daily, around 17 thousand per second. The write volume is much smaller (about 1,200 tweets per second), but the amplification is massive: with 200 follows per user, the average tweet reaches about 200 followers, and a celebrity's tweet reaches millions.
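The arithmetic behind these figures can be checked with a short script. This is a back-of-envelope sketch only; the step from "200 follows per user" to "about 200 followers per author" is an assumption that treats the follow graph as closed over the same user population.

```python
# Back-of-envelope load estimate using the figures stated above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 300_000_000
avg_follows = 200
tweets_per_day = 100_000_000
feed_reads_per_user_per_day = 5

feed_reads_per_day = daily_active_users * feed_reads_per_user_per_day
feed_reads_per_sec = feed_reads_per_day / SECONDS_PER_DAY
tweets_per_sec = tweets_per_day / SECONDS_PER_DAY

# Assumption: an average of 200 follows per user implies an average of
# about 200 followers per author, so each tweet fans out to ~200 inboxes.
inbox_writes_per_sec = tweets_per_sec * avg_follows

# Pull model: each feed read touches ~200 followed timelines.
timeline_lookups_per_sec = feed_reads_per_sec * avg_follows

print(f"feed reads/sec:       {feed_reads_per_sec:,.0f}")
print(f"tweets/sec:           {tweets_per_sec:,.0f}")
print(f"inbox writes/sec:     {inbox_writes_per_sec:,.0f}")
print(f"timeline lookups/sec: {timeline_lookups_per_sec:,.0f}")
```

Under these assumptions the system sees roughly 17 thousand feed reads and 1.2 thousand tweets per second. Pushing costs on the order of 230 thousand inbox writes per second, while pulling costs around 3.5 million timeline lookups per second.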
The two strategies
Fanout on read (pull model)
When a user opens their feed, query "give me the latest tweets from the 200 people they follow", merge, sort, return. This is simple and storage-efficient: each tweet is stored once. The downside is that feed reads are expensive, because each one fans out to about 200 timeline lookups on hot data and merges the results on the fly. At 17K reads per second that is roughly 3.5 million lookups per second, which will overwhelm the tweet store.
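A minimal sketch of the pull model, using in-memory dictionaries as hypothetical stand-ins for the tweet store and the follow graph. The names `timelines`, `follows`, and `read_feed_pull` are illustrative, not part of any real API. Each author's timeline is assumed to be stored newest first, so the read path is a k-way merge.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the tweet store and follow graph.
# Each author's timeline holds (timestamp, tweet_id) pairs, newest first.
timelines = {
    "alice": [(105, "a2"), (100, "a1")],
    "bob": [(103, "b1")],
    "carol": [(110, "c2"), (101, "c1")],
}
follows = {"dave": ["alice", "bob", "carol"]}


def read_feed_pull(user, limit=100):
    """Fanout on read: merge the followed authors' timelines at read time."""
    per_author = [timelines.get(author, []) for author in follows.get(user, [])]
    # heapq.merge with reverse=True expects every input sorted largest-first
    # and lazily yields a single stream that is also largest-first.
    merged = heapq.merge(*per_author, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

Calling `read_feed_pull("dave")` touches one timeline per followed author. That per-read cost is exactly what becomes a problem at 200 follows and thousands of reads per second.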
Fanout on write (push model)
When a user posts, copy the tweet ID to the feed inbox of every follower. Reading the feed is now just "give me the last 100 entries from my inbox", which is one fast query. The cost moves to the write path: both write load and storage grow with follower count, so a tweet from someone with 10 million followers triggers 10 million inbox writes.
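A minimal sketch of the push model, again with hypothetical in-memory structures (`followers`, `inboxes`) standing in for the graph service and the inbox store. `post_tweet` returns the number of inbox writes, which makes the write amplification visible.

```python
from collections import defaultdict

# Hypothetical stand-ins: author -> list of followers, user -> inbox.
followers = {"alice": ["bob", "carol", "dave"], "bob": ["alice"]}
inboxes = defaultdict(list)  # tweet IDs, newest first


def post_tweet(author, tweet_id):
    """Fanout on write: copy the tweet ID into every follower's inbox."""
    targets = followers.get(author, [])
    for follower in targets:
        inboxes[follower].insert(0, tweet_id)  # prepend so newest stays first
    return len(targets)  # one inbox write per follower


def read_feed_push(user, limit=100):
    """Reading is a single slice of the precomputed inbox."""
    return inboxes.get(user, [])[:limit]
```

Here a tweet from `alice` costs three writes and a read is one lookup. With 10 million followers the same loop would perform 10 million writes for a single tweet.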
The hybrid approach (what Twitter actually did)
Use fanout on write for normal users: most people have under 1000 followers, so fanout is cheap. For celebrities (millions of followers), use fanout on read instead. When any user reads their feed, most of it comes from their inbox; a separate query fetches recent tweets from the small set of celebrities they follow, and the two results are merged.
The threshold is tunable. Twitter's old number was reportedly around 10K followers. Below that, fanout. Above, pull. This avoids the worst case of either model.
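The hybrid read path can be sketched as a merge of the precomputed inbox with the timelines of followed celebrities. The data structures and the names `is_celebrity` and `read_feed_hybrid` are hypothetical; the threshold uses the 10K figure mentioned above and would be tuned in practice.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # tunable; the figure cited above

# Hypothetical in-memory stand-ins for the graph, inbox, and tweet stores.
follower_counts = {"alice": 120, "bob": 45, "star": 5_000_000}
follows = {"dave": ["alice", "bob", "star"]}
# Inbox filled by fanout on write (non-celebrity authors only), newest first.
inboxes = {"dave": [(105, "a2"), (103, "b1"), (100, "a1")]}
# Celebrity tweets stay in the author's own timeline, newest first.
timelines = {"star": [(110, "s2"), (101, "s1")]}


def is_celebrity(author):
    return follower_counts.get(author, 0) >= CELEBRITY_THRESHOLD


def read_feed_hybrid(user, limit=100):
    """Merge the pushed inbox with timelines pulled from followed celebrities."""
    pushed = inboxes.get(user, [])
    pulled = [
        timelines.get(author, [])
        for author in follows.get(user, [])
        if is_celebrity(author)
    ]
    # All inputs are newest-first, so a reverse merge keeps that order.
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

The read now costs one inbox lookup plus one lookup per followed celebrity, which is typically a handful rather than 200.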
Architecture
- Tweet service: stores tweets in a sharded data store, sharded by user_id.
- User graph service: stores follow relationships.
- Fanout service: a queue consumer that, on every tweet by a non-celebrity, pushes the tweet ID to each follower's inbox.
- Inbox: a per-user list capped at 800 entries (the typical max feed depth a user scrolls), stored in Redis or a similar fast KV store.
- Feed service: reads the inbox plus celebrity tweets and assembles the response.
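The fanout service and the capped inbox can be sketched together. This is an in-process illustration under stated assumptions: `queue.Queue` stands in for a real message queue, a bounded `deque` stands in for a Redis list (where the equivalent would be an `LPUSH` followed by `LTRIM` to the cap), and `publish` and `drain_fanout_queue` are hypothetical names. The 10K celebrity threshold is likewise an assumption.

```python
from collections import defaultdict, deque
from queue import Queue

INBOX_CAP = 800  # inbox cap from the design above
CELEBRITY_THRESHOLD = 10_000  # assumed cutoff; tunable

# Hypothetical follow graph: author -> list of followers.
followers = {
    "alice": ["bob", "carol"],
    "star": [f"fan{i}" for i in range(10_000)],
}
# A deque with maxlen drops entries from the far end once it is full,
# so appendleft keeps the newest INBOX_CAP tweet IDs.
inboxes = defaultdict(lambda: deque(maxlen=INBOX_CAP))
tweet_queue = Queue()  # stand-in for a durable message queue


def publish(author, tweet_id):
    """Tweet service side: enqueue the new tweet for fanout."""
    tweet_queue.put((author, tweet_id))


def drain_fanout_queue():
    """Fanout service side: process queued tweets, return inbox writes done."""
    writes = 0
    while not tweet_queue.empty():
        author, tweet_id = tweet_queue.get()
        targets = followers.get(author, [])
        if len(targets) >= CELEBRITY_THRESHOLD:
            continue  # celebrity tweet: pulled at read time instead
        for follower in targets:
            inboxes[follower].appendleft(tweet_id)
            writes += 1
    return writes
```

Decoupling `publish` from the fanout loop means a post returns as soon as the tweet is queued, and the inbox writes happen asynchronously.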
Other interesting bits
- Feed size cap: Cap inboxes at several hundred entries (800 in the architecture above). Older tweets fall off; if a user wants ancient history, do a fallback query against the tweet store.
- Real-time updates: Use websockets or long-polling. A new tweet from someone you follow either triggers a notification or is prepended to the open feed.
- Ranking: Reverse-chronological is easy. Engagement-ranked feeds (Twitter's "Home" tab) require an ML scoring service that re-ranks each pull. That is a whole other system.
- Media: Images and videos go to object storage (S3) with a CDN in front. Tweets only store the URL.
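The fallback mentioned under the feed size cap can be sketched as cursor-based paging: serve from the inbox while it covers the requested range, and ask the tweet store for the remainder once the user scrolls past it. Everything here is hypothetical, including the `read_page` signature, the `fetch_older` callback, and the tiny `archive` used as a stand-in for the tweet store.

```python
# Hypothetical tweet store contents: (timestamp, tweet_id), newest first.
archive = [(ts, f"t{ts}") for ts in range(10, 0, -1)]
# Pretend the inbox cap is 4, so only the four newest entries remain.
inbox = archive[:4]


def fetch_older(before_ts, count):
    """Stand-in for the fallback query against the tweet store."""
    older = [e for e in archive if before_ts is None or e[0] < before_ts]
    return older[:count]


def read_page(inbox, fetch_older, before_ts=None, limit=20):
    """Return one page of entries older than before_ts (None = newest)."""
    entries = [e for e in inbox if before_ts is None or e[0] < before_ts]
    page = entries[:limit]
    if len(page) < limit:
        # Inbox exhausted: continue from the last entry served so far.
        cursor = page[-1][0] if page else before_ts
        page += fetch_older(cursor, limit - len(page))
    return page
```

The first pages are served entirely from the inbox; only deep scrolling pays for the slower query.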