Design WhatsApp / Messaging Service

Design WhatsApp. Persistent connections at massive scale, message ordering, delivery guarantees, online presence, end-to-end encryption. The classic real-time messaging problem.

The problem

Build a 1:1 and group messaging service. Messages must be delivered reliably even when recipients are offline. The service must support delivery receipts (sent, delivered, read), online presence, group chats of up to a few hundred members, and end-to-end encryption.

Scale numbers

WhatsApp handles roughly 100 billion messages per day, which averages out to about 1.2 million messages per second (100 billion divided by 86,400 seconds). It has around 2 billion users, with maybe 500 million online concurrently. Holding 500 million simultaneous connections open is the central engineering problem.
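The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted above; the constant and variable names are illustrative.

```python
# Back-of-envelope check of the scale numbers quoted above.
MESSAGES_PER_DAY = 100_000_000_000   # roughly 100 billion
TOTAL_USERS = 2_000_000_000          # roughly 2 billion
CONCURRENT_USERS = 500_000_000       # "maybe 500 million" online at once
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400

avg_messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY
online_fraction = CONCURRENT_USERS / TOTAL_USERS

print(f"average rate: {avg_messages_per_second:,.0f} messages/second")
print(f"online at once: {online_fraction:.0%} of users")
```

The average works out to about 1.16 million messages per second, which rounds to the 1.2 million quoted above. This is a daily average, so the busiest hours would presumably run higher.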

The connection layer

HTTP request/response is a poor fit here. Polling for new messages every few seconds wastes battery and adds latency. The right primitive is a persistent connection: a WebSocket, or a custom TCP protocol like WhatsApp's old XMPP variant. Each online user holds an open connection to a "chat server". When a message arrives for them, the server pushes it down that connection.

500 million concurrent connections divided across, say, 50K servers means 10K connections per server. Modern operating systems and language runtimes handle this comfortably (Erlang and Go both excel at it). The user-to-server mapping lives in a routing table: to send to user 42, look up which server holds 42's connection and forward the message there.
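A minimal in-memory sketch of that routing table follows. The class name, method names, and server IDs are hypothetical, and a real deployment would presumably keep this mapping in a shared, replicated store, not in a single process.

```python
from typing import Dict, Optional

TOTAL_CONNECTIONS = 500_000_000
SERVER_COUNT = 50_000
CONNECTIONS_PER_SERVER = TOTAL_CONNECTIONS // SERVER_COUNT  # 10,000


class RoutingTable:
    """Maps a user ID to the ID of the chat server holding that user's connection."""

    def __init__(self) -> None:
        self._user_to_server: Dict[int, str] = {}

    def register(self, user_id: int, server_id: str) -> None:
        # Called when a user connects; a reconnect overwrites the old entry.
        self._user_to_server[user_id] = server_id

    def unregister(self, user_id: int, server_id: str) -> None:
        # Remove only if the entry still points at this server, so a stale
        # disconnect cannot erase a newer connection made elsewhere.
        if self._user_to_server.get(user_id) == server_id:
            del self._user_to_server[user_id]

    def lookup(self, user_id: int) -> Optional[str]:
        # None means the user has no live connection (offline).
        return self._user_to_server.get(user_id)


table = RoutingTable()
table.register(42, "chat-server-2")
target = table.lookup(42)  # the server to forward user 42's messages to
```

The guarded `unregister` is a design choice for this sketch: when a phone hops networks, the new connection may register before the old server notices the disconnect.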

[Figure: Messaging System Architecture. Components shown: Alice, Chat Server 1 (Alice's connection, over websocket), Routing Table (user → server), Chat Server 2 (Bob's connection), Bob, Message Store (undelivered queue), Push Service (APNs / FCM), Presence Service (online / last seen). Caption: if recipient offline → store-and-forward + push notification when next online.]

Message flow for a 1:1 chat

  1. Alice sends a message to Chat Server 1 over her websocket.
  2. Server 1 looks up Bob in the routing table. Bob is connected to Server 2.
  3. Server 1 forwards the message to Server 2 over an internal RPC.
  4. Server 2 pushes the message down Bob's websocket.
  5. Bob's app sends an ACK back to Server 2, which relays it to Server 1. Server 1 then sends a "delivered" status to Alice.
  6. If Bob is offline, the message is stored in the per-user queue and a push notification (APNs/FCM) wakes his phone. When he comes online, the queue is drained.
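The steps above can be sketched as a small in-memory simulation. This is a toy model under stated assumptions: one object stands in for the whole cluster, the inter-server RPC and the ACK round trip are collapsed into direct method calls, and every class, method, and field name is hypothetical.

```python
from collections import defaultdict, deque


class MessagingCluster:
    """Toy model of the cluster: routing, delivery, offline queue, push."""

    def __init__(self):
        self.routing = {}                        # user -> chat server name
        self.delivered = defaultdict(list)       # user -> bodies pushed to their app
        self.offline_queue = defaultdict(deque)  # user -> undelivered messages
        self.push_notifications = []             # users woken via APNs/FCM
        self.receipts = defaultdict(list)        # sender -> (message id, status)

    def connect(self, user, server):
        self.routing[user] = server
        # Drain anything stored while the user was offline, oldest first.
        queue = self.offline_queue[user]
        while queue:
            self._push_to_recipient(queue.popleft())

    def disconnect(self, user):
        self.routing.pop(user, None)

    def send(self, msg_id, sender, recipient, body):
        message = {"id": msg_id, "sender": sender,
                   "recipient": recipient, "body": body}
        self.receipts[sender].append((msg_id, "sent"))
        if recipient in self.routing:
            # Steps 2-4: look up the recipient's server and push the message.
            self._push_to_recipient(message)
        else:
            # Step 6: store-and-forward plus a push notification.
            self.offline_queue[recipient].append(message)
            self.push_notifications.append(recipient)

    def _push_to_recipient(self, message):
        self.delivered[message["recipient"]].append(message["body"])
        # Step 5: the recipient's app ACKs, so the sender sees "delivered".
        self.receipts[message["sender"]].append((message["id"], "delivered"))


cluster = MessagingCluster()
cluster.connect("alice", "chat-server-1")
cluster.send(1, "alice", "bob", "hi Bob")      # Bob is offline: queued, push sent
cluster.connect("bob", "chat-server-2")        # queue drains on connect
cluster.send(2, "alice", "bob", "you there?")  # Bob is online: delivered directly
```

In this run Alice's first message waits in Bob's queue until he connects, and she receives the "delivered" receipt only at that point.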

Message ordering

Messages within a single conversation must arrive in order. Approach: each conversation has a monotonic sequence number generated by the server. The client displays in sequence-number order, so even if messages arrive out of order due to retries, the UI is correct.
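A minimal sketch of this scheme, assuming a single server process issues the numbers and the client deduplicates retries by sequence number. The `Sequencer` and `ConversationView` names are illustrative, not part of any real client.

```python
import itertools
from collections import defaultdict


class Sequencer:
    """Server side: hands out 1, 2, 3, ... independently per conversation."""

    def __init__(self):
        self._counters = defaultdict(lambda: itertools.count(1))

    def next_seq(self, conversation_id):
        return next(self._counters[conversation_id])


class ConversationView:
    """Client side: stores messages by sequence number and renders in order."""

    def __init__(self):
        self._by_seq = {}

    def receive(self, seq, body):
        # A retried message carries the same seq, so it is stored only once.
        self._by_seq.setdefault(seq, body)

    def render(self):
        return [self._by_seq[seq] for seq in sorted(self._by_seq)]


sequencer = Sequencer()
stamped = [(sequencer.next_seq("alice:bob"), text)
           for text in ["one", "two", "three"]]

view = ConversationView()
# Simulate out-of-order arrival with one duplicate caused by a retry.
for seq, body in [stamped[2], stamped[0], stamped[0], stamped[1]]:
    view.receive(seq, body)

ordered = view.render()  # ['one', 'two', 'three']
```

Keying the client's store by sequence number handles both problems at once: sorting fixes the order, and the duplicate from the retry collapses into a single entry.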

Group chats

The naive approach is to fan out each message to all N members, which costs N delivery operations. This is fine for groups of up to a few hundred members. For much larger groups, at Discord scale or in the largest WhatsApp groups, you batch and parallelize the deliveries, and may also use server-side fanout, where a single message ID is delivered to a group "channel" that members subscribe to.
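One way to batch the fanout is to group recipients by the chat server holding their connection, so each server receives one RPC per message, not one per member. The sketch below assumes a plain dict as the routing table; `plan_group_fanout` and the server names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def plan_group_fanout(
    sender: str,
    members: Iterable[str],
    routing: Dict[str, str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group recipients by chat server so each server gets one batched RPC.

    Returns (batches, offline): batches maps server -> online recipients on
    that server; offline lists members who need store-and-forward instead.
    """
    batches: Dict[str, List[str]] = defaultdict(list)
    offline: List[str] = []
    for member in members:
        if member == sender:
            continue  # the sender already has the message
        server = routing.get(member)
        if server is None:
            offline.append(member)
        else:
            batches[server].append(member)
    return dict(batches), offline


routing = {"alice": "chat-server-1", "bob": "chat-server-2", "carol": "chat-server-2"}
batches, offline = plan_group_fanout(
    "alice", ["alice", "bob", "carol", "dave"], routing
)
# batches == {"chat-server-2": ["bob", "carol"]}, offline == ["dave"]
```

With members spread over far fewer servers than there are members, the number of internal RPCs drops from N to at most the number of servers involved, and the per-server batches can be sent in parallel.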

End-to-end encryption

WhatsApp uses the Signal Protocol. The key property is that the server never sees the message content. Each pair of users establishes shared keys through a Diffie-Hellman key agreement, and messages are encrypted and decrypted on the clients. The server only sees ciphertext and routing info. This means features like server-side search are impossible without compromising privacy.
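The core idea of Diffie-Hellman key agreement can be shown with a toy example. This is explicitly not the Signal Protocol and not secure: the group parameters `P` and `G` are assumptions chosen for readability, and real clients use vetted elliptic-curve libraries with additional steps. It only illustrates that both clients derive the same key while the server relays nothing but public values.

```python
import hashlib
import secrets

# Toy parameters for illustration only; far too weak for real use.
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # assumed generator for this sketch


def generate_keypair():
    # The private key stays on the device; only the public key is uploaded.
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_shared_key(my_private, their_public):
    # Both sides compute G^(a*b) mod P, then hash it into a 32-byte key.
    shared_secret = pow(their_public, my_private, P)
    return hashlib.sha256(shared_secret.to_bytes(16, "big")).digest()


alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()

# The server relays only alice_public and bob_public.
alice_key = derive_shared_key(alice_private, bob_public)
bob_key = derive_shared_key(bob_private, alice_public)
keys_match = alice_key == bob_key
```

Because the private values never leave the devices, the server cannot compute the shared key from the public values it relays, which is why it can store and forward ciphertext without being able to read it.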

Presence

Presence (online status, last seen) is high-volume and low-value, so do not over-engineer it. Each chat server keeps an in-memory set of which of its connected users are online and broadcasts changes to their contacts. Eventual consistency is enough: if your last-seen is off by a minute, nobody dies.
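A minimal sketch of such an in-memory presence tracker follows. It assumes clients send periodic heartbeats and that a user counts as online if one arrived within a time-to-live window; the 60-second value and all names are hypothetical. Timestamps are passed in explicitly so the behavior is easy to test.

```python
from typing import Dict, Optional

HEARTBEAT_TTL_SECONDS = 60.0  # hypothetical staleness threshold


class PresenceTracker:
    """Per-server, in-memory presence: no durability, eventually consistent."""

    def __init__(self, ttl_seconds: float = HEARTBEAT_TTL_SECONDS) -> None:
        self._ttl = ttl_seconds
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, user: str, now: float) -> None:
        # Record the most recent moment we heard from this user's connection.
        self._last_seen[user] = now

    def is_online(self, user: str, now: float) -> bool:
        seen = self._last_seen.get(user)
        return seen is not None and now - seen <= self._ttl

    def last_seen(self, user: str) -> Optional[float]:
        # None means this server has never heard from the user.
        return self._last_seen.get(user)


presence = PresenceTracker()
presence.heartbeat("bob", now=1000.0)
online_soon_after = presence.is_online("bob", now=1030.0)   # within the TTL
online_much_later = presence.is_online("bob", now=1100.0)   # heartbeat expired
```

Losing this state in a server restart is acceptable in this design: the next heartbeat repopulates it, and in the meantime the last-seen value is merely stale.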