WebSockets and Real-Time Communication
When the server needs to push data to the client in real time, plain HTTP gets ugly. WebSockets, Server-Sent Events, and long polling are the three main tools. Each fits a different set of workloads.
The push problem
HTTP is request-response. The client asks, the server answers. But what if the server has new information for the client and the client doesn't know to ask? Notifications, chat messages, live scores, stock tickers, collaborative editing. The server needs to push.
There are three ways to fake or do real push. From oldest hack to newest standard: long polling, Server-Sent Events, and WebSockets.
Long polling
The client sends a request. The server holds it open, not responding, until either it has new data or a timeout (say 30s) is reached. The client immediately reconnects. The illusion is that data arrives "instantly". The reality is a steady stream of slow HTTP requests.
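That cycle can be sketched in a few lines. Note that `fetch` and `handle` here are placeholders, not a real library: `fetch` stands in for whatever blocking HTTP call your client makes (assumed to return a payload, or `None` when the server times out with no data).

```python
import time

def long_poll(fetch, handle, max_cycles=None):
    """Illustrative long-polling loop.

    `fetch` blocks until the server responds: either new data
    or None after the server-side timeout (say, 30s).
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            payload = fetch()      # server holds this request open
        except ConnectionError:
            time.sleep(1)          # brief pause before retrying on error
            continue
        if payload is not None:    # None = server timed out, nothing new
            handle(payload)
        cycles += 1                # reconnect immediately either way
```

The loop reconnects immediately after every response, which is exactly why long polling burns connections: each "instant" message still costs a full HTTP round trip.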
Pro: works with any HTTP infrastructure. Proxies and firewalls do not care.
Con: burns server connections. Adds latency on each cycle. Wasteful at scale.
Used as a fallback when WebSockets are blocked.
Server-Sent Events (SSE)
SSE is one-way: the server streams a sequence of text events to the client over a single long-lived HTTP connection. The client uses the browser's native EventSource API.
Pro: simple. Works over plain HTTP. Auto-reconnects. Built into browsers.
Con: server-to-client only. To send something back, the client makes a separate HTTP request.
Great for: live news feeds, stock prices, server-pushed notifications, real-time dashboards.
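The SSE wire format is simple enough to parse by hand: text lines like `event:` and `data:`, with a blank line terminating each event. The field names come from the spec; the parser itself is an illustrative sketch (it ignores `id:`, `retry:`, and comment lines).

```python
def parse_sse(stream_text):
    """Parse a chunk of an SSE stream into a list of events (sketch)."""
    events = []
    event = {"event": "message", "data": []}
    for line in stream_text.splitlines():
        if line == "":                      # blank line dispatches the event
            if event["data"]:
                events.append({"event": event["event"],
                               "data": "\n".join(event["data"])})
            event = {"event": "message", "data": []}
        elif line.startswith(":"):          # comment / keep-alive, ignored
            continue
        elif ":" in line:
            field, _, value = line.partition(":")
            value = value.removeprefix(" ")
            if field == "data":             # multiple data lines accumulate
                event["data"].append(value)
            elif field == "event":
                event["event"] = value
    return events
```

In the browser you never write this yourself; `EventSource` does it for you. It is shown here only to make the point that SSE is just line-oriented text over a long-lived HTTP response.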
WebSockets
WebSockets give you a full-duplex, persistent TCP connection between client and server. Either side can send a message at any time. The connection starts life as an HTTP request and switches to the WebSocket protocol via an Upgrade handshake.
Pro: lowest latency. Full duplex. Binary or text frames. Standardized.
Con: stateful. Each connection is a long-lived socket on the server. Load balancing is trickier (sticky sessions or specialized infrastructure). Some corporate proxies block them.
Great for: chat, multiplayer games, collaborative editing (Google Docs style), live trading.
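The upgrade handshake hinges on one computed header: the server takes the client's Sec-WebSocket-Key, appends a GUID fixed by RFC 6455, hashes the result, and echoes it back as Sec-WebSocket-Accept. A minimal sketch of that derivation:

```python
import base64
import hashlib

# This GUID is fixed by RFC 6455; every WebSocket server uses it.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value for the 101 response."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()
```

Feeding in the sample key from RFC 6455 (`dGhlIHNhbXBsZSBub25jZQ==`) yields the spec's expected accept value. Real servers should of course use a WebSocket library; the point is that the "switch" from HTTP is a tiny, well-defined handshake, after which the socket speaks framed WebSocket messages.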
Operational realities of long-lived connections
Real-time systems break differently than request-response systems. Things to plan for:
- Connection limits. Each WebSocket holds a TCP socket and memory. A single server with default tuning typically tops out somewhere in the 50-100K connection range. You need OS tuning and horizontal scaling.
- Load balancing. Round-robin breaks when each connection is stateful. Use sticky sessions, or a layer 4 load balancer that hashes by client.
- Reconnection logic. Networks drop. Phones go to sleep. Clients must reconnect with backoff and resume from where they left off.
- Backpressure. If the server pushes faster than the client can consume, queues grow without bound. Implement flow control: drop, coalesce, or disconnect slow consumers.
- Authentication. Tokens used at connect time can expire mid-connection. Plan for refresh.
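The reconnection point deserves a concrete shape. A common pattern is exponential backoff with jitter, so that a fleet of dropped clients doesn't reconnect in lockstep and stampede the server. The helper and its parameters below are illustrative (the "full jitter" variant, capped at a maximum delay):

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Delay in seconds before reconnect attempt N (0-indexed).

    Exponential growth capped at `cap`, scaled by a random factor
    in [0, 1) so clients don't reconnect in sync ("full jitter").
    """
    return rng() * min(cap, base * (2 ** attempt))
```

On a successful reconnect the client resets `attempt` to zero and, ideally, resumes from a last-seen message ID rather than refetching everything.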
What about MQTT and similar?
For IoT and mobile, MQTT (over TCP or WebSockets) is purpose-built: tiny payloads, QoS levels, last-will messages. AMQP and STOMP cover similar territory. WebSocket is a transport; MQTT is a higher-level protocol you can run on top of it.
Most engineers go their whole career without designing a real-time system. The day you have to, knowing the trade-offs ahead of time saves you from picking WebSockets for something a five-second poll could have done. Match the tool to the workload.