Message Queues

Chapter 06 · Async and Messaging

A message queue lets one service send work to another without waiting for it. The producer drops a message in the queue and moves on. A consumer picks it up later. Because neither side needs the other to be available at that moment, the system tolerates failures better and each side can scale independently.

The blocking problem

Imagine a sign-up flow that does five things: create the user, send a welcome email, create a default workspace, send analytics events, and warm up some caches. If you do all five synchronously inside the HTTP request, the user waits for all of them. Worse, if the email service is down, sign-up fails entirely.

The fix: do only the essential work inside the request and queue the rest. The user gets a fast response. The other work happens when it can.
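As a rough sketch of that split, the handler below creates the user inline and turns the other four steps into queued messages. It assumes an in-process `queue.Queue` as a stand-in for a real broker; `handle_signup`, `background_jobs`, `users`, and the task names are hypothetical and chosen only for illustration.

```python
import queue

# In-process stand-in for a real message broker (assumption for this sketch).
background_jobs = queue.Queue()
users = {}  # hypothetical user store: user_id -> email


def handle_signup(email):
    """Do only the essential work inline, then queue everything else."""
    user_id = len(users) + 1
    users[user_id] = email  # essential: the user must exist before we respond

    # Non-essential follow-ups become messages for a worker to pick up later.
    for task in (
        "send_welcome_email",
        "create_default_workspace",
        "send_analytics_events",
        "warm_caches",
    ):
        background_jobs.put({"task": task, "user_id": user_id})

    return {"status": "created", "user_id": user_id}


response = handle_signup("ada@example.com")
```

The response is returned as soon as the user record exists. If the email service is down, only the queued `send_welcome_email` message is affected, not the sign-up itself.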

What a queue gives you

Decoupling. Producer and consumer don't need to be running at the same time, on the same machine, or even at the same speed.
Buffering. Traffic spikes get absorbed by the queue. Consumers process at their own pace.
Retries. If a consumer crashes before acknowledging a message, the message stays in the queue and is redelivered.
Workload spreading. Multiple consumers pull from the same queue, parallelizing work.

One producer, one queue, many workers. The classic shape.
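The shape can be sketched with Python's standard library, using threads in place of separate worker processes. This is a minimal illustration, not a production setup: `jobs`, `worker`, and the `STOP` sentinel are hypothetical names, and the "work" is just squaring a number.

```python
import queue
import threading

jobs = queue.Queue()
results = []
results_lock = threading.Lock()
STOP = object()  # sentinel that tells a worker to exit


def worker(name):
    """Pull items from the shared queue until the STOP sentinel arrives."""
    while True:
        item = jobs.get()
        if item is STOP:
            jobs.task_done()
            break
        with results_lock:
            results.append((name, item * item))
        jobs.task_done()


# Many workers, one queue.
workers = [
    threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(3)
]
for t in workers:
    t.start()

# One producer: enqueue the work, then one STOP per worker.
for n in range(10):
    jobs.put(n)
for _ in workers:
    jobs.put(STOP)

jobs.join()  # blocks until every queued item has been marked done
for t in workers:
    t.join()
```

Which worker handles which item is not deterministic, which is the point: the producer does not know or care.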

Delivery semantics

Three guarantees, ranked by difficulty:

At most once. Message may be lost but never duplicated. Easy. Used when occasional drops are fine.
At least once. Message is delivered, possibly multiple times. The default for most queues. Requires consumers to be idempotent.
Exactly once. The holy grail. Hard in distributed systems. In practice it is usually approximated with at-least-once delivery plus idempotent consumers, so the effect happens once even when the delivery does not.
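One common way to make a consumer idempotent is to record the IDs of messages it has already handled and skip repeats. The sketch below assumes every message carries a unique `id` field; `handle_payment`, `processed_ids`, and `balances` are hypothetical names, and the in-memory set stands in for durable storage.

```python
processed_ids = set()     # in a real system this would be durable storage
balances = {"alice": 0}   # hypothetical account balances


def handle_payment(message):
    """Apply a credit at most once per message id. Returns True if applied."""
    if message["id"] in processed_ids:
        return False  # duplicate delivery: safe to acknowledge and ignore
    balances[message["account"]] += message["amount"]
    processed_ids.add(message["id"])
    return True


msg = {"id": "m-1", "account": "alice", "amount": 50}
first = handle_payment(msg)
second = handle_payment(msg)  # simulated redelivery of the same message
```

In a real consumer the side effect and the "seen" record would need to be committed together, for example in one database transaction. Otherwise a crash between the two steps brings the duplicate problem back.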

Common gotchas

Poison messages. A bad message that fails every time it is processed, sometimes crashing the worker with it. Without a dead-letter queue (DLQ), it is redelivered and retried forever. Always configure a DLQ.
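Most brokers implement dead-lettering for you through configuration, but the logic is simple enough to sketch by hand. The example below assumes a per-message attempt counter and a fixed retry limit; `MAX_ATTEMPTS`, `main_queue`, `dead_letters`, and `process` are hypothetical names for illustration.

```python
from collections import deque

MAX_ATTEMPTS = 3      # assumed retry limit for this sketch
main_queue = deque()
dead_letters = []     # stand-in for a dead-letter queue
handled = []


def process(body):
    """Hypothetical handler that cannot cope with one particular message."""
    if body == "corrupt":
        raise ValueError("cannot parse message")
    handled.append(body)


def drain():
    """Process the queue, parking messages that keep failing."""
    while main_queue:
        message = main_queue.popleft()
        try:
            process(message["body"])
        except Exception as exc:
            message["attempts"] += 1
            if message["attempts"] >= MAX_ATTEMPTS:
                message["error"] = str(exc)
                dead_letters.append(message)  # give up; keep it for inspection
            else:
                main_queue.append(message)    # requeue for another try


for body in ("ok-1", "corrupt", "ok-2"):
    main_queue.append({"body": body, "attempts": 0})
drain()
```

The healthy messages are processed, and the poison message ends up in `dead_letters` with its error attached instead of circulating indefinitely.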

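Ordering. Most queues do not guarantee order across consumers. If order matters per key, partition by key so that all messages for one key go through the same ordered lane (Kafka preserves order within a partition; SQS FIFO queues preserve it within a message group).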
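The idea behind key-based partitioning can be sketched as a stable hash of the key modulo the number of partitions. This is an illustration of the principle only, not Kafka's actual partitioner; `NUM_PARTITIONS`, `partition_for`, `publish`, and `events_for` are hypothetical names.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed partition count for this sketch
partitions = [[] for _ in range(NUM_PARTITIONS)]


def partition_for(key):
    """Map a key to a partition with a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def publish(key, event):
    """Append the event to the partition that owns this key."""
    partitions[partition_for(key)].append((key, event))


def events_for(key):
    """Read back one key's events in the order its partition stored them."""
    return [event for k, event in partitions[partition_for(key)] if k == key]


for key, event in [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-1", "shipped"),
    ("order-2", "cancelled"),
]:
    publish(key, event)
```

Because every event for `order-1` lands in the same partition, a single consumer reading that partition sees them in publish order, even though there is no ordering guarantee between different partitions.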

Backpressure. If producers keep outpacing consumers, the queue grows without bound. Set a maximum length, alert on lag, and scale consumers.
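A bounded queue makes the limit explicit: once it is full, the producer is told so and has to slow down, drop, or retry later. The sketch below uses the standard library's `queue.Queue` with `maxsize`; `try_publish`, `accepted`, and `rejected` are hypothetical names, and the limit of 3 is arbitrary.

```python
import queue

bounded = queue.Queue(maxsize=3)  # arbitrary small limit for illustration
accepted = []
rejected = []


def try_publish(item):
    """Enqueue without blocking; report failure when the queue is full."""
    try:
        bounded.put_nowait(item)
        accepted.append(item)
        return True
    except queue.Full:
        rejected.append(item)  # caller can back off, shed load, or retry
        return False


# No consumer is running here, so the queue fills after three items.
for i in range(5):
    try_publish(i)

lag = bounded.qsize()  # number of messages waiting; a natural metric to alert on
```

Whether to reject, block, or drop old messages when the limit is reached is a design choice that depends on the workload. The important part is that the choice is made deliberately rather than left to an ever-growing queue.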

Tools to know

RabbitMQ for traditional routing-rich messaging.
AWS SQS for the simplest hosted queue you can imagine.
Redis Streams for lightweight cases.
Kafka for high-throughput streams (covered next).
NATS for fast pub/sub.

Queues turn brittle synchronous flows into resilient asynchronous ones. The price: harder reasoning. Trace IDs become essential. Idempotency becomes essential. The investment usually pays off the first time a downstream service has a bad day.
