Pub/Sub and Kafka
Pub/sub broadcasts a message to many interested consumers. Kafka takes that idea and makes it durable, partitioned, and replayable. It is the backbone of many modern data pipelines.
Queue vs pub/sub
A queue is one-to-one: each message a producer enqueues is consumed by exactly one worker. Pub/sub is one-to-many: each message published to a topic is delivered to every subscriber.
Pub/sub fits when many parts of your system care about the same event. A user signs up; the analytics service wants to know, the email service wants to know, the recommendation service wants to know. With pub/sub each one subscribes; the producer doesn't even know they exist.
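The difference can be sketched with two tiny in-memory classes. This is only an illustration: WorkQueue, PubSub, and the "user.signed_up" topic name are hypothetical and do not come from any real broker's API.

```python
from collections import defaultdict, deque


class WorkQueue:
    """One-to-one: each message is handed to exactly one worker."""

    def __init__(self):
        self._messages = deque()

    def put(self, message):
        self._messages.append(message)

    def get(self):
        # Removing the message means no other worker can receive it.
        return self._messages.popleft() if self._messages else None


class PubSub:
    """One-to-many: every subscriber of a topic receives every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher only names the topic; it never references subscribers.
        for callback in self._subscribers[topic]:
            callback(message)


# Queue: two workers compete, and only the first one gets the message.
queue = WorkQueue()
queue.put("resize-image-17")
worker_a = queue.get()
worker_b = queue.get()

# Pub/sub: three services each receive the same sign-up event.
received = {"analytics": [], "email": [], "recommendations": []}
bus = PubSub()
for service, inbox in received.items():
    bus.subscribe("user.signed_up", inbox.append)
bus.publish("user.signed_up", {"user_id": 42})
```

Adding a fourth service here means one more subscribe call; the publish line does not change, which is the decoupling the sign-up example describes.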
Kafka: pub/sub on steroids
Kafka is the de facto standard for pub/sub at scale. It rests on three core ideas:
- Topic. A named log of messages. Producers append; consumers read.
- Partition. Each topic is split into partitions. Different partitions live on different brokers, enabling horizontal scale.
- Offset. Each message has a sequential position, its offset, within its partition, and each consumer tracks the offset it has reached in each partition. Messages aren't deleted on read; they expire by time or size.
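A rough sketch of how the three ideas fit together, assuming messages carry a key and that the key's hash picks the partition. Here choose_partition is a hypothetical helper, and CRC32 is a stand-in chosen for the example; real Kafka clients use their own partitioner.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index.

    Assumption: CRC32 is only a stand-in hash for this sketch.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
# A topic modeled as a list of partitions, each an ordered list of records.
topic = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("user-1", "signup"),
    ("user-2", "signup"),
    ("user-1", "login"),
    ("user-1", "purchase"),
]

placements = []
for key, value in events:
    p = choose_partition(key, NUM_PARTITIONS)
    topic[p].append((key, value))
    offset = len(topic[p]) - 1  # offsets count up separately in each partition
    placements.append((key, value, p, offset))

for key, value, p, offset in placements:
    print(f"{key}:{value} -> partition {p}, offset {offset}")
```

Because the same key always hashes to the same partition, all of user-1's events land in one partition in the order they were produced. Ordering holds within a partition, not across the whole topic.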
The append-only log mental model
Kafka is not a queue, despite being used like one. It is a distributed, replicated, append-only log. Producers add to the end. Consumers read from any position they like. Messages stick around for the retention period (7 days by default), so multiple consumers can read the same data at different speeds and even replay history.
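The log model is small enough to sketch in memory. PartitionLog below is a hypothetical single-partition toy with no replication or retention; it only shows that reads do not consume records and that each reader owns its position.

```python
class PartitionLog:
    """A single-partition, in-memory append-only log."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Add a record at the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10):
        """Return records starting at offset; reading never deletes anything."""
        return self._records[offset:offset + max_records]


log = PartitionLog()
for event in ["signup", "login", "purchase", "logout"]:
    log.append(event)

# Each consumer keeps its own position; the log stores none of it.
fast_offset = 0
slow_offset = 0

fast_batch = log.read(fast_offset)
fast_offset += len(fast_batch)  # the fast consumer is now caught up

slow_batch = log.read(slow_offset, max_records=1)
slow_offset += len(slow_batch)  # the slow consumer is still behind

# Replay: read again from offset 0 and get the full history back.
replayed = log.read(0)
```

A queue would have handed each record to one reader and dropped it. Here both readers see everything, and replaying is just reading from an earlier offset.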
Consumer groups
A consumer group is a set of consumers that cooperatively read a topic. Kafka assigns each partition to exactly one consumer in the group, so no two members process the same partition. To scale, add more consumers, up to the partition count; any consumers beyond that sit idle. Multiple consumer groups can independently read the same topic without affecting each other.
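The partition-count ceiling is easy to see in a sketch. assign_partitions is a hypothetical helper using plain round-robin, which is an assumption made for illustration; Kafka offers several assignment strategies and this is not a reproduction of any of them.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions round-robin so each partition has exactly one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Two consumers share four partitions evenly.
two = assign_partitions(partitions, ["c1", "c2"])

# Six consumers for four partitions: the last two get nothing to read.
six = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5", "c6"])
idle = [consumer for consumer, owned in six.items() if not owned]
```

With two consumers each owns two partitions; with six, c5 and c6 end up idle. Parallelism inside one group is capped by the number of partitions.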
Why people pick Kafka
- Massive throughput (hundreds of MB/sec per broker).
- Durable, replicated, replayable.
- Excellent for event sourcing, audit logs, analytics pipelines, change data capture.
- Rich ecosystem (Kafka Connect, Kafka Streams, ksqlDB).
Why people regret picking Kafka
- Operationally complex. You have to manage brokers, ZooKeeper or KRaft, partition counts, and replication factors.
- Overkill for low-volume work. RabbitMQ or SQS is easier to run.
- Rebalancing storms during deploys can stall consumers.
Alternatives
AWS Kinesis, Google Pub/Sub, Apache Pulsar, and Redpanda each have their own angle. Pulsar separates compute from storage. Redpanda is Kafka-API-compatible and runs without ZooKeeper or a JVM. Kinesis and Google Pub/Sub are hosted services that trade control for simplicity. Pick by operational fit.