Circuit Breaker Pattern

When a downstream service is failing, hammering it harder makes things worse. A circuit breaker stops calls from going through, gives the failing service time to recover, and protects your own system from cascading failure.

The cascading failure problem

Service A calls service B. B is slow. A's threads start piling up waiting on B. A runs out of threads. A starts failing too. C calls A; same thing happens to C. Within minutes, half the system is down because of one slow service.
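
A rough back-of-the-envelope calculation shows how quickly the threads pile up. The sketch below applies Little's law (average in-flight calls = arrival rate × latency). The pool size, request rate, and latencies are made-up numbers chosen for illustration, not measurements.

```python
def threads_in_use(requests_per_second: float, latency_seconds: float) -> float:
    """Little's law: average in-flight calls = arrival rate x time per call."""
    return requests_per_second * latency_seconds


POOL_SIZE = 200   # assumed size of A's worker thread pool
RATE = 100.0      # assumed requests per second that A sends to B

healthy = threads_in_use(RATE, 0.05)   # assumed: B normally answers in 50 ms
degraded = threads_in_use(RATE, 5.0)   # assumed: B slows down to 5 s per call

print(f"healthy:  about {healthy:.0f} of {POOL_SIZE} threads busy")
print(f"degraded: about {degraded:.0f} threads needed, "
      f"pool exhausted: {degraded > POOL_SIZE}")
```

Under these assumptions, B does not need to be down to take A with it. It only needs to get slow enough that the calls in flight outnumber A's threads.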

The circuit breaker cuts the chain. When A notices B is failing, A stops calling B for a while. A's threads stay free. A degrades gracefully (returns a default, errors quickly, falls back). B gets breathing room.

The three states

  1. Closed. Normal operation. Calls pass through. The breaker counts failures.
  2. Open. Failure threshold exceeded. Calls fail immediately without trying. Breaker waits.
  3. Half-open. After a timeout, the breaker lets a few calls through to test the waters. If they succeed, go back to closed. If they fail, back to open.

[Figure: circuit breaker states. Closed (calls pass) moves to Open (calls fail fast) after N failures; Open moves to Half-open (test calls) after a timeout; Half-open returns to Closed when the test calls succeed.]
The circuit breaker oscillates between closed (normal), open (failing fast), and half-open (testing).
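
The three states can be sketched as a small state machine. This is a minimal illustration, not a production implementation: the class name, the parameters `failure_threshold` and `reset_timeout`, and the `CircuitOpenError` exception are all hypothetical. It also simplifies in three ways: it counts consecutive failures, it closes again after a single successful probe rather than "a few", and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before probing
        self.clock = clock                          # injectable so tests can control time
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            self.state = "half-open"  # timeout elapsed: let a probe call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success resets the count; a successful probe closes the breaker.
        self.failures = 0
        self.state = "closed"

    def _on_failure(self):
        self.failures += 1
        # A failed probe reopens at once; otherwise open once the threshold is hit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller would wrap each downstream call, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is whatever function talks to the downstream, and handle `CircuitOpenError` as the fail-fast path.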

Tuning

Pair with fallbacks

When the breaker is open, what does your service return? Three options:
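
As one illustration, echoing the "returns a default" response mentioned earlier, the sketch below routes a rejected call to a fallback. Every name in it (`CircuitOpenError`, `call_with_fallback`, `fetch_recommendations`, `default_recommendations`) is hypothetical, and the failing function is a stand-in for a call whose breaker is open.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Hypothetical error a breaker raises when it rejects a call."""


def call_with_fallback(protected_call: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Run protected_call; if the breaker is open, answer from the fallback instead."""
    try:
        return protected_call()
    except CircuitOpenError:
        return fallback()


def fetch_recommendations() -> List[str]:
    # Stand-in for a breaker-wrapped call whose breaker is currently open.
    raise CircuitOpenError("recommendations breaker is open")


def default_recommendations() -> List[str]:
    # A static default: less useful than live data, but instant and safe.
    return ["bestsellers"]


items = call_with_fallback(fetch_recommendations, default_recommendations)
print(items)
```

Only the breaker's rejection is caught here. Other exceptions still propagate, so a genuine bug in the protected call is not silently masked by the fallback.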

Hystrix and successors

Netflix's Hystrix popularized the circuit breaker. It is now in maintenance mode. Modern alternatives include Resilience4j (Java), Polly (.NET), and Failsafe (Java). Service meshes like Istio apply circuit breaking at the proxy level.

Bulkhead pattern (worth pairing)

Isolate resources so one slow dependency can't drain everything. Allocate separate thread pools or connection pools per downstream. If service B exhausts its pool, A's pool to service C is unaffected. Often deployed alongside circuit breakers.
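
A minimal sketch of the idea follows. Note the substitution: instead of a dedicated thread pool per downstream, it caps concurrent calls with one semaphore per downstream, which is a lighter variant of the same isolation. The `Bulkhead` class, `BulkheadFullError`, and the limits are hypothetical.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a dependency's compartment has no free slots."""


class Bulkhead:
    def __init__(self, name: str, max_concurrent: int):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: reject at once instead of queueing behind a slow dependency.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"{self.name} bulkhead is full")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()


# One compartment per downstream; the limits are assumed values.
to_b = Bulkhead("service-b", max_concurrent=2)
to_c = Bulkhead("service-c", max_concurrent=2)
```

If every slot in `to_b` is held by calls stuck on a slow service B, further calls to B are rejected immediately, while `to_c` still has its own slots and keeps serving.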

Circuit breakers prevent cascading failure. They're not magic; they're an explicit choice to fail fast when something is sick. Combined with retries, timeouts, and bulkheads, they keep your system robust under partial failure.