Service Discovery

When services come and go (scaling up, scaling down, redeploying), how does service A find a healthy instance of service B? Service discovery is the mechanism that answers that question dynamically.

Why dynamic discovery

In a static world, you hardcode service B's address in service A's config. Easy. In a dynamic world (containers, autoscaling, rolling deploys), service B's instances change addresses constantly. Hardcoding doesn't work.

Service discovery solves this. Each service registers itself with a registry on startup. Other services query the registry to find healthy instances. As things come and go, the registry stays current.
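The register-then-query cycle can be sketched with a toy in-memory registry. This is an illustration only, not how any particular tool works: the `Registry` class, its method names, and the addresses are all hypothetical.

```python
class Registry:
    """Toy in-memory registry: service name -> {address: is_healthy}."""

    def __init__(self) -> None:
        self._services: dict[str, dict[str, bool]] = {}

    def register(self, service: str, address: str) -> None:
        # Called by an instance on startup; new instances start out healthy.
        self._services.setdefault(service, {})[address] = True

    def deregister(self, service: str, address: str) -> None:
        # Called on clean shutdown; unknown entries are ignored.
        self._services.get(service, {}).pop(address, None)

    def set_health(self, service: str, address: str, healthy: bool) -> None:
        # Only updates instances that are currently registered.
        if address in self._services.get(service, {}):
            self._services[service][address] = healthy

    def healthy_instances(self, service: str) -> list[str]:
        # What a client gets back when it asks "where is B?".
        entries = self._services.get(service, {})
        return sorted(addr for addr, ok in entries.items() if ok)


registry = Registry()
for addr in ("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"):
    registry.register("B", addr)
registry.set_health("B", "10.0.0.3:8080", False)
healthy_b = registry.healthy_instances("B")
```

A real registry also has to be replicated and reachable over the network, which this sketch leaves out.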

Two patterns

Client-side discovery. The client queries the registry, gets a list of instances, picks one (with its own load balancing), and sends the request directly. Example: Eureka with Ribbon. Pro: no extra hop. Con: every client needs the registry library.
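A minimal sketch of the client-side pattern, assuming round-robin as the client's load-balancing policy. `RoundRobinClient` and `fake_registry_lookup` are hypothetical names; the lookup function stands in for a real registry query.

```python
import itertools
from typing import Callable


class RoundRobinClient:
    """Client that resolves instances itself and rotates through them."""

    def __init__(self, lookup: Callable[[str], list[str]]) -> None:
        self._lookup = lookup
        self._counter = itertools.count()

    def pick(self, service: str) -> str:
        # Ask the registry on every call so the list stays current.
        instances = self._lookup(service)
        if not instances:
            raise LookupError(f"no healthy instances of {service!r}")
        return instances[next(self._counter) % len(instances)]


def fake_registry_lookup(service: str) -> list[str]:
    # Stand-in for a registry query; the addresses are made up.
    table = {"B": ["10.0.0.1:8080", "10.0.0.2:8080"]}
    return table.get(service, [])


client = RoundRobinClient(fake_registry_lookup)
picks = [client.pick("B") for _ in range(3)]
```

The request would then go straight to the picked address, which is where the "no extra hop" benefit comes from.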

Server-side discovery. The client sends the request to a fixed endpoint (a load balancer or proxy). The proxy queries the registry and forwards the request to a healthy instance. Examples: Kubernetes Services, AWS ELB with target groups. Pro: clients are simpler. Con: extra hop.
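For contrast, here is a rough sketch of the server-side pattern, where the lookup lives in the proxy and the client never sees the registry. `DiscoveryProxy`, `fake_lookup`, and `fake_forward` are hypothetical; a real proxy would forward over the network rather than call a function.

```python
import random
from typing import Callable


class DiscoveryProxy:
    """Fixed endpoint that resolves a backend on the client's behalf."""

    def __init__(
        self,
        lookup: Callable[[str], list[str]],
        forward: Callable[[str, str], tuple[int, str]],
    ) -> None:
        self._lookup = lookup
        self._forward = forward

    def handle(self, service: str, request: str) -> tuple[int, str]:
        backends = self._lookup(service)
        if not backends:
            return 503, f"no healthy instance of {service}"
        # Random choice is an assumed policy; real proxies offer several.
        return self._forward(random.choice(backends), request)


def fake_lookup(service: str) -> list[str]:
    # Stand-in for a registry query; the addresses are made up.
    return {"B": ["10.0.0.1:8080", "10.0.0.2:8080"]}.get(service, [])


def fake_forward(backend: str, request: str) -> tuple[int, str]:
    # Stand-in for the network call to the chosen instance.
    return 200, f"{backend} handled {request}"


proxy = DiscoveryProxy(fake_lookup, fake_forward)
status, body = proxy.handle("B", "GET /orders")
```

The client only ever calls `proxy.handle`, which is the simplification the pattern buys at the cost of the extra hop.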

How instances register

How instances are removed

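Two flavors of failure detection: heartbeats, where each instance periodically tells the registry it is still alive, and health checks, where the registry (or an agent acting for it) probes the instance.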

Most production systems use both. Heartbeats for liveness, health checks for readiness.
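One way to combine the two signals is sketched below, assuming a TTL on heartbeats for liveness and a stored probe result for readiness. The class name, the TTL value, and the injectable clock are illustrative assumptions, not any registry's actual API.

```python
import time
from typing import Callable


class HeartbeatRegistry:
    """An instance is routable only if it is both live and ready."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_seen: dict[str, float] = {}  # address -> last heartbeat
        self._ready: dict[str, bool] = {}       # address -> last probe result

    def heartbeat(self, address: str) -> None:
        # Pushed by the instance: "I am still alive".
        self._last_seen[address] = self._clock()

    def record_health_check(self, address: str, passed: bool) -> None:
        # Result of probing the instance: "can it serve traffic?".
        self._ready[address] = passed

    def routable(self) -> list[str]:
        now = self._clock()
        return sorted(
            addr for addr, seen in self._last_seen.items()
            if now - seen <= self._ttl and self._ready.get(addr, False)
        )


# A fake clock keeps the example deterministic.
now = [0.0]
reg = HeartbeatRegistry(ttl_seconds=10.0, clock=lambda: now[0])
for addr in ("B-1", "B-2", "B-3"):
    reg.heartbeat(addr)
reg.record_health_check("B-1", True)
reg.record_health_check("B-2", True)
reg.record_health_check("B-3", False)  # alive but not ready
before = reg.routable()

now[0] = 20.0           # 20 seconds pass
reg.heartbeat("B-1")    # only B-1 keeps sending heartbeats
after = reg.routable()
```

Here B-3 is excluded because its health check failed, and B-2 drops out later because its heartbeat went stale.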

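[Figure: instances B-1, B-2, and B-3 register with and send heartbeats to a central registry (Consul / etcd); B-1 and B-2 are marked healthy, B-3 is not. Service A asks the registry "where is B?" and is told to use B-1 or B-2.]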
Instances register with a central registry. Clients query the registry to find healthy instances.

The tools

If you're on Kubernetes, you already have service discovery. Use the built-in DNS. Don't add Consul or Eureka unless you have a specific reason. Layering registries creates more problems than it solves.
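As a rough illustration of using the built-in DNS, a Service is normally reachable at a name of the form `<service>.<namespace>.svc.<cluster-domain>`. The sketch below assumes the common default cluster domain `cluster.local`; the service name `orders` and namespace `shop` are hypothetical, and `resolve` only returns results when run inside a cluster.

```python
import socket


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    # Fully qualified name that cluster DNS serves for a Service.
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(service: str, port: int, namespace: str = "default") -> list[str]:
    # Plain DNS lookup; no registry client library is involved.
    infos = socket.getaddrinfo(
        service_dns_name(service, namespace), port, proto=socket.IPPROTO_TCP
    )
    # Each entry ends with a sockaddr tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})


name = service_dns_name("orders", "shop")
```

In practice most clients just put the DNS name in a URL and let the HTTP library resolve it.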

Service mesh

A service mesh (Istio, Linkerd) is a deeper layer that handles discovery, routing, security, and observability across all service-to-service traffic. A sidecar proxy runs alongside each service. Powerful for complex environments. Heavy for small ones. Adopt when complexity warrants it.