Design Uber / Ride Sharing

Design Uber. Real-time location tracking, driver-rider matching, geospatial indexing, surge pricing, ride lifecycle management. The location problem is the fun one.

Problem

Riders open the app and request a ride. The system finds nearby available drivers, picks one, communicates the request. Track both parties on the map in real time. Handle pickup, trip, dropoff, payment.

Scale

Tens of millions of rides per day. Millions of drivers online at peak. Drivers report location every 4 seconds while online. That is a write-heavy, geospatial problem.

Core data model

Three main entities: User (rider), Driver (with current location), Trip. Trip moves through states: requested → matched → driver arriving → in progress → completed. Each transition is an event.
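As a sketch, the lifecycle can be modeled as an explicit state machine: illegal transitions are rejected and every legal one is appended to an event log. Class and transition names here are illustrative (a real system adds cancellation and timeout edges):

```python
from enum import Enum, auto

class TripState(Enum):
    REQUESTED = auto()
    MATCHED = auto()
    DRIVER_ARRIVING = auto()
    IN_PROGRESS = auto()
    COMPLETED = auto()

# Allowed forward transitions; anything else is a bug or a race.
VALID = {
    TripState.REQUESTED: {TripState.MATCHED},
    TripState.MATCHED: {TripState.DRIVER_ARRIVING},
    TripState.DRIVER_ARRIVING: {TripState.IN_PROGRESS},
    TripState.IN_PROGRESS: {TripState.COMPLETED},
    TripState.COMPLETED: set(),
}

class Trip:
    def __init__(self, rider_id):
        self.rider_id = rider_id
        self.state = TripState.REQUESTED
        self.events = []  # each transition recorded as an event

    def advance(self, new_state):
        if new_state not in VALID[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.events.append((self.state, new_state))
        self.state = new_state
```

The event list is the audit trail: downstream consumers replay it rather than polling current state.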

Location ingest

Every driver client sends a location update every 4 seconds over a persistent connection. With 1 million online drivers, that is 250K location writes per second. A row-level database update on every ping does not scale. The pattern: write the latest position to a fast in-memory store (Redis or Memcached), and periodically flush to a persistent store for history.
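A minimal in-memory sketch of that pattern, with a plain dict standing in for Redis and a deque standing in for the durable history store:

```python
from collections import deque

class LocationIngest:
    """Hot path: overwrite the driver's latest location in memory.
    Cold path: batch-flush snapshots to durable history storage."""

    def __init__(self, flush_interval=10.0):
        self.latest = {}         # driver_id -> (lat, lng, ts); a Redis hash in production
        self.history = deque()   # stand-in for a Cassandra/S3 history table
        self.flush_interval = flush_interval
        self._last_flush = 0.0

    def report(self, driver_id, lat, lng, ts):
        self.latest[driver_id] = (lat, lng, ts)  # O(1) overwrite, no row churn
        if ts - self._last_flush >= self.flush_interval:
            self.flush(ts)

    def flush(self, ts):
        # Snapshot every driver's latest position into the history log.
        self.history.extend((d, *loc) for d, loc in self.latest.items())
        self._last_flush = ts
```

The key property: reads and overwrites on the hot path are O(1), and the durable store only sees one batched snapshot per interval instead of every ping.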

Geospatial indexing

The hard problem: given a rider's location, find all drivers within X kilometers. Two approaches.

Geohash

Encode lat/lng into a string where a shared prefix implies geographic proximity. Cells are rectangles on the globe. Look up nearby drivers by querying for nearby geohash prefixes. Simple, but cells distort toward the poles, and the converse does not hold at cell boundaries: two points can be meters apart yet share no prefix, so you must also query the neighboring cells.
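A minimal geohash encoder (the standard base-32 scheme, interleaving one longitude bit and one latitude bit per step) shows how binary subdivision turns coordinates into prefix-comparable strings:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lng, precision=6):
    """Encode lat/lng by repeatedly halving the lat and lng ranges;
    every 5 bits become one base-32 character."""
    lat_range, lng_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, even = [], 0, 0, True
    while len(chars) < precision:
        rng = lng_range if even else lat_range        # alternate lng/lat bits
        val = lng if even else lat
        mid = (rng[0] + rng[1]) / 2
        if val > mid:
            bits = (bits << 1) | 1                    # upper half -> 1 bit
            rng[0] = mid
        else:
            bits = bits << 1                          # lower half -> 0 bit
            rng[1] = mid
        even = not even
        bit_count += 1
        if bit_count == 5:                            # flush 5 bits to a char
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)
```

Longer prefixes mean smaller cells, so "find nearby drivers" becomes a prefix-range scan over an ordinary sorted index.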

Quadtree / S2 cells

Recursively divide space into four sub-cells, splitting dense cells further so each leaf holds roughly the same number of drivers. Google's S2 library (quadtree-style cells projected onto a sphere) is the gold standard. Uber built its own library, H3, which uses hexagonal cells; a hexagon's six neighbors are all roughly equidistant, which has nicer properties for modeling movement.
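A capacity-based quadtree sketch makes the "split dense cells" idea concrete: each node splits into four children once it exceeds a fixed number of drivers, and a range query prunes any cell that does not overlap the search box:

```python
class QuadTree:
    def __init__(self, x0, y0, x1, y1, capacity=4):
        self.bounds = (x0, y0, x1, y1)   # half-open box [x0,x1) x [y0,y1)
        self.capacity = capacity
        self.points = []                 # (x, y, driver_id) while a leaf
        self.children = None

    def insert(self, x, y, driver_id):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x < x1 and y0 <= y < y1):
            return False                 # outside this cell
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append((x, y, driver_id))
                return True
            self._split()                # dense leaf: subdivide into 4
        return any(c.insert(x, y, driver_id) for c in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my, self.capacity),
                         QuadTree(mx, y0, x1, my, self.capacity),
                         QuadTree(x0, my, mx, y1, self.capacity),
                         QuadTree(mx, my, x1, y1, self.capacity)]
        for p in self.points:            # push existing points down
            any(c.insert(*p) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 >= x1 or qy1 < y0 or qy0 >= y1:
            return []                    # prune: no overlap with query box
        found = [p for p in self.points
                 if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        if self.children:
            for c in self.children:
                found += c.query(qx0, qy0, qx1, qy1)
        return found
```

This is flat Euclidean space for illustration; S2 and H3 do the same subdivision on the sphere and add stable cell IDs.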

[Diagram: geospatial driver lookup with H3 hexes. Query the rider's hex plus its 6 neighbors; skip the rest. Hundreds of candidate drivers reduced to a few.]

Matching

When a rider requests a ride, the system queries the local hex (and a few neighboring hexes) for available drivers. Filter by ride type, vehicle availability, ETA. Pick the best match — usually closest by ETA, sometimes optimized for global efficiency (drivers in low-density areas may get priority to keep coverage).

The matching service then sends the offer to the driver, who has, say, 15 seconds to accept. If the offer is declined or times out, the system moves on to the next best driver. This is a synchronous workflow with timeouts and rebid logic.
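The gather-rank-offer loop can be sketched like this; `drivers_by_hex`, `neighbors`, `eta`, and `send_offer` are assumed interfaces, not real APIs:

```python
def match(rider_hex, drivers_by_hex, neighbors, eta, send_offer, timeout_s=15):
    """Gather candidates from the rider's hex and its neighbors,
    rank by ETA, then offer the trip to one driver at a time.
    `send_offer(driver, timeout_s)` is assumed to block until the driver
    accepts, declines, or the timeout fires, returning True on accept."""
    candidates = []
    for h in [rider_hex, *neighbors(rider_hex)]:
        candidates.extend(drivers_by_hex.get(h, []))
    candidates.sort(key=eta)                  # closest-by-ETA first
    for driver in candidates:
        if send_offer(driver, timeout_s):     # sequential rebid on decline/timeout
            return driver
    return None  # nobody accepted; caller widens the search or fails the request
```

Offering sequentially rather than broadcasting avoids two drivers accepting the same trip, at the cost of worst-case latency of one timeout per declined driver.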

Trip lifecycle

Once matched, both parties are notified. Their locations stream in real time and the app shows the driver moving on the map. State transitions are recorded as events. Events drive payments, ratings, and analytics downstream.
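A toy in-process event bus illustrates the downstream fan-out; in production this would be a durable log such as Kafka, with payments, ratings, and analytics as independent consumers:

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process stand-in for the event pipeline.
    Downstream consumers subscribe by topic; publishing fans out to all of them."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

bus = EventBus()
charges = []  # hypothetical payments consumer collects completed trips
bus.subscribe("trip.completed", lambda e: charges.append(e["trip_id"]))
bus.publish("trip.completed", {"trip_id": "t1", "fare": 12.50})
```

The point of the indirection: the trip service only emits the transition; it never needs to know that payments, ratings, and analytics exist.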

Surge pricing

A pricing service watches local supply (drivers in hex) vs demand (active requests in hex). When demand outstrips supply, multiplier goes up. Updates every 30 seconds or so. Per-hex granularity gives geographically meaningful surge zones. The model is more complex in production (predicts future demand, smooths transitions) but the core idea is just a ratio.
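The core ratio, as a hedged sketch; the base, cap, and rounding are arbitrary illustrative choices:

```python
def surge_multiplier(active_requests, available_drivers, base=1.0, cap=3.0):
    """Per-hex surge: multiplier tracks the demand/supply ratio,
    clamped between a base of 1.0 and a cap. Production models also
    predict future demand and smooth transitions between updates."""
    if available_drivers == 0:
        return cap                       # no supply at all: max surge
    ratio = active_requests / available_drivers
    return min(cap, max(base, round(ratio, 1)))
```

Recomputed per hex every ~30 seconds, this gives geographically meaningful surge zones without any per-trip state.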

Things to call out