SQL vs NoSQL

Chapter 03 · Databases

SQL gives you structure, joins, and ACID. NoSQL gives you flexibility, scale, and schema freedom. You pick by access pattern and growth profile, not by religion.

The decision that shapes everything

The database is the heart of your system. Picking the wrong one early is a mistake you pay for daily. The first decision is rarely "which product" but "which family". SQL or NoSQL? Even within NoSQL, which kind?

SQL (relational) databases

Relational databases organize data into tables with strict schemas. Rows are linked by foreign keys. SQL (Structured Query Language) lets you join, filter, aggregate, and transact across them. Postgres, MySQL, Oracle, SQL Server are the headliners. They have been the default for forty years for good reason.

Strengths:

ACID transactions across multiple tables.
Powerful querying. JOINs, GROUP BY, window functions, full-text search.
Strict schemas catch errors at write time, not query time.
Mature tooling. Decades of indexes, optimizers, monitoring.

Weaknesses:

Hard to scale horizontally. Sharding a relational DB is doable but painful.
Schema changes (migrations) on huge tables are operationally tricky.
Some workloads (graphs, time series, blob storage) fit poorly.

NoSQL: four distinct families

"NoSQL" is a marketing term that hides four very different things. Confusing them is common.

Five database families to know. SQL is still the default for most workloads.

Key-value stores

Store: a key maps to a blob of data. Get by key, set by key. That's it. Redis, Memcached (in memory), DynamoDB (durable). Built for fast lookups at huge scale. Bad at anything that requires querying by value.

Document stores

Store: each record is a JSON-like document. Schema is flexible per document. MongoDB, CouchDB, Firestore. Good when records have nested structure or differ in shape (e.g., a product catalog where each category has different attributes).

Wide-column stores

Store: each row can have wildly different columns. Optimized for write-heavy workloads at planet scale. Cassandra, HBase, Bigtable. Used by Netflix, Apple, big telcos. Great for time series, event logs, sensor data. Cumbersome for ad-hoc queries.

Graph databases

Store: nodes and edges. Querying friend-of-friend relationships, fraud rings, dependency chains. Neo4j is the reference. Niche but unbeatable for the right problem.

The honest selection guide

Need	Pick
You're building a SaaS app and don't know yet	Postgres. Default. You will not regret it.
You need ACID across multiple records	SQL.
Cache or session store	Redis.
Hierarchical or flexible product data	MongoDB or Postgres with JSONB.
Billions of writes per day, time-ordered	Cassandra.
Social graphs or recommendations	Graph DB.
Massive read fanout, eventually consistent OK	DynamoDB or Cassandra.
Search across text	Elasticsearch (alongside the primary DB).

The boring advice Most teams should start with Postgres and stay there for a long time. It handles JSON now (JSONB), full-text search, geospatial, even time-series with extensions. Add NoSQL when you have a clear, measured access pattern that Postgres can't serve well, not because NoSQL is in the news.

The "polyglot persistence" reality

Mature systems use multiple databases. The orders table in Postgres for ACID. Redis for sessions and rate limit counters. Elasticsearch for search. Cassandra for click events. Each tool for its job. This adds operational complexity, but it stops happening when you try to force one DB to do everything.

Pick by access pattern (how will you query?), scale (how big will it grow?), and consistency need (does staleness hurt?). Not by what's trendy. The boring database is usually the right one.

← Previous

WebSockets and Real-Time Communication

Database Indexing