Concept
Why Partition at All?
A single database server has hard ceilings: finite CPU, RAM, disk I/O, and connection slots. Once your working set no longer fits in memory, or writes saturate a single disk, or you exhaust connections, vertical scaling (a bigger box) stops being economical. Partitioning splits one large dataset into smaller pieces so each piece fits comfortably on its own node.
Partitioning vs Sharding vs Replication
- Partitioning: the general idea of splitting data into subsets called partitions (or shards).
- Sharding: horizontal partitioning across separate machines. Each shard is an independent database holding a disjoint subset of rows.
- Replication: keeping copies of the same data on multiple nodes for availability and read scaling. It is orthogonal to sharding, and production systems typically do both: each shard is itself replicated.
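To make the relationship between sharding and replication concrete, here is a minimal in-memory sketch. The names `Shard` and `ShardedStore`, the modulo routing rule, and the synchronous copy to two replicas are all assumptions made for illustration. They are not how any particular database implements either feature.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One shard: a primary plus replicas that hold the same rows."""
    name: str
    primary: dict = field(default_factory=dict)
    # Assumption: two replicas per shard, modelled as plain dicts.
    replicas: list = field(default_factory=lambda: [{}, {}])

    def write(self, key, row):
        # Replication: the same row is copied to every replica.
        self.primary[key] = row
        for replica in self.replicas:
            replica[key] = row

    def read(self, key, replica_index=0):
        # Reads can be served by a replica, which is what gives read scaling.
        return self.replicas[replica_index].get(key)


class ShardedStore:
    """Sharding: each key lives on exactly one shard."""

    def __init__(self, shard_count):
        self.shards = [Shard(f"shard-{i}") for i in range(shard_count)]

    def shard_for(self, user_id):
        # Hypothetical routing rule: modulo on an integer key.
        return self.shards[user_id % len(self.shards)]

    def put(self, user_id, row):
        self.shard_for(user_id).write(user_id, row)

    def get(self, user_id):
        return self.shard_for(user_id).read(user_id)
```

Calling `ShardedStore(3).put(7, {"name": "Ada"})` in this sketch stores the row on one shard only (disjoint subsets), while that shard's primary and both replicas each hold a copy (same data on multiple nodes).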
Vertical vs Horizontal Partitioning
- Vertical partitioning splits columns, e.g. move rarely used bio and avatar_blob columns into a separate table so the hot columns stay compact.
- Horizontal partitioning (sharding) splits rows, e.g. users 1–300 on one node, 301–600 on another. This is what we mean by sharding for scale.
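The two kinds of split can be sketched side by side. This is an illustration under stated assumptions: the hot column names (`name`, `email`), the node names, and the helpers `split_vertically` and `node_for` are hypothetical. Only the `bio` and `avatar_blob` columns and the 1–300 / 301–600 ranges come from the text above.

```python
from bisect import bisect_left

# Vertical partitioning: split one wide row into a hot and a cold part.
# Both parts keep the id so they can be joined back together.
HOT_COLUMNS = ("id", "name", "email")        # hypothetical hot columns
COLD_COLUMNS = ("id", "bio", "avatar_blob")  # rarely used columns


def split_vertically(row):
    hot = {column: row[column] for column in HOT_COLUMNS}
    cold = {column: row[column] for column in COLD_COLUMNS}
    return hot, cold


# Horizontal partitioning: route whole rows by user id range.
# Each entry is the inclusive upper bound of one node's range.
RANGE_UPPER_BOUNDS = [300, 600]
NODES = ["node-a", "node-b"]  # hypothetical node names


def node_for(user_id):
    # bisect_left finds the first range whose upper bound is >= user_id.
    index = bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if user_id < 1 or index == len(NODES):
        raise KeyError(f"no node owns user id {user_id}")
    return NODES[index]
```

With these ranges, `node_for(300)` resolves to `node-a` and `node_for(301)` to `node-b`. An id outside every range raises `KeyError` rather than silently landing on some node.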