SD Data Partitioning and Sharding
As a database grows to billions of records, a single machine cannot handle the data volume, query load, or storage requirements alone. Data partitioning is the technique of splitting a large dataset into smaller, manageable pieces distributed across multiple machines. Sharding is a specific form of horizontal partitioning where data spreads across multiple database servers (shards), each holding a subset of the total data.
Think of a phone book. A single phone book for an entire country is enormous and slow to search. Divide it into 26 books (A through Z) and finding a name becomes much faster — go directly to the right book. Each book is a shard.
Why Partitioning Is Necessary
Single Database Server (No Partitioning):
- Storage limit: ~50 TB (physical disk limit)
- Query limit: ~10,000 queries/second
- Write limit: ~5,000 writes/second

After sharding across 10 servers:
- Storage limit: ~500 TB (10 × 50 TB)
- Query limit: ~100,000 queries/second
- Write limit: ~50,000 writes/second
Types of Partitioning
Horizontal Partitioning (Sharding)
Different rows go to different servers. Each shard has the same columns (same schema) but holds different subsets of rows.
Users Table (1 billion rows, split into 4 shards):
- Shard 1: UserID 1 to 250,000,000
- Shard 2: UserID 250,000,001 to 500,000,000
- Shard 3: UserID 500,000,001 to 750,000,000
- Shard 4: UserID 750,000,001 to 1,000,000,000

Query for UserID 650,000,000 → go to Shard 3 directly
Vertical Partitioning
Different columns go to different servers (or different tables). Related columns that are frequently queried together stay together, while rarely used columns move elsewhere.
Original Users Table (many columns):

| UserID | Name | Email | Password | Bio | ProfilePhoto | LastLogin | Settings |
|---|---|---|---|---|---|---|---|

After vertical partitioning:

Core table (queried on every login):

| UserID | Name | Email | Password | LastLogin |
|---|---|---|---|---|

Extended table (queried rarely):

| UserID | Bio | ProfilePhoto | Settings |
|---|---|---|---|
Functional Partitioning
Different types of data go to different databases entirely, based on the business function they serve. This is the "database per service" pattern from microservices.
- User data → User Database (PostgreSQL)
- Product data → Product Database (MongoDB)
- Sessions → Session Store (Redis)
- Orders → Order Database (MySQL)
Sharding Strategies
1. Range-Based Sharding
Data divides based on a range of values in the shard key. Records with values in a specific range go to a specific shard.
Shard Key: UserID
- Shard A: UserID 1 to 1,000,000
- Shard B: UserID 1,000,001 to 2,000,000
- Shard C: UserID 2,000,001 to 3,000,000

Lookup UserID 1,500,000 → Shard B (fast, direct)
Advantage: Range queries are efficient (get all users with IDs between 1M and 2M = one shard).
Disadvantage: Hotspot problem — new users always land on the latest shard, overloading it while older shards sit idle.
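A minimal Python sketch of range-based routing; the shard names and range boundaries below are made up for illustration:

```python
import bisect

# Upper bound (inclusive) of each range and the shard that owns it (hypothetical boundaries).
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["shard_a", "shard_b", "shard_c"]

def route_by_range(user_id: int) -> str:
    """Return the shard whose range contains user_id."""
    idx = bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if idx >= len(SHARDS):
        raise ValueError(f"user_id {user_id} is beyond the last shard's range")
    return SHARDS[idx]

print(route_by_range(1_500_000))  # shard_b, covering 1,000,001 to 2,000,000
```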
2. Hash-Based Sharding
A hash function determines which shard a record belongs to. The hash distributes records evenly and unpredictably across all shards.
Shard Key: UserID
Number of Shards: 4
Hash Function: shard = UserID % 4

- User 100: 100 % 4 = 0 → Shard 0
- User 101: 101 % 4 = 1 → Shard 1
- User 102: 102 % 4 = 2 → Shard 2
- User 103: 103 % 4 = 3 → Shard 3
- User 104: 104 % 4 = 0 → Shard 0 (cycles back)
Advantage: Even distribution, no hotspots.
Disadvantage: Range queries are inefficient — finding users with IDs 100-200 requires checking all shards. Also, adding a new shard requires resharding all existing data.
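A minimal sketch of hash-based routing in Python. It uses MD5 rather than Python's built-in hash(), which is randomized per process for strings and therefore unsuitable as a shard function; the key format is hypothetical:

```python
import hashlib

NUM_SHARDS = 4

def route_by_hash(key: str) -> int:
    """Map a shard key to a shard number using a stable hash function."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(route_by_hash("user:100"))  # always the same shard for the same key
print(route_by_hash("user:101"))  # keys spread roughly evenly across shards 0-3
```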
3. Consistent Hashing
Consistent hashing solves the resharding problem. Data and servers are mapped onto a virtual ring. Each data item belongs to the nearest server clockwise on the ring. Adding or removing a server only requires remapping a small portion of the data.
Hash Ring (0° to 360°):
- Server A at 90°
- Server B at 180°
- Server C at 270°
- Server D at 0° (equivalently 360°)

User data at hash position 120° → belongs to Server B (next server clockwise)
User data at hash position 60° → belongs to Server A (next server clockwise)

Adding Server E at 150°:
- Only data hashed between 90° and 150° moves from Server B to Server E
- All other data stays where it is
Used by: Amazon DynamoDB, Apache Cassandra, Redis Cluster
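A small illustrative consistent-hash ring in Python, using virtual nodes to smooth the distribution. The server names and virtual-node count are arbitrary choices, and production systems use more elaborate variants of this idea:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative sketch only)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server smooth out the distribution
        self._hashes = []      # sorted positions on the ring
        self._nodes = {}       # position -> server name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            bisect.insort(self._hashes, pos)
            self._nodes[pos] = node

    def remove_node(self, node):
        self._hashes = [h for h in self._hashes if self._nodes[h] != node]
        self._nodes = {h: n for h, n in self._nodes.items() if n != node}

    def get_node(self, key):
        """Walk clockwise from the key's position to the next server on the ring."""
        pos = self._hash(key)
        idx = bisect.bisect(self._hashes, pos)
        if idx == len(self._hashes):   # wrap around past the end of the ring
            idx = 0
        return self._nodes[self._hashes[idx]]

ring = ConsistentHashRing(["server_a", "server_b", "server_c", "server_d"])
print(ring.get_node("user:650000000"))  # the same key always maps to the same server
ring.add_node("server_e")               # only keys near server_e's positions move
```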
4. Directory-Based Sharding
A lookup table (the directory) records which shard holds each piece of data. Every read or write first consults the directory.
Lookup Table:

| UserID | Shard |
|---|---|
| 1-500 | Shard1 |
| 501-1000 | Shard2 |
| Special | Shard3 |

Query for User 750 → check lookup table → "Shard2" → query Shard2
Advantage: Flexible, easy to rebalance without changing shard logic.
Disadvantage: The directory becomes a single point of failure. Every query has two steps (lookup + actual query).
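A sketch of directory-based routing; the directory contents are hypothetical, and in production the lookup table would live in a small, highly available store rather than in application memory:

```python
# Hypothetical directory: maps a user-ID range (or a pinned account) to a shard.
DIRECTORY = [
    (1, 500, "shard1"),
    (501, 1000, "shard2"),
]
SPECIAL_USERS = {9001: "shard3"}   # explicitly pinned "special" accounts

def route_by_directory(user_id: int) -> str:
    """Consult the directory first, then send the query to the shard it names."""
    if user_id in SPECIAL_USERS:
        return SPECIAL_USERS[user_id]
    for low, high, shard in DIRECTORY:
        if low <= user_id <= high:
            return shard
    raise KeyError(f"no shard registered for user {user_id}")

print(route_by_directory(750))  # shard2
```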
Choosing the Right Shard Key
The shard key is the most critical decision in sharding. A bad shard key creates hotspots that overload some shards while others remain empty.
| Shard Key Choice | Problem | Better Alternative |
|---|---|---|
| Timestamp (created_at) | All new data hits the latest shard | User ID + timestamp combination |
| Country Code | USA has 100x more users than Luxembourg | Hashed user ID (ignores geography) |
| Sequential ID | New IDs always land on last shard | Hashed ID distributes evenly |
| User ID (hash) | Range queries cross all shards | Accept this for even distribution |
Cross-Shard Queries (The Hard Problem)
Sharding breaks joins and cross-record queries. If users live across 10 shards, finding all users who placed orders in the last 7 days requires querying all 10 shards and merging results.
Without sharding:

SELECT * FROM users WHERE last_order_date > '2024-01-01'
→ One database, one query, simple

With sharding (10 shards):

Query Shard 1: SELECT * FROM users WHERE last_order_date > '2024-01-01'
Query Shard 2: same query
...
Query Shard 10: same query
→ Merge all 10 result sets in application code → much slower, more complex
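A sketch of the scatter-gather pattern the application layer ends up implementing, with fake in-memory shards standing in for real database connections:

```python
from concurrent.futures import ThreadPoolExecutor

# Fake in-memory shards standing in for 10 database servers (illustrative data only).
SHARDS = {
    f"shard_{i}": [{"user_id": i * 10 + n, "last_order_date": f"2024-02-0{n + 1}"}
                   for n in range(3)]
    for i in range(10)
}

def query_shard(shard_name, predicate):
    """Stand-in for running the same WHERE clause against one shard."""
    return [row for row in SHARDS[shard_name] if predicate(row)]

def scatter_gather(predicate):
    """Fan the query out to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda name: query_shard(name, predicate), SHARDS)
    merged = [row for partial in partials for row in partial]
    return sorted(merged, key=lambda row: row["user_id"])   # application code re-sorts

recent = scatter_gather(lambda row: row["last_order_date"] > "2024-01-01")
print(len(recent), "matching rows gathered from", len(SHARDS), "shards")
```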
Solutions:
- Avoid cross-shard joins by designing queries around the shard key
- Duplicate frequently joined data across shards (denormalization)
- Use a separate analytics database (data warehouse) that holds aggregated data across all shards
Resharding: Adding More Shards
As data grows, more shards are needed. Resharding is the process of redistributing data across a new shard configuration. It is one of the most operationally difficult tasks in distributed databases.
Before: 4 shards (each holding 25% of the data)
After: 8 shards (each holding 12.5% of the data)

Data migration required:
Shard 1 (IDs 1-250M) → split into new Shard 1 (IDs 1-125M) + new Shard 5 (IDs 125M+1 to 250M)

This migration must happen live, with zero downtime. Consistent hashing minimizes the impact: growing from N to N+1 shards moves only roughly 1/(N+1) of the keys.
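A quick back-of-the-envelope check of why naive modulo hashing makes resharding painful: adding a 5th shard to a 4-shard modulo scheme changes the owner of most keys, whereas consistent hashing would move only a small fraction of them. This is illustrative arithmetic, not a benchmark:

```python
# How many of 1,000,000 keys change shard when a 5th shard is added
# under simple modulo hashing?
keys = range(1_000_000)
moved = sum(1 for k in keys if k % 4 != k % 5)
print(f"{moved / len(keys):.0%} of keys move")  # roughly 80% under modulo hashing
```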
Partitioning vs Replication
| Aspect | Partitioning (Sharding) | Replication |
|---|---|---|
| Purpose | Handle more data and writes | Handle more reads and ensure availability |
| Data Distribution | Different data on each node | Same data copied on each node |
| Read Performance | Good when queries hit one shard via the shard key; cross-shard reads are slower | Excellent for read-heavy workloads (any replica can serve reads) |
| Write Performance | Excellent (writes go to one shard) | Slower (all replicas must be updated) |
| Fault Tolerance | Low alone (losing one shard = data loss) | High (copies survive node failure) |
Production systems combine both: shard the data for scale, and replicate each shard for reliability.
Summary
Sharding enables databases to scale beyond the limits of a single machine by distributing data across multiple servers. The choice of sharding strategy — range, hash, consistent hash, or directory — depends on query patterns and growth expectations. The shard key decision determines whether data distributes evenly or creates hotspots. Cross-shard queries and resharding operations remain the most complex challenges in sharded systems. Combining sharding with replication gives a system both horizontal scale and fault tolerance.
