Scaling Writes
- Scale writes with vertical scaling, sharding, queues, batching, and hierarchical aggregation.
30-second elevator pitch: "Write scaling is harder than read scaling. I start with vertical scaling and write-optimized databases like Cassandra. When that is not enough, I shard by partition key. For bursts I use queues and load shedding. For high volume I batch writes and use hierarchical aggregation."
The Problem
Read scaling has clear tools: replicas and caching. Write scaling hits hard limits: disk I/O, CPU, and network. A single database might handle 1,000 writes/second, but viral posts, location updates, and metrics pipelines can demand millions of writes per second.
Write-heavy systems include: YouTube Top K, Strava, Rate Limiter, Ad Click Aggregator, FB Post Search, Metrics Monitoring.
What You Will Learn
Tier 1: Vertical and database choice
- Better hardware (more cores, SSDs)
- Write-optimized DBs (Cassandra append-only, time-series)
Tier 2: Sharding
- Partition by key (user_id, topic)
- Consistent hashing for distribution
Tier 3: Burst handling
- Queues to smooth spikes
- Load shedding (drop lowest-value writes)
Tier 4: Reduce write volume
- Batching (amortize overhead)
- Hierarchical aggregation (fan-in, reduce before writing)
The Solution: Four Strategies
What interviewers want to hear: "I exhaust vertical scaling and database tuning first. Then I shard by partition key to spread load. For bursts I use queues. For extreme volume I batch and aggregate to reduce the number of writes."
Vertical Scaling and Database Choice
Hardware - SSDs instead of HDDs, more RAM, faster NICs. Machines with 200 cores and 10 Gbps links are common. Do the back-of-the-envelope math: how does your write throughput compare to the hardware limits?
Database choice - Cassandra uses append-only commit logs; 10,000+ writes/sec on modest hardware vs ~1,000 for a traditional RDBMS. Time-series DBs (InfluxDB, TimescaleDB) optimize for sequential timestamped writes. Trade read performance for write performance.
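The back-of-the-envelope math above can be sketched in a few lines. All the numbers here (1 KB writes, 500 MB/s sequential disk, 2M writes/sec target) are illustrative assumptions, not benchmarks:

```python
# Back-of-the-envelope check: can one node absorb our write load?
# Assumed numbers (illustrative only): 1 KB per write, a disk that
# sustains 500 MB/s sequential writes, and a 10 Gbps NIC.

WRITE_SIZE_BYTES = 1_000           # average write payload
DISK_MBPS = 500                    # sequential disk write throughput
NIC_GBPS = 10

disk_limit = DISK_MBPS * 1_000_000 // WRITE_SIZE_BYTES           # writes/sec, disk-bound
nic_limit = NIC_GBPS * 1_000_000_000 // 8 // WRITE_SIZE_BYTES    # writes/sec, network-bound

# The binding constraint is the smaller of the two.
node_limit = min(disk_limit, nic_limit)

target = 2_000_000                 # e.g. 2M writes/sec at peak
shards_needed = -(-target // node_limit)   # ceiling division

print(node_limit)      # 500000 (disk is the bottleneck here)
print(shards_needed)   # 4
```

With these assumptions the disk, not the NIC, is the bottleneck, and you need at least 4 nodes before any headroom for skew or failures.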
Sharding and Partitioning
One server handles 1,000 writes/sec; 10 servers handle 10,000. Shard by partition key: user_id for posts, topic for Kafka. Consistent hashing distributes keys across nodes. Pick a key that spreads load evenly - avoid hot partitions.
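A minimal consistent-hash ring, to make the routing concrete. The node names and virtual-node count are made up for illustration; each node gets several points on the ring so keys spread evenly, and a key is routed to the first node point clockwise from its hash:

```python
import bisect
import hashlib

def _hash(s: str) -> int:
    # Stable hash so routing is deterministic across processes.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=100):
        # Each physical node contributes `vnodes` points on the ring.
        self.points = sorted((_hash(f"{n}#{i}"), n)
                             for n in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.points]

    def node_for(self, key: str) -> str:
        # First point clockwise from the key's hash (wrapping around).
        i = bisect.bisect(self.keys, _hash(key)) % len(self.keys)
        return self.points[i][1]

ring = Ring(["db-1", "db-2", "db-3"])
shard = ring.node_for("user:42")   # every write for user 42 lands on one shard
```

The payoff of consistent hashing: adding or removing a node only remaps the keys adjacent to its ring points, rather than reshuffling everything as naive `hash(key) % N` would.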
Interview tip: Always specify your partition key. "We shard by user_id so each user's writes go to one shard."
Queues and Load Shedding
Queues - Accept writes into a queue; workers drain it at a sustainable rate. Smooths bursts. Trade-off: eventual consistency and added latency.
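A toy version of this pattern using an in-process queue (a real system would use Kafka or SQS; `apply_write` is a placeholder, not a real API). A burst of 50 writes arrives instantly, but the worker applies them at a capped rate:

```python
import queue
import threading
import time

writes = queue.Queue(maxsize=10_000)   # bounded: overflow becomes backpressure
applied = []

def apply_write(w):
    applied.append(w)                  # stand-in for the real database write

def worker(drain_per_sec):
    interval = 1.0 / drain_per_sec
    while True:
        w = writes.get()
        if w is None:                  # sentinel shuts the worker down
            break
        apply_write(w)
        time.sleep(interval)           # cap the rate the DB actually sees

t = threading.Thread(target=worker, args=(10_000,))
t.start()
for i in range(50):                    # burst: 50 writes arrive at once
    writes.put(i)
writes.put(None)
t.join()
# All 50 writes were applied, in order, at the worker's pace - not the burst's
```

The bounded queue is the key design choice: when producers outrun the drain rate for too long, `put` blocks (or you reject), which is backpressure rather than silent data loss.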
Load shedding - When overwhelmed, drop the least valuable writes. Uber: drop location updates that arrive seconds apart (the next update is fresher anyway). Analytics: drop impressions before dropping clicks.
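The location-update example can be sketched as a per-key rate filter. The gap threshold and field names are illustrative assumptions:

```python
MIN_GAP = 4.0        # seconds that must pass between kept updates per rider

last_kept = {}       # rider_id -> timestamp of the last update we wrote

def should_keep(rider_id: str, ts: float) -> bool:
    prev = last_kept.get(rider_id)
    if prev is not None and ts - prev < MIN_GAP:
        return False             # shed: too close to the previous write
    last_kept[rider_id] = ts
    return True

updates = [("r1", 0.0), ("r1", 1.5), ("r1", 5.0), ("r2", 1.0)]
kept = [(r, t) for r, t in updates if should_keep(r, t)]
# r1's update at t=1.5 is shed (only 1.5s after t=0.0); the rest are written
```

Note this is deliberate, value-based dropping at the edge, which is different from a queue silently overflowing: you choose which writes matter least.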
Batching and Hierarchical Aggregation
Batching - Group writes to amortize per-write overhead. The application can batch, or an intermediate service can (e.g. a like-counter that folds 100 likes into one count update). The database can also batch its flushes.
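A sketch of the like-counter idea: increments accumulate in memory and flush as one merged write per key once the batch fills. The class and its flush behavior are hypothetical, not a real library:

```python
from collections import Counter

class LikeBatcher:
    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.pending = Counter()   # post_id -> pending increment
        self.count = 0             # total likes buffered so far

    def like(self, post_id):
        self.pending[post_id] += 1
        self.count += 1
        if self.count >= self.batch_size:
            return self.flush()    # batch full: emit merged writes
        return None

    def flush(self):
        merged = dict(self.pending)   # one entry per post, not per like
        self.pending.clear()
        self.count = 0
        return merged                 # in a real system: one DB update per key

b = LikeBatcher(batch_size=3)
b.like("post:1")
b.like("post:1")
batch = b.like("post:2")   # third like fills the batch and triggers the flush
```

Here 3 likes became 2 writes; with `batch_size=100` and a hot post, 100 likes can collapse into a single counter update. A production version would also flush on a timer so a slow trickle doesn't sit buffered forever.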
Hierarchical aggregation - For massive fan-in (millions of viewers, millions of comments), aggregate at intermediate nodes. Write processors merge updates before forwarding them to the root, reducing the write count at each hop.
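The fan-in idea in miniature, with two leaf aggregators and one root (event names are made up). Each hop forwards pre-reduced partial counts instead of raw events:

```python
from collections import Counter

def aggregate(events):
    # Leaf-level reduce: N raw events become one count per distinct key.
    return Counter(events)

leaf_a = aggregate(["video:7", "video:7", "video:9"])   # 3 events -> 2 entries
leaf_b = aggregate(["video:7", "video:9", "video:9"])   # 3 events -> 2 entries

# Fan-in at the root: merge partial counts, never the raw event stream.
root = leaf_a + leaf_b
# 6 raw events reduced to 2 writes at the root: video:7 -> 3, video:9 -> 3
```

At real scale each layer cuts write volume by the average number of duplicate keys per window, which is why view counters and Top-K pipelines use aggregation trees rather than writing every event to the root store.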
When to Use in Interviews
Use when the problem is write-heavy: social posting, location updates, metrics, ad clicks, search indexing. Proactively identify write bottlenecks.
When NOT to use: Read-heavy systems. Modest write volume where vertical scaling suffices.
Summary
Vertical - Hardware and write-optimized DBs first
Sharding - Partition key, consistent hashing
Bursts - Queues, load shedding
Volume - Batching, hierarchical aggregation
{{SUBSCRIBE}}
{{BUTTON:Read More Articles|https://systemdesignlaws.xyz}}


