Writing About Data-Driven System Design
Looking at system design through data flow, so you can design your own systems better.
1. Scalability
- Vertical vs. Horizontal Scaling
- Partitioning: Divide and Conquer
- Replication: Redundancy for Speed and Safety
- Synchronous Replication
- Asynchronous Replication
- Leader-Follower Model
- Caching: The Shortcut to Scale
  - In-Memory Caches
  - Edge Caches
  - Application Caches
  - TTL (Time to Live)
  - Write-through
  - Cache-aside
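
The caching strategies listed above (TTL, write-through, cache-aside) can be sketched in a few lines. This is a minimal single-process illustration, not a production cache: `TTLCache`, `read_cache_aside`, and `write_through` are hypothetical names, the "database" is a plain dict, and the clock is injectable so expiry can be demonstrated without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after being written."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)


def read_cache_aside(cache: TTLCache, db: Dict[str, str], key: str) -> str:
    """Cache-aside: the application checks the cache, then falls back to the store."""
    value = cache.get(key)
    if value is None:
        value = db[key]        # miss: read from the source of truth
        cache.put(key, value)  # populate the cache for later reads
    return value


def write_through(cache: TTLCache, db: Dict[str, str], key: str, value: str) -> None:
    """Write-through: every write updates the store and the cache together."""
    db[key] = value
    cache.put(key, value)
```

With cache-aside, a value changed directly in the store stays stale in the cache until its TTL runs out. Write-through avoids that for writes that go through the application, at the cost of touching both the store and the cache on every write.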
- Elasticity: Scaling as a Living System
  - Auto-scaling groups
  - Serverless platforms (event-driven)
  - Container orchestration (Kubernetes)
  - Observability and automation
- Load Distribution and Partitioning
  - Traffic Distribution
  - Data Distribution
  - Task Distribution
- Sharding: Divide and Conquer
  - Throughput
  - Storage
  - Isolation
  - Consistent Hashing
  - Data Locality
  - Balancing the System: Avoiding Hotspots
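
Consistent hashing is easier to see in code than in a bullet. The sketch below is one common way to build a hash ring with virtual nodes. The class name `HashRing`, the shard names, and the choice of 100 virtual nodes per shard are assumptions made for illustration, and it assumes the ring is never empty when `lookup` is called.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Maps keys to nodes; each node owns many virtual points on the ring."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._points: List[int] = []      # sorted ring positions
        self._owner: Dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        # Assumes at least one node has been added.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

The property that matters for sharding is that adding a shard only moves keys onto the new shard; keys never shuffle between the existing ones. Spreading each shard over many virtual points is also what helps avoid hotspots, since no single shard ends up owning one large arc of the ring.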
2. Bottlenecks & Backpressure
- Buffering: The Elastic Pause
- Batching: Efficiency Through Grouping
- Flow Control Strategies
- Credit-Based Flow Control: the consumer grants credits up to the maximum number of messages it can handle.
- Rate Limiting
- Reactive Streams: the consumer requests N messages -> the producer sends at most N messages.
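
Credit-based flow control and the Reactive Streams request-N pattern share one idea: the consumer announces how much it can take, and the producer never sends more than that. A minimal single-threaded sketch follows; `CreditedChannel` and `pump` are hypothetical names, and a real system would run the two sides concurrently.

```python
from collections import deque
from typing import Deque, List


class CreditedChannel:
    """The consumer grants credits; the producer may send one message per credit."""

    def __init__(self) -> None:
        self.credits = 0
        self.in_flight: Deque[str] = deque()

    def grant(self, n: int) -> None:
        """Consumer side: announce capacity for n more messages."""
        self.credits += n

    def try_send(self, message: str) -> bool:
        """Producer side: send only if a credit is available."""
        if self.credits == 0:
            return False  # backpressure: the producer must wait
        self.credits -= 1
        self.in_flight.append(message)
        return True

    def receive(self) -> str:
        """Consumer side: take the next delivered message."""
        return self.in_flight.popleft()


def pump(channel: CreditedChannel, pending: List[str]) -> int:
    """Send as many pending messages as credits allow; return how many were sent."""
    sent = 0
    while pending and channel.try_send(pending[0]):
        pending.pop(0)
        sent += 1
    return sent
```

Because the producer can only spend credits the consumer has granted, the number of undelivered messages is bounded by the consumer's own choice rather than by the producer's speed.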
- Consumer Lag
  - Use bounded queues
  - Monitor queue depth, latency, and throughput
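
A bounded queue with a few counters covers both points about consumer lag: the bound keeps a slow consumer from causing unbounded memory growth, and the counters expose how far behind it is. The sketch below uses the standard-library `queue.Queue`; the `BoundedBuffer` wrapper and its field names are assumptions for illustration.

```python
import queue


class BoundedBuffer:
    """Fixed-capacity buffer that rejects new items when full and tracks lag."""

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=capacity)
        self.produced = 0
        self.consumed = 0
        self.rejected = 0

    def offer(self, item: str) -> bool:
        """Producer side: enqueue without blocking; False signals backpressure."""
        try:
            self._queue.put_nowait(item)
        except queue.Full:
            self.rejected += 1
            return False
        self.produced += 1
        return True

    def poll(self) -> str:
        """Consumer side: dequeue without blocking (raises queue.Empty if empty)."""
        item = self._queue.get_nowait()
        self.consumed += 1
        return item

    @property
    def lag(self) -> int:
        """Items accepted but not yet consumed, i.e. the current queue depth."""
        return self.produced - self.consumed
```

Rejecting on a full queue is only one policy. Blocking the producer or dropping the oldest item are alternatives, and which one fits depends on whether losing or delaying data is the lesser problem.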
3. Consistency
- CAP Theorem
- Eventual Consistency
- Consensus: The Raft Algorithm and Paxos
- Leader election
- Log replication
- Safety guarantees
- Coordination Beyond Consensus
  - Distributed Locks
  - Barriers
  - Leases
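
A lease is a lock that expires on its own, so a crashed holder cannot block everyone else forever. The sketch below keeps the lease state in one process for clarity. In a real deployment that state would live in a coordination service, and clock drift between machines would need handling. `LeaseLock` and the worker names are hypothetical.

```python
import time
from typing import Callable, Optional


class LeaseLock:
    """A lock that is held only until its lease expires or is released."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def acquire(self, client: str, ttl: float) -> bool:
        """Take or renew the lease; fails while another client's lease is live."""
        now = self.clock()
        held_by_other = (
            self.holder is not None
            and self.holder != client
            and now < self.expires_at
        )
        if held_by_other:
            return False
        self.holder = client
        self.expires_at = now + ttl
        return True

    def release(self, client: str) -> None:
        """Give up the lease early; a no-op for anyone but the current holder."""
        if self.holder == client:
            self.holder = None
```

The holder is expected to renew by calling `acquire` again before the lease runs out. If it stops renewing, the next caller takes over once the expiry time has passed.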
- Quorums: The Math Behind Consensus
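
The quorum math is short enough to check directly. With N replicas, a read quorum of size R and a write quorum of size W are guaranteed to share at least one replica exactly when R + W > N, and a majority quorum is N // 2 + 1. The sketch below verifies that rule by brute force for small N; the function names are assumptions for illustration.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest group size such that any two groups of that size overlap."""
    return n // 2 + 1


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """The quorum rule: every read set meets every write set iff r + w > n."""
    return r + w > n


def every_pair_intersects(n: int, r: int, w: int) -> bool:
    """Brute-force check: does every read set share a replica with every write set?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )
```

For example, with N = 3, choosing R = 2 and W = 2 guarantees that a read contacts at least one replica holding the latest acknowledged write, while R = 1 and W = 2 does not.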