Writing about data-driven system design

Describing system design in terms of data flow, to understand it and design your own systems better.

1. Scalability

  1. Vertical vs. Horizontal Scaling

  2. Partitioning — Divide and Conquer

  3. Replication — Redundancy for Speed and Safety

  • Synchronous Replication

  • Asynchronous Replication

  • Leader-Follower Model

  4. Caching — The Shortcut to Scale
  • In-Memory Caches

  • Edge Caches

  • Application Caches

  • TTL (Time to Live)

  • Write-through

  • Cache-aside
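The cache-aside and TTL items above can be sketched together in a few lines. This is a minimal in-process sketch, not a production cache: the `CacheAside` class, its `load` callback, and the injectable `clock` are hypothetical names introduced for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Cache-aside reads with a per-entry TTL (hypothetical helper)."""

    def __init__(self, load: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # fetches the value from the source of truth
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                      # hit: entry is still fresh
        value = self._load(key)                  # miss or expired: read the store
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the store so the next read reloads the value."""
        self._entries.pop(key, None)
```

In cache-aside the application owns both steps: it reads through `get` and calls `invalidate` after a write. A write-through cache would instead update the cache and the store in the same write path.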

  5. Elasticity — Scaling as a Living System
  • Auto-scaling groups

  • Serverless platforms: event-driven.

  • Container orchestration (Kubernetes)

  • Observability and automation.

  6. Load Distribution and Partitioning
  • Traffic Distribution

  • Data Distribution

  • Task Distribution

  7. Sharding — Divide and Conquer
  • Throughput

  • Storage

  • Isolation

  8. Consistent Hashing
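As a rough illustration of consistent hashing, the sketch below places each node at many points on a hash ring and assigns a key to the first point clockwise from the key's hash. The `HashRing` class, the `vnodes` parameter, and the choice of MD5 as a stable hash are assumptions made for this example. Real systems differ in hash function and virtual-node count.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    # MD5 is used only as a stable, well-spread hash, not for security.
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[int] = []      # sorted ring positions
        self._owner: Dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

The property that matters is that removing a node only moves the keys that node owned. With plain `hash(key) % n` routing, changing `n` would remap most keys.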

  9. Data Locality

  10. Balancing the System — Avoiding Hotspots

2. Bottlenecks & Backpressures

  1. Buffering — The Elastic Pause

  2. Batching — Efficiency Through Grouping
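Batching can be as small as a generator that groups a stream into fixed-size lists, so that one downstream call handles many items. The `batched` helper below is a hypothetical name for this sketch. Python 3.12 also ships `itertools.batched`, which yields tuples instead of lists.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group items into lists of at most `size` elements, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final, possibly smaller, batch
```

For example, `list(batched(range(5), 2))` gives `[[0, 1], [2, 3], [4]]`. A real pipeline would usually also flush on a timer so a slow stream does not hold items indefinitely.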

  3. Flow Control Strategies

  • Credit-Based Flow Control: the consumer advertises the maximum number of messages it can handle.

  • Rate Limiting

  • Reactive Streams: the consumer requests N messages -> the producer sends at most N messages.
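The credit-based idea, which is also the core of demand signalling in Reactive Streams, can be sketched as a producer that sends only what the consumer has asked for. `CreditedProducer`, `grant`, and `send` are hypothetical names for this illustration and do not come from any library.

```python
from collections import deque
from typing import Deque, List


class CreditedProducer:
    """Sends only as many items as the consumer has granted credits for."""

    def __init__(self, items: List[str]) -> None:
        self._pending: Deque[str] = deque(items)
        self._credits = 0

    def grant(self, n: int) -> None:
        """Called by the consumer: 'I can handle n more items'."""
        if n <= 0:
            raise ValueError("credits must be positive")
        self._credits += n

    def send(self) -> List[str]:
        """Emit up to the outstanding credit; each item sent spends one credit."""
        batch: List[str] = []
        while self._credits > 0 and self._pending:
            batch.append(self._pending.popleft())
            self._credits -= 1
        return batch
```

With no credits the producer sends nothing, so a slow consumer throttles the producer simply by not granting more.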

  4. Consumer Lag
  • Use bounded queues

  • Monitor queue depth, latency, and throughput
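A bounded queue turns consumer lag into an explicit signal instead of unbounded memory growth. The sketch below uses the standard library `queue.Queue` with `maxsize`. The `offer` helper is a hypothetical name, and what the caller does on rejection (drop, retry, slow down) is left open.

```python
import queue


def offer(q: "queue.Queue[str]", item: str) -> bool:
    """Try to enqueue without blocking; False signals backpressure to the caller."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


# A queue that holds at most two items; qsize() gives the depth to monitor.
q: "queue.Queue[str]" = queue.Queue(maxsize=2)
accepted = [offer(q, name) for name in ("a", "b", "c")]
depth = q.qsize()
```

Here the third `offer` is rejected because the queue is full, and `depth` is the value a monitoring system would export as queue depth.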

3. Consistency

  1. CAP Theorem

  2. Eventual Consistency

  3. Consensus — The Raft and Paxos Algorithms

  • Leader election

  • Log replication

  • Safety guarantees

  4. Coordination Beyond Consensus
  • Distributed Locks

  • Barriers

  • Leases
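A lease is a lock with an expiry: if the holder stops renewing it, ownership lapses on its own. The sketch below is a single-process illustration only. The `Lease` class and its injectable `clock` are hypothetical, and a real distributed lease would live in a shared store and has to account for clock drift between machines.

```python
import time
from typing import Callable, Optional


class Lease:
    """A lock that expires by itself unless the holder renews it."""

    def __init__(self, duration: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._duration = duration
        self._clock = clock               # injectable so expiry can be tested
        self._holder: Optional[str] = None
        self._expires_at = 0.0

    def acquire(self, owner: str) -> bool:
        """Take the lease if it is free or expired; the current holder renews it."""
        now = self._clock()
        if self._holder is None or now >= self._expires_at or self._holder == owner:
            self._holder = owner
            self._expires_at = now + self._duration
            return True
        return False

    def holder(self) -> Optional[str]:
        """Current holder, or None once the lease has expired."""
        return self._holder if self._clock() < self._expires_at else None
```

Unlike a plain distributed lock, a crashed holder cannot block others forever, because the lease becomes available again after `duration` without a renewal.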

  5. Quorums — The Math Behind Consensus
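The arithmetic behind quorums is short enough to write down. With N replicas, a write quorum of W and a read quorum of R are guaranteed to share at least one replica when R + W > N, and a majority quorum is floor(N/2) + 1. The function names below are hypothetical helpers for this sketch.

```python
def majority(n: int) -> int:
    """Smallest group size such that any two such groups of n replicas overlap."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n // 2 + 1


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n
```

For example, with N = 3, choosing W = 2 and R = 2 overlaps, while W = 1 and R = 1 does not, so a read may miss the latest write.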
May 14, 2026