Writing about data-driven system design

Writing out a system design by its data flow helps you understand the system and design it better.

1. Scalability

  1. Vertical vs. Horizontal Scaling

  2. Partitioning — Divide and Conquer

  3. Replication — Redundancy for Speed and Safety

    • Synchronous Replication

    • Asynchronous Replication

    • Leader-Follower Model

  4. Caching — The Shortcut to Scale

    • In-Memory Caches

    • Edge Caches

    • Application Caches

    • TTL (Time to Live)

    • Write-through

    • Cache-aside

  5. Elasticity — Scaling as a Living System

    • Auto-scaling groups

    • Serverless platforms (event-driven)

    • Container orchestration (Kubernetes)

    • Observability and automation

  6. Load Distribution and Partitioning

    • Traffic Distribution

    • Data Distribution

    • Task Distribution

  7. Sharding — Divide and Conquer

    • Throughput

    • Storage

    • Isolation

  8. Consistent Hashing

  9. Data Locality

  10. Balancing the System — Avoiding Hotspots

2. Bottlenecks & Backpressure

  1. Buffering — The Elastic Pause

  2. Batching — Efficiency Through Grouping

  3. Flow Control Strategies

    • Credit-Based Flow Control: the consumer grants credits for the maximum number of messages it can handle.

    • Rate Limiting

    • Reactive Streams: the consumer requests N messages -> the producer sends at most N messages.

  4. Consumer Lag

    • Use bounded queues

    • Monitor queue depth, latency, and throughput

3. Consistency

  1. CAP Theorem

  2. Eventual Consistency

  3. Consensus Algorithms: Raft and Paxos

    • Leader election

    • Log replication

    • Safety guarantees

  4. Coordination Beyond Consensus

    • Distributed Locks

    • Barriers

    • Leases

  5. Quorums — The Math Behind Consensus

4. Reliability and Fault Tolerance
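
To make the flow control strategies in section 2 more concrete, here is a minimal sketch of credit-based flow control. It assumes a single producer and a single consumer in one process, and that one credit equals one free buffer slot. The names `CreditedConsumer` and `send_with_credits` are hypothetical and chosen only for illustration; a real system would exchange credits over the network.

```python
from collections import deque


class CreditedConsumer:
    """Hypothetical consumer that grants credits; one credit permits one message."""

    def __init__(self, capacity):
        self.capacity = capacity      # most messages the consumer will buffer
        self.buffer = deque()         # received but not yet processed
        self.processed = []           # messages already handled

    def available_credits(self):
        # Assumption: credits equal free buffer slots; zero means "stop sending".
        return self.capacity - len(self.buffer)

    def receive(self, message):
        if self.available_credits() <= 0:
            raise RuntimeError("producer sent a message without credit")
        self.buffer.append(message)

    def process(self, n):
        # Handle up to n buffered messages, which frees the same number of credits.
        for _ in range(min(n, len(self.buffer))):
            self.processed.append(self.buffer.popleft())


def send_with_credits(pending, consumer):
    """Send pending messages only while the consumer still has credits."""
    sent = 0
    while pending and consumer.available_credits() > 0:
        consumer.receive(pending.popleft())
        sent += 1
    return sent


pending = deque(range(10))
consumer = CreditedConsumer(capacity=3)
first = send_with_credits(pending, consumer)   # stops when credits run out
consumer.process(2)                            # processing frees two credits
second = send_with_credits(pending, consumer)  # resumes with the freed credits
```

The producer never outruns the consumer: it stops as soon as credits reach zero and resumes only after the consumer has processed messages, so the buffer stays bounded at `capacity`.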

May 14, 2026