LEARNING PATH · Systems & Backend

Reliability & Scale (SRE)

For engineers who keep systems up as traffic grows.

The reliability engineer’s toolkit: turn nines into real downtime, size capacity before it bites, then study the patterns — quorums, rate limiting, caching, queues — that absorb load and keep services alive under pressure.

  • Translate availability targets into real downtime budgets
  • Estimate capacity and headroom before launch
  • Tune quorum and consistency trade-offs deliberately
  • Apply rate limiting, caching and queues to shed and absorb load
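
As a rough illustration of the first goal above, the arithmetic behind a downtime budget can be sketched in a few lines. This is a minimal sketch, not the path's calculator: the function name `downtime_budget_seconds` and the 365-day default period are assumptions made for the example.

```python
def downtime_budget_seconds(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime (in seconds) allowed by an availability target.

    Hypothetical helper for illustration; the 365-day default is an assumption.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    # Fraction of the period the service is allowed to be unavailable.
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_days * 24 * 60 * 60


if __name__ == "__main__":
    # Compare a few common targets over one 365-day year.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget_seconds(target) / 60
        print(f"{target}% -> {minutes:.1f} minutes per year")
```

Under these assumptions, 99.9% over a 365-day year works out to roughly 8.76 hours of allowed downtime, and each extra nine cuts the budget by a factor of ten.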
  1. Handbook

    The System Design Fundamentals Handbook

    Ground the vocabulary first.

  2. Tool · optional

    Availability (Nines) Calculator

    Turn nines into real downtime.

  3. Tool · optional

    Interactive Capacity Estimator

    Size load before it bites.

  4. Handbook

    Partitioning, Sharding & Replication

    Scale and survive node loss.

  5. Tool · optional

    Quorum (N/R/W) Explorer

    Tune N/R/W consistency vs availability.

  6. System Design

    Design a Rate Limiter

    Shed load and protect services.

  7. System Design

    Design a Distributed Cache

    Absorb read load at scale.

  8. System Design

    Design a Message Queue

    Decouple and ride out traffic spikes.
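
The N/R/W trade-off named in step 5 can be made concrete with a short sketch. It assumes the usual replicated-store model: N replicas, a read waits for R replies, a write waits for W acknowledgements. The function names are hypothetical and are not taken from the Quorum Explorer tool.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum.

    This is the R + W > N condition: a read then contacts at least one
    replica that acknowledged the most recent successful write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that can be down while both reads and writes still reach their quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    # The larger of the two quorums is the one that runs out of replicas first.
    return n - max(r, w)


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 1), (5, 1, 5)]:
        print(n, r, w, quorums_overlap(n, r, w), tolerated_failures(n, r, w))
```

With N=3, R=2, W=2 the quorums overlap and one replica can fail. With N=3, R=1, W=1 two replicas can fail, but a read may miss the latest write. That is the consistency-versus-availability dial this path asks you to tune deliberately.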
