05
Distributed locks & fencing tokens
When the contended resource lives across services, you reach for a distributed lock (Redis, ZooKeeper, etcd). These are far trickier than in-process locks, because a holder can pause — a long GC, a network stall — past the lock's expiry, then wake up still believing it holds the lock while someone else has taken it.
The fix is a fencing token: a monotonically increasing number issued with each lock grant. The protected resource records the highest token it has seen and rejects any write carrying a lower one, so a stale holder's writes bounce even though that holder still believes its lock is valid.
Figure: Fencing token rejects the stale holder. Client 1 (token 33) → pauses (GC) → Client 2 (token 34) → Storage: 33 < 34, reject Client 1.
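The scenario in the figure can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the API of Redis, ZooKeeper, or etcd: `LockService`, `FencedStore`, `StaleTokenError`, and `demo` are hypothetical names, time is passed in explicitly so the GC pause can be simulated, and the counter starts at 33 only to match the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class StaleTokenError(Exception):
    """Raised when a write carries a token lower than one already accepted."""


@dataclass
class LockService:
    """Toy lock service (hypothetical): every grant is a lease with a fresh token."""
    lease_seconds: float
    next_token: int = 33            # assumption: starts at 33 to mirror the figure
    holder: Optional[str] = None
    expires_at: float = 0.0

    def acquire(self, client: str, now: float) -> Optional[int]:
        # Refuse while another client's lease is still running.
        if self.holder is not None and now < self.expires_at:
            return None
        token = self.next_token
        self.next_token += 1        # monotonically increasing across grants
        self.holder = client
        self.expires_at = now + self.lease_seconds
        return token


@dataclass
class FencedStore:
    """Protected resource (hypothetical): remembers the highest token accepted."""
    highest_token: int = 0
    data: Dict[str, str] = field(default_factory=dict)

    def write(self, key: str, value: str, token: int) -> None:
        # Reject anything older than the newest token seen so far.
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.data[key] = value


def demo() -> List[str]:
    locks = LockService(lease_seconds=10.0)
    store = FencedStore()
    log: List[str] = []

    t1 = locks.acquire("client-1", now=0.0)    # granted token 33
    # client-1 now pauses (GC) past its 10 s lease and writes nothing.
    t2 = locks.acquire("client-2", now=15.0)   # lease expired, granted token 34
    store.write("k", "from client-2", t2)
    log.append(f"client-2 wrote with token {t2}")

    try:
        # client-1 wakes up still believing it holds the lock.
        store.write("k", "from client-1", t1)
    except StaleTokenError as exc:
        log.append(f"client-1 rejected: {exc}")
    return log
```

Calling `demo()` returns two log lines: the accepted write from client-2 and the rejection of client-1's stale write. Note that the check lives in the storage layer, not in the client. A real store would also need to make the compare-and-update of the highest token atomic, which this single-threaded sketch does not attempt.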
→ Key insight
A lock with a timeout is a lease, not a guarantee. Without a fencing token, a lock that can expire may have two clients acting as its holder at once: the one whose lease lapsed while it was paused, and the one granted the lock afterwards. Idempotency (see consensus & coordination) is the other half of staying correct under retries.