Caches trade certainty about freshness for amortized cost and latency. The failure mode is seldom an obviously bad hit ratio; it surfaces as subtle revenue leaks (stale pricing), silent policy violations (ACL decisions served after revocation), and intermittent bugs that vanish from logs before you can correlate them.
Name your layers deliberately
| Layer | Typically owns | Primary invalidation knobs |
|---|---|---|
| Browser | Private assets | Cache-Control, ETag |
| CDN / Edge | Anonymous GET bodies | TTL, surrogate tags, purge APIs |
| Reverse proxy | Compression plus micro-cache | keyed routes, timeouts |
| App server | Aggregated computations | TTL plus events plus manual bust |
| Data store replicas | Reads behind leaders | replication lag allowances |
Skipping explicit ownership breeds cross-team friction: one team purges the reverse proxy while the CDN, run by another team, keeps serving the old blob.
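As a concrete example of the browser layer's knobs, here is a minimal sketch of conditional revalidation with ETag and Cache-Control. The header names and 304 semantics follow HTTP; the handler shape and hash truncation are illustrative, not a specific framework's API.

```python
import hashlib

def etag_for(body: bytes) -> str:
    # Strong ETag derived from content; any stable digest works.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match):
    """Return (status, headers, body), honoring conditional revalidation."""
    tag = etag_for(body)
    headers = {"ETag": tag, "Cache-Control": "private, max-age=60"}
    if if_none_match == tag:
        return 304, headers, b""   # client's copy is still fresh; skip the body
    return 200, headers, body

status, headers, _ = respond(b"logo-v1", None)           # first fetch: 200
status2, _, _ = respond(b"logo-v1", headers["ETag"])     # revalidation: 304
```

Within `max-age` the browser serves its copy without asking; after that it revalidates cheaply instead of re-downloading.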
Stampede control
Thundering herds hit the origin when hot keys expire together. Mitigations include probabilistic early refresh, locking or single-flight wrappers, jittered TTLs, and a thin layer of in-process memoization for bursty identical reads.
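Three of those mitigations can be combined in one small wrapper. A sketch under stated assumptions: `loader` is your hypothetical origin fetch, the early-refresh formula follows the XFetch idea (refresh probability rises near expiry, scaled by how long the value took to compute), and the lock table is in-process only.

```python
import math
import random
import threading
import time

_cache: dict = {}        # key -> (value, expires_at, compute_delta)
_locks: dict = {}        # key -> per-key lock for single-flight
_locks_guard = threading.Lock()

def get(key, loader, ttl=60.0, beta=1.0):
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None:
        value, expires_at, delta = entry
        # Probabilistic early refresh: the closer to expiry, the likelier
        # one caller recomputes early, so hot keys don't expire in unison.
        if now + delta * beta * -math.log(1.0 - random.random()) < expires_at:
            return value
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:                      # single-flight: one recompute per key
        entry = _cache.get(key)     # another caller may have refreshed it
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        start = time.monotonic()
        value = loader(key)
        delta = time.monotonic() - start
        jittered_ttl = ttl * random.uniform(0.9, 1.1)   # jittered TTL
        _cache[key] = (value, time.monotonic() + jittered_ttl, delta)
        return value
```

The jitter spreads natural expiries; the early refresh spreads them further for hot keys; the lock guarantees at most one origin call per key even when both fail.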
A raw hit rate without context can lie. Cheap static assets tolerate low hit ratios; authoritative pricing reads do not.
Domain-aligned keys
Use version namespaces inside keys (e.g. `pricing:v2026-03-01`), include tenant identifiers, and avoid accidental cross-tenant reuse.
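A sketch of such a key builder; the schema constant and field layout are illustrative, not a fixed convention:

```python
PRICING_SCHEMA = "v2026-03-01"   # bump on any incompatible shape change

def cache_key(domain: str, schema: str, tenant: str, entity_id: str) -> str:
    # The tenant is part of the key, so tenants can never share entries;
    # the schema version namespaces away stale shapes after deploys.
    return f"{domain}:{schema}:tenant={tenant}:{entity_id}"

key = cache_key("pricing", PRICING_SCHEMA, "acme", "sku-123")
# -> "pricing:v2026-03-01:tenant=acme:sku-123"
```

Bumping the schema constant is also a cheap global bust: old entries are simply never read again and age out on their own.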
Prefer explicit event-driven eviction (publish “invalidate X” after writes) instead of guessing a global TTL and hoping the business agrees.
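A minimal sketch of that write-then-invalidate flow. The in-process bus stands in for a real transport (Redis pub/sub, Kafka, a CDN purge API); topic names and the `db` dict are illustrative.

```python
from collections import defaultdict

class InvalidationBus:
    """Tiny in-process stand-in for a real pub/sub system."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, key):
        for handler in self._subscribers[topic]:
            handler(key)

bus = InvalidationBus()
db = {"sku-123": 999}            # stand-in for the authoritative store
price_cache = dict(db)           # warmed cache copy

# Each caching layer subscribes and drops its own copy when told.
bus.subscribe("invalidate.pricing", lambda key: price_cache.pop(key, None))

def update_price(sku, cents):
    db[sku] = cents                          # 1. write the source of truth
    bus.publish("invalidate.pricing", sku)   # 2. then announce the change

update_price("sku-123", 1099)
# price_cache no longer holds the stale 999; the next read repopulates
```

The point is ordering and ownership: the writer announces the change once, and every layer that cached the value is responsible for reacting, instead of each guessing a TTL.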
Observability
Measure origin offload, how often stale data is explicitly served under policy, and eviction latency—not vanity counters alone. Tie spikes to deploys; partial rollouts often break warming assumptions.
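A minimal sketch of the counters those metrics imply; the class and field names are illustrative, and a real setup would export them to your metrics system rather than hold them in memory.

```python
class CacheStats:
    """Counters for origin offload, policy-stale serves, and eviction lag."""
    def __init__(self):
        self.hits = 0
        self.misses = 0
        self.stale_served = 0          # stale responses knowingly served under policy
        self.eviction_latencies = []   # seconds from invalidate publish to eviction

    def record(self, hit: bool, stale: bool = False):
        if hit:
            self.hits += 1
            if stale:
                self.stale_served += 1
        else:
            self.misses += 1

    def origin_offload(self) -> float:
        # Fraction of reads the origin never saw.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

stats = CacheStats()
for hit in (True, True, True, False):
    stats.record(hit)
stats.origin_offload()   # 0.75
```

Segmenting these counters by deploy version is what makes the "tie spikes to deploys" step possible.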
Document staleness budgets in runbooks—for example “search index may lag writes by up to 60s”—so incidents inherit memory instead of improvisation.
Caching is a product decision encoded in infrastructure. Treat it that way.