
Failure is inevitable: Learning from a large outage, and building for reliability in depth at Datadog
After a major outage, we re-architected Datadog systems to degrade gracefully under failure. Here's what we learned and how we're building forward.