How We Use Vale to Improve Our Documentation Editing Process
Learn how Datadog’s Documentation team uses a linter to shift quality left.
How we implemented CPU and wall time profiling in our .NET continuous profiler.
Our .NET profiler was designed and implemented to run 24/7 in production, at any scale, with negligible impact. Here are the details of how we built it.
How Datadog's Frontend DevX team migrated a codebase from flaky, hard-to-maintain acceptance testing with Puppeteer to more robust Synthetic tests.
This post walks through how we restored our platform after it was affected by the outage of March 8, 2023.
This post sketches out our incident response process, where it succeeded and where it stumbled on March 8, and what we learned along the way.
Learn how we tackled a case of high network latency in our usage estimation platform that required a multi-layered solution.
A deep dive into what happened at the platform level during the outage of March 8, 2023.
Learn how we developed a new scheduling algorithm for data fetching and rendering and how we built it for use across our suite of Datadog products.
A closer look at storage routing in Husky, Datadog's third-generation event storage system.
We’ve recently improved the raw performance of the Datadog Agent, leading to 20% less CPU use on Agents flooded with custom metrics.
Learn about Datadog's repeatable design elements, which we've documented in DRUIDS, our design style guide.