Engineering | Datadog Official Blog

2023-03-08 Incident: A Deep Dive into Our Incident Response

This post sketches out our incident response process, where it succeeded and where it stumbled on March 8, and what we learned along the way.

Not Just Another Network Latency Issue: How We Unraveled a Series of Hidden Bottlenecks

Learn how we tackled a case of high network latency in our usage estimation platform that required a multi-layered solution.

2023-03-08 Incident: A Deep Dive into the Platform-level Impact

A deep dive into what happened at the platform level during the outage of March 8, 2023.

Making Fetch Happen - Building a General-purpose Query & Render Scheduler

Learn how we developed a new scheduling algorithm for data fetching and rendering and how we built it for use across our suite of Datadog products.

Husky: Exactly-Once Ingestion and Multi-Tenancy at Scale

A closer look at storage routing in Husky, Datadog's third-generation event storage system.

Performance Improvements in the Datadog Agent Metrics Pipeline

We’ve recently improved the raw performance of the Datadog Agent, leading to 20% less CPU use on Agents flooded with custom metrics.

DRUIDS, the Design System that Powers Datadog

Learn about Datadog's repeatable design elements, which we've documented in a design style guide called DRUIDS.

Engineering Spotlight: Tay Nishimura

Introducing Husky, Datadog's Third-Generation Event Store

Husky is an unbundled, distributed, schemaless, vectorized column store. Here's how we built it—and why.

How Datadog's IT Team Automated Account Inactivity and SaaS Spend Management

Employees at modern software companies rely on many third-party applications to do their jobs. Learn how Datadog's IT team expanded Clarity to automatically monitor these accounts for inactivity and optimize how much we spend on them.

It's always DNS . . . except when it's not: A deep dive through gRPC, Kubernetes, and AWS networking

The story of a seemingly simple issue that led us into the hidden complexities of gRPC, DNS, and Kubernetes.

Using the Dirty Pipe Vulnerability to Break Out from Containers

See Datadog's proof-of-concept exploit for breaking out from unprivileged containers using the Dirty Pipe vulnerability.