Maintaining performant service communication is critical to ensuring that our customers are able to best support their customers through our platform, so we needed a way to detect when DNS issues occurred—quickly and reliably.
Vicente De Luca
Principal Engineer at Zendesk
We really like how Datadog's NPM product is tied into the rest of the platform. Being able to monitor network traffic all the way down to our containers with Datadog has helped us identify improvements and optimizations across our platform.
Principal SRE at Cvent
Network monitoring with Datadog provides full visibility into every layer of your cloud, on-premise, or hybrid environment. Network Performance Monitoring lets you monitor your network architecture alongside application, infrastructure, and DNS performance for faster troubleshooting. Network Device Monitoring provides insight into the health and performance of bare-metal devices such as routers, firewalls, and switches.
Live network mapping and analysis with Network Performance Monitoring
- Analyze traffic as it flows across applications, containers, availability zones, and on-premise servers
- Identify cross-pod communication issues and troubleshoot inefficient load balancing
- Pinpoint the services and teams responsible for abnormal spikes in traffic
Analyze network traffic between meaningful endpoints—not just IPs
- Monitor the health of traffic between any two endpoints at the app, IP, port, and PID levels
- View communication between services, pods, cloud regions, and cloud resources
- Track key network metrics like TCP retransmits, latency, and connection churn
Deep visibility into DNS performance
- Analyze system-wide DNS performance without having to SSH into individual machines
- Assess DNS server health with request volume, response time, and error code metrics
- Differentiate between client-side code errors and server-side DNS failures
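To make the distinction between client-side errors and server-side failures concrete, here is a minimal Python sketch that groups DNS response codes (the RCODE values defined in RFC 1035) by where the problem most likely lies. It is illustrative only and does not show how Datadog classifies DNS traffic: the function names and the client/server grouping are assumptions, and REFUSED in particular can reasonably be read either way.

```python
from collections import Counter

# RCODE values and names as defined in RFC 1035, section 4.1.1.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}

# Assumed grouping, for illustration only:
# a malformed query or a nonexistent name usually points at the caller,
# while a failing, unsupporting, or refusing server points at the DNS side.
CLIENT_SIDE = {"FORMERR", "NXDOMAIN"}
SERVER_SIDE = {"SERVFAIL", "NOTIMP", "REFUSED"}


def classify_rcode(rcode: int) -> str:
    """Return 'ok', 'client', 'server', or 'unknown' for a DNS RCODE."""
    name = RCODE_NAMES.get(rcode, "UNKNOWN")
    if name == "NOERROR":
        return "ok"
    if name in CLIENT_SIDE:
        return "client"
    if name in SERVER_SIDE:
        return "server"
    return "unknown"


def summarize(rcodes):
    """Count how many responses fall into each category."""
    return Counter(classify_rcode(r) for r in rcodes)
```

For example, `summarize([0, 0, 3, 2, 2])` counts two successful responses, one likely client-side error (NXDOMAIN), and two likely server-side failures (SERVFAIL).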
Health and performance insights into any on-premise device with Network Device Monitoring
- Autodiscover any device on any network or subnet, with support for brands like Cisco, HP, Dell, F5, Juniper, and more
- Aggregate metrics across all devices by location—and quickly drill down to view the health of specific interfaces
- Proactively monitor device health with machine learning–based alerts; forecast bandwidth utilization and get alerted before problems arise
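The idea of forecasting bandwidth utilization and alerting before a problem arises can be sketched with a simple least-squares trend line. This is a deliberately basic stand-in, not the machine learning used by Network Device Monitoring: the function names, the evenly spaced samples, and the threshold are all assumptions made for illustration.

```python
def linear_forecast(samples, steps_ahead):
    """Extrapolate a least-squares line fitted to evenly spaced samples.

    samples: utilization readings (for example, percent), oldest first.
    steps_ahead: how many sampling intervals past the last reading to predict.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    mean_x = (n - 1) / 2          # mean of the indices 0..n-1
    mean_y = sum(samples) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    return intercept + slope * (n - 1 + steps_ahead)


def should_alert(samples, steps_ahead, threshold):
    """Return True if the projected utilization reaches the threshold."""
    return linear_forecast(samples, steps_ahead) >= threshold
```

With readings of 10, 20, 30, and 40 percent, the fitted line rises 10 points per interval, so a threshold of 60 percent would be reached two intervals after the last reading. A real forecast would also need to handle seasonality and noise.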