Modern datacenters can contain thousands of network appliances, such as routers, switches, firewalls, and servers, so it’s important for your monitoring strategy to provide comprehensive visibility into every piece of your infrastructure. Datadog Network Device Monitoring already allows you to collect a wealth of telemetry from all of your SNMP-managed devices, which are automatically discovered by the Datadog Agent. Now, you can quickly visualize these metrics with out-of-the-box dashboards for datacenter and device performance, enabling you to identify and troubleshoot a range of issues on your on-prem or hybrid network before they impact your users.
The Datacenter Overview dashboard provides high-level, out-of-the-box visibility into the health and performance of your devices across all locations, including datacenters and campus sites. For instance, you can quickly see how many devices are up and responding, and view an inventory of all monitored devices, which is sorted by device uptime from lowest to highest. You can also use tag-based filters to zero in on devices from a particular location.
Keeping these metrics close at hand enables you to swiftly identify and investigate any concerning activity in your on-prem or hybrid network. For example, a sudden decrease in the total number of responding devices could signal a connectivity issue at a particular site. Similarly, if the uptime of every device at a site resets at the same time, it could indicate a sitewide power loss.
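To make the sitewide-reset signal concrete, here is a minimal Python sketch of the underlying check: compare two consecutive uptime polls and flag any site where every device's uptime went down. The function name, device names, and site mapping are hypothetical and used only for illustration; this is not how the dashboard itself is implemented.

```python
from collections import defaultdict


def sites_with_full_reset(previous, current, site_of):
    """Return the sites where every polled device's uptime decreased.

    previous, current: dicts mapping device name -> uptime in seconds.
    site_of: dict mapping device name -> site tag (hypothetical tagging scheme).
    """
    reset_flags = defaultdict(list)
    for device, uptime in current.items():
        if device in previous:
            # An uptime lower than the last poll means the device restarted.
            reset_flags[site_of[device]].append(uptime < previous[device])
    return sorted(site for site, flags in reset_flags.items() if all(flags))


# Made-up sample data: both NYC switches restarted, the SFO router did not.
previous = {"sw-nyc-1": 86_400, "sw-nyc-2": 90_000, "rtr-sfo-1": 50_000}
current = {"sw-nyc-1": 120, "sw-nyc-2": 95, "rtr-sfo-1": 50_300}
site_of = {"sw-nyc-1": "nyc", "sw-nyc-2": "nyc", "rtr-sfo-1": "sfo"}

print(sites_with_full_reset(previous, current, site_of))
```

One caveat: the standard SNMP sysUpTime object is a 32-bit counter of hundredths of a second, so it wraps roughly every 497 days. A real check would need to tell a wrap apart from a genuine restart.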
The Datacenter Overview dashboard also includes graphs that apply the forecast function to the inbound and outbound bandwidth metrics to predict, based on historical performance, when specific interfaces will exceed their available bandwidth. For example, if your team runs a routine remote backup on the last Friday of the month, but this month your general network usage has been much higher than usual, the forecast function will predict an upcoming spike in bandwidth when the backup is scheduled to occur. This increase in usage could lead to greater network latency, which could make your network unusable during the backup window.
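The idea behind this kind of prediction can be illustrated with a simple least-squares trend line. The Python sketch below is a deliberately simplified stand-in, not Datadog's forecast function: it fits a straight line to recent bandwidth samples and estimates how long until that line reaches the interface limit. The function name and the sample values are assumptions made for illustration.

```python
def seconds_until_limit(samples, limit):
    """Estimate seconds after the last sample until a linear trend hits `limit`.

    samples: list of (timestamp_seconds, bandwidth) pairs, oldest first.
    Returns None when the trend is flat or falling, or cannot be fitted.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, so no slope exists
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # usage is not growing
    intercept = mean_v - slope * mean_t
    crossing_time = (limit - intercept) / slope
    # Clamp to zero if the trend line is already at or above the limit.
    return max(0.0, crossing_time - samples[-1][0])


# Made-up hourly samples in Mbps on a link with 1,000 Mbps of capacity.
samples = [(0, 400), (3600, 450), (7200, 500), (10800, 550)]
remaining = seconds_until_limit(samples, limit=1000)
print(f"Trend reaches the limit in about {remaining / 3600:.1f} hours")
```

With this sample data, the fitted line crosses the limit roughly nine hours after the last sample. A straight line cannot capture recurring patterns such as a monthly backup, which is why a production forecast needs a model that accounts for seasonality.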
Additionally, you can create forecast monitors for these predictive models to make sure that you are alerted to any impending spikes. That way, you can address network issues before they occur and avoid any negative impact on your customers.
If you’re reviewing the Datacenter Overview dashboard and you notice that a specific device is consuming more resources than usual, is failing to respond to requests, or is forecast to use 100 percent of its available bandwidth, you can use a dashboard custom link to pivot to the Interface Performance dashboard. The Interface Performance dashboard visualizes key health and performance metrics for a specific device, such as the number of octets sent and received, the number of errors and discards on an interface, and total throughput. This device-level visibility allows you to determine whether, for example, an increase in network latency is the result of a spike in errors on a particular interface, or whether the device is discarding all received data.
The custom link you used as an entry point will preserve the template variables, tags, and attributes that you selected when viewing the Datacenter Overview dashboard, so the widgets on the Interface Performance dashboard will be pre-populated with data from the same context as your initial investigation.
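For readers who want to see how octet counts relate to bandwidth utilization, the following Python sketch shows the standard arithmetic: take the difference between two readings of a cumulative octet counter, convert it to bits per second, and divide by the interface speed. The function and parameter names are hypothetical, and the sketch assumes the counter wraps at most once between polls.

```python
def utilization_percent(octets_start, octets_end, interval_seconds,
                        speed_bps, counter_bits=64):
    """Return interface utilization as a percentage of link speed.

    octets_start, octets_end: two readings of a cumulative octet counter.
    interval_seconds: time elapsed between the two readings.
    speed_bps: interface speed in bits per second.
    counter_bits: counter width, 32 or 64 (assumes at most one wrap).
    """
    # Modular subtraction yields the right delta even if the counter wrapped.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / interval_seconds
    return 100.0 * bits_per_second / speed_bps


# Made-up readings: 750 MB received in 60 seconds on a 1 Gbps interface.
print(utilization_percent(1_000_000, 751_000_000, 60, 1_000_000_000))
```

On fast links a 32-bit octet counter can wrap within a single polling interval, so the 64-bit counters are generally the safer choice where a device exposes them.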
The new out-of-the-box dashboards for datacenters and devices help you visualize key metrics from your on-prem or hybrid network immediately after you install the Datadog Agent and configure the SNMP integration. If you’re already a Datadog customer and you’d like to get started with Datadog Network Device Monitoring, check out our documentation. New to Datadog? Get started with a 14-day free trial.