VLLM Observability & Monitoring | Datadog

vLLM Observability & Monitoring

Gain comprehensive visibility into the performance and resource usage of your LLM workloads.

dg/vllmheader

A unified monitoring platform provides full visibility into the health and performance of each layer of your environment at a glance. Datadog allows you to customize this insight to your stack by collecting and correlating data from more than 900 vendor-backed technologies, all in a single pane of glass. Easily monitor your underlying infrastructure, supporting services, applications alongside security data in one centralized monitoring platform.

 


Next-generation ML Monitoring

Monitor and your entire machine learning stack with Datadog.

watchdog-apm-illustration.png

AWS Trainium & Inferentia

Monitor and optimize deep learning workloads running on AWS AI chips

tracesearch-apm-illustrationv2.png

OpenAI

Monitor token consumption, API performance, and more.

servicemap-apm-illustration.png

NVIDIA DCGM Exporter

Gather metrics from NVIDIA’s discrete GPUs, essential to parallel computing.

Loved & Trusted by Thousands

Washington Post logo 21st Century Fox Home Entertainment logo Peloton logo Samsung logo Comcast logo Nginx logo

ML Monitoring Resources

Learn about how Datadog can help you monitor your entire AI stack.

Datadog AI Monitoring Starter Kit