Monitor machine learning models with Fiddler's offering in the Datadog Marketplace

Author: Aaron Kaplan

Published: June 14, 2023

With the growing utilization of AI, modern business applications rely more and more on machine learning (ML) models. But the complexity of these models poses significant challenges to data scientists, engineers, and MLOps teams seeking to maintain and optimize performance. Fiddler is an AI observability platform that continuously monitors all of your ML models in training and production, provides real-time alerts for performance and data issues, and enables you to effectively analyze the root causes of often hard-to-pin-down issues. By making ML models more transparent, Fiddler also makes them more reliable, helping to establish trust in AI among engineering and operations teams as well as business leaders and regulators.

We’re pleased to now offer an out-of-the-box (OOTB) Fiddler integration and software license in the Datadog Marketplace. This integration enables you to centralize your monitoring of ML models and the applications that utilize them within one unified platform.

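In this post, we’ll discuss how to continuously monitor and alert on ML model performance and how to determine the root causes of model performance issues.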

Continuously monitor and alert on ML model performance

The integration’s OOTB Fiddler dashboards enable ML teams to continually track performance, data drift, data integrity, and traffic for each of their models using a range of key metrics. By alerting on these metrics, ML teams can proactively respond to the first signs of degradation or data drift rather than discovering the inaccuracy of their models’ predictions only after it has become glaringly obvious. And by seamlessly correlating these insights with Datadog Application Performance Monitoring (APM) and Real User Monitoring (RUM), teams can centralize and deepen their visibility into the health and performance of their ML-based applications. This centralized visibility gives teams the context they need to effectively troubleshoot issues and optimize performance.

The out-of-the-box Fiddler integration dashboard foregrounds key model performance metrics

Teams can also create as many additional dashboards as they need and customize each one to track specific performance metrics for each of their models—or to track multiple models at once in order to continually compare their performance and usage.

Determine the root causes of model performance issues

This integration provides a wide range of key metrics that enable you to identify suboptimal model performance and get to the root of issues. Which metrics are most useful depends on the type of model being monitored, whether tabular or unstructured: binary and multi-class classification models, regression models, ranking models, and natural language processing (NLP) or computer vision (CV) models. Fiddler allows you to track several categories of metrics that reflect or affect your models’ performance:

  • Model performance metrics such as accuracy, precision, recall, and F1 scores
  • Data integrity metrics, which detect and measure frequently overlooked inconsistencies in the data ingested by models and the resulting errors (such as missing-value violations, broken data pipelines, or any other violating events)
  • Model drift metrics, which gauge discrepancies between the data that models have been trained on and the data they encounter in production
  • AI fairness metrics, which assess model bias caused by skewed training data
  • Service metrics, such as traffic
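To make two of these categories concrete, here is a minimal Python sketch of how performance metrics and a drift score can be computed by hand. It is an illustration only, not Fiddler's implementation: the function names are hypothetical, and Jensen-Shannon divergence is used as one common way to measure drift between training and production data, which is an assumption rather than something this post specifies.

```python
import math
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def js_divergence(baseline, production):
    """Jensen-Shannon divergence (log base 2, so the result lies in [0, 1])
    between the category frequencies of two non-empty samples.
    0 means identical distributions; 1 means no overlap at all."""
    categories = set(baseline) | set(production)
    b_counts, p_counts = Counter(baseline), Counter(production)
    b = {c: b_counts[c] / len(baseline) for c in categories}
    p = {c: p_counts[c] / len(production) for c in categories}
    m = {c: 0.5 * (b[c] + p[c]) for c in categories}

    def kl(x):
        # Kullback-Leibler divergence of x from the mixture m.
        return sum(x[c] * math.log2(x[c] / m[c]) for c in categories if x[c] > 0)

    return 0.5 * kl(b) + 0.5 * kl(p)


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    predictions = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(labels, predictions))

    training_feature = ["web", "web", "mobile", "mobile"]
    production_feature = ["web", "mobile", "mobile", "mobile"]
    print(js_divergence(training_feature, production_feature))
```

A drift score near 0 suggests that production data still resembles the training data, while a score that climbs over time is a signal to investigate before accuracy visibly degrades.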

Users can incorporate these metrics into their Datadog dashboards and set up alerts for specific thresholds and conditions. If needed, they can then pivot to the Fiddler platform for further in-depth analytics.
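The logic behind a threshold alert can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Datadog monitor API: the `ThresholdRule` class, the `evaluate` function, and the metric name `example.model.drift_score` are all hypothetical, and the rule simply averages the most recent points and compares the result against warning and critical thresholds.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ThresholdRule:
    metric: str      # hypothetical metric name, for illustration only
    warning: float   # average at or above this value yields "warn"
    critical: float  # average at or above this value yields "alert"
    window: int      # number of most recent points to average


def evaluate(rule, values):
    """Return "ok", "warn", "alert", or "no_data" for a series of metric values."""
    if len(values) < rule.window:
        return "no_data"
    recent_average = mean(values[-rule.window:])
    if recent_average >= rule.critical:
        return "alert"
    if recent_average >= rule.warning:
        return "warn"
    return "ok"


if __name__ == "__main__":
    rule = ThresholdRule(metric="example.model.drift_score",
                         warning=0.1, critical=0.2, window=3)
    for series in ([0.05, 0.06, 0.07], [0.05, 0.15, 0.2], [0.2, 0.25, 0.3]):
        print(rule.metric, series, evaluate(rule, series))
```

Averaging over a window rather than reacting to a single point is a deliberate choice in this sketch: it keeps one noisy measurement from triggering a notification while still catching sustained degradation.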

The Fiddler platform offers further insights into ML model performance

Beyond measuring model metrics, the Fiddler platform aims to make the often inscrutable inner workings of ML models intelligible using model diagnostics and explainability methods. By cultivating transparency in ML models, Fiddler can help ensure compliance and allay the hesitancy of business teams to rely on these models for business-critical applications.

Gain enhanced visibility into your ML models today

The Fiddler integration in the Datadog Marketplace enables you to proactively monitor the performance of your ML models so you can identify and troubleshoot issues before they take a toll on your applications and business. To get started, sign up for a 14-day trial of the software license in the Datadog Marketplace and install the free integration. If you’re new to Datadog, you can sign up for a 14-day free trial.

