Superwise is a monitoring platform that provides model observability for high-scale machine learning (ML) operations. Superwise provides teams with out-of-the-box (OOTB) metrics on their models’ production behavior, so they can effectively address drift, data quality issues, and other problems before they negatively impact the business.
Datadog’s Superwise integration automatically generates metrics based on the data entities that are specific to your model. Once you’ve set up the integration, your metrics will begin flowing into an OOTB dashboard in Datadog, so you can immediately visualize trends without any manual configuration. In this post, we’ll cover how to gain visibility into your model activity and drift with dashboards and how to configure incident monitoring with Superwise’s policies.
A model is only as accurate as the data set it was trained on, which makes it essential for your production data to closely mirror your model baseline. If your data distribution begins to drift from that baseline, your model’s prediction accuracy and recall may degrade. Superwise’s flexible policy builder offers a variety of templates that can help you monitor model drift and other issues. You can create customized policies based on a wide range of Superwise metrics, such as changes in your model’s performance, data quality, activity, and drift, as well as your own performance metrics and business KPIs.
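To make the idea of drift from a baseline concrete, here is a minimal sketch of one common drift score, the Population Stability Index (PSI). This post does not say which drift measure Superwise uses, so PSI is our own stand-in, and the function name, bin count, and smoothing constant are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def psi(baseline: Sequence[float], production: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a production sample.

    Illustrative only: this is a generic drift score, not Superwise's metric.
    """
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        total = len(values)
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(c / total, eps) for c in counts]

    p = proportions(baseline)
    q = proportions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in baseline]
print(psi(baseline, baseline))  # identical samples produce no drift
print(psi(baseline, shifted))   # a shifted sample produces a large score
```

A widely quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right cutoff depends on the feature and the model.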
Once you’ve configured a policy, Superwise will scan for anomalies within the chosen logic (policies support any combination of segments, metrics, and features, joined by and/or conditions), allowing you to dynamically monitor your model without having to manually configure thresholds. Performance degradation and prediction shift policies allow you to detect when your model falls below production-standard levels. You can also configure missing value and outlier policies to monitor the quality of your distribution data.
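As a rough illustration of monitoring without hand-set thresholds, the sketch below computes a missing-value ratio and flags a new reading as anomalous when it falls far outside that metric's own recent history. Superwise's actual anomaly detection is not described in this post, so the helper names and the "k standard deviations" rule are assumptions, not its implementation.

```python
import statistics
from typing import Optional, Sequence


def missing_ratio(values: Sequence[Optional[float]]) -> float:
    """Fraction of entries that are missing (None) in a batch of feature values."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def is_anomalous(history: Sequence[float], latest: float, k: float = 3.0) -> bool:
    """Flag `latest` if it lies more than k standard deviations from the history mean.

    The bound is derived from the metric's own past values, so no fixed
    threshold has to be configured by hand. (Hypothetical rule for illustration.)
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mean = statistics.fmean(history)
    std = statistics.stdev(history)
    if std == 0:
        return latest != mean
    return abs(latest - mean) > k * std


# Daily missing-value ratios for one feature, followed by a sudden jump.
daily_ratios = [0.010, 0.012, 0.009, 0.011, 0.010]
todays_batch = [1.0, None, 3.0, None]
print(is_anomalous(daily_ratios, missing_ratio(todays_batch)))
```

The same check works for any per-day metric, such as an outlier count or a drift score, because the bound adapts to each metric's own variability.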
The OOTB Superwise dashboard provides a quick overview of the number of active models, their drift, and total predictions over time. With these metrics, you can determine when high-impact models need to be retrained to improve their accuracy.
For a more flexible view, you can clone and customize the dashboard by adding or removing widgets to showcase metrics of your choice. For example, you can customize your dashboard to show overall input drift from a specific model or use it to monitor your models alongside other integrated services your infrastructure depends on.
If Superwise detects a violation of any of your configured policies, it will automatically create an incident. When correlated policy violations are detected, Superwise will aggregate them into a single incident to reduce noise and provide a focused view into your model’s issues. You have full control over which incidents get sent to Datadog to ensure that each incident reaches the appropriate channel. To help you keep track of Superwise incidents in Datadog, the OOTB dashboard widgets display the number of models with ongoing incidents and their incident distribution, and let you home in on individual incidents for more details.
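The noise-reduction idea behind aggregating violations can be sketched as follows. The post does not explain how Superwise decides that violations are correlated, so this example assumes a deliberately simple rule: violations on the same model that occur within one hour of each other belong to the same incident. The `Violation` and `Incident` types and the window length are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


@dataclass
class Violation:
    model: str
    policy: str
    detected_at: datetime


@dataclass
class Incident:
    model: str
    violations: List[Violation] = field(default_factory=list)


def group_into_incidents(violations: Iterable[Violation],
                         window: timedelta = timedelta(hours=1)) -> List[Incident]:
    """Merge violations on the same model into one incident when each follows
    the previous one within `window` (an assumed correlation rule)."""
    incidents: List[Incident] = []
    latest_by_model: Dict[str, Incident] = {}
    for v in sorted(violations, key=lambda item: item.detected_at):
        current = latest_by_model.get(v.model)
        if (current is not None
                and v.detected_at - current.violations[-1].detected_at <= window):
            current.violations.append(v)
        else:
            # Start a new incident for this model.
            current = Incident(model=v.model, violations=[v])
            incidents.append(current)
            latest_by_model[v.model] = current
    return incidents


t0 = datetime(2023, 1, 1, 9, 0)
events = [
    Violation("churn-model", "input drift", t0),
    Violation("churn-model", "missing values", t0 + timedelta(minutes=30)),
    Violation("fraud-model", "outliers", t0 + timedelta(minutes=10)),
    Violation("churn-model", "input drift", t0 + timedelta(hours=5)),
]
for incident in group_into_incidents(events):
    print(incident.model, [v.policy for v in incident.violations])
```

Here four violations collapse into three incidents: the two early `churn-model` violations are merged, while the later one opens a new incident because it falls outside the window.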
Once you’ve configured Datadog as a notification channel for Superwise incidents, they will begin to appear in Datadog Incident Management. With cross-platform visibility, your team can triage and analyze the downstream impact of model issues by correlating Superwise metrics with other data from your environment.
With Datadog’s Superwise integration, it is easier than ever to monitor your ML models at enterprise scale. You can now track trends in your models’ performance, data quality, and drift straight from Datadog, alongside the other services your infrastructure depends on. To get started, install the Superwise integration and sign up for a Superwise subscription from the Datadog Marketplace. If you’re not already a Datadog customer, sign up today with a free 14-day trial.
The ability to promote branded marketing tools is a membership benefit offered through the Datadog Partner Network. You can learn more about the Datadog Marketplace in this blog post. If you’re interested in developing an integration or application that you’d like to promote, you can contact us at firstname.lastname@example.org.