
Eric Metaj
Flaky tests are a significant source of inefficiency for many engineering teams. Beyond failing your build, they interrupt your development flow, generate excessive CI/CD noise, and, critically, erode developer trust in the test suite itself.
Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause. But identifying the problem is only the first step. Fixing the root cause still requires dedicated developer time to debug and rewrite the test code, and that recurring maintenance work takes time away from building.
That’s why the Bits AI Dev Agent deeply integrates with the rich observability data and historical test context of Test Optimization. When a flaky test is detected, the Bits AI Dev Agent categorizes the problem and autonomously generates a verified code fix packaged as a production-ready pull request (PR) for immediate review.
In this post, we’ll cover how the Bits AI Dev Agent and Datadog Test Optimization integrate directly within your existing development workflow to reduce the repetitive, time-consuming work of fixing flaky tests.
Integrate quality context into your workflows for reliable AI-driven fixes
While a generalized AI model may suggest a simple change based on common problems, it can’t understand why your specific test failed intermittently across different CI runs. Datadog Test Optimization already collects deep, high-signal data from your pipelines—such as historical run information, execution traces, and logs from the exact moment of failure—that allows our agents to do more than guess.
Datadog’s AI-powered Flaky Test Management takes this context further by handling the initial categorization, grouping failures by type. The Bits AI Dev Agent then uses this categorization to autonomously triage the issues, focusing only on the failure types where a code fix is likely to succeed. It uses the historical data and full execution traces to diagnose the root cause and generate a targeted code fix. Because the Bits AI Dev Agent works continuously in the background, these fixes are pre-generated as soon as a test flakes, so developers get verified pull requests immediately, without waiting for an on-demand AI to respond.
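To make the triage step concrete, here is a minimal Python sketch of grouping flaky tests by category and keeping only the categories considered fixable. It is an illustration only, not Datadog's implementation: the `FlakyTest` type, the category labels, and the `FIXABLE_CATEGORIES` set are all hypothetical, since this post does not list the categories the product actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels: the real categories are not named in this post.
FIXABLE_CATEGORIES = frozenset({"timing", "order_dependency"})


@dataclass(frozen=True)
class FlakyTest:
    """A flaky test with the category assigned by an upstream classifier."""
    name: str
    category: str


def group_by_category(tests):
    """Group test names by their failure category, preserving input order."""
    groups = defaultdict(list)
    for test in tests:
        groups[test.category].append(test.name)
    return dict(groups)


def select_for_fix(tests, fixable=FIXABLE_CATEGORIES):
    """Keep only the tests whose category is in the fixable set."""
    return [test.name for test in tests if test.category in fixable]
```

With this shape, a test categorized as `"network"` would still appear in the grouped view but would not be selected for an automated fix attempt, which mirrors the idea of skipping categories where a code change is unlikely to help.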

Instead of just generating code, the Bits AI Dev Agent helps you generate reliable fixes that are trusted, verified, and embedded into existing workflows.
Build trust with proactive verification in every PR
For any AI-generated code, verification is critical. A fix is only useful if you can trust that it won’t introduce new uncertainty. The Bits AI Dev Agent is engineered for developer confidence: it proactively verifies fixes before they enter the merge queue, and it limits its scope to high-confidence flakiness issues to minimize noise and unnecessary PRs.

This confidence stems from pre-merge validation, where the Bits AI Dev Agent runs an internal process using existing CI logic to check the generated fix against historical flakiness data. The Bits AI Dev Agent delivers this verified patch as an actionable draft PR, transparently labeled as an “Attempt to Fix,” complete with a contextual summary detailing the diagnosis, the fix applied, and the successful verification result (e.g., passing multiple consecutive runs).
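The "passing multiple consecutive runs" check mentioned above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the agent's actual validation logic: the function name `passes_consecutively` and the default of 20 runs are hypothetical, and a real pipeline would run the test through its existing CI setup rather than call it in-process.

```python
from typing import Callable


def passes_consecutively(test: Callable[[], None], runs: int = 20) -> bool:
    """Return True only if `test` completes without raising on every attempt.

    `runs` is a hypothetical default; the post does not say how many
    consecutive passes the agent requires.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    for _ in range(runs):
        try:
            test()
        except Exception:
            # A single failure (including AssertionError) means the fix
            # has not removed the flakiness.
            return False
    return True
```

Repeated passes raise confidence but cannot prove a test is stable: a test that fails once in a thousand runs will usually pass 20 in a row, so the run count is a trade-off between CI cost and certainty.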

Teams can enable auto-push settings that allow the Bits AI Dev Agent to send merge-ready PRs, push updates based on PR comments, and automatically resolve failed CI jobs. This process saves developer time, so you no longer need to annotate fixes, paste code, or manually retry CI jobs. You simply review a verified solution and merge.
Start fixing flaky tests with the Bits AI Dev Agent
By combining the autonomous resolution capabilities of the Bits AI Dev Agent with the flakiness detection of Test Optimization, Datadog covers the full cycle of test instability, from detection to fix. This means you can break out of the reactive maintenance cycle and reclaim your time for building.
You can try out the Bits AI Dev Agent by signing up for the Preview. To learn more about setting up your repository and enabling the agent, check out our documentation. Once enabled, the Bits AI Dev Agent will automatically identify flaky tests, generate validated fixes, and draft PRs for your team to review.
Start using the Bits AI Dev Agent today to spend less time triaging flaky tests and more time building. If you’re new to Datadog, sign up for a 14-day free trial to get started.





