Cleo Health builds trusted acute care AI solutions at scale using Datadog observability

The challenge of scaling AI in healthcare

Cleo Health builds AI-powered tools for emergency departments and inpatient hospital settings where patients are actively sick and systems have no margin for error. Focused exclusively on care delivered in a hospital setting, Cleo designs its platform for real-world clinical pressure.

“We sit squarely in acute care,” says CTO Ben Rosand. “These are urgent environments where things have to work, every time.”

What began as an AI scribe quickly expanded as customers asked Cleo to take on broader inpatient challenges. Today, the platform also supports charge capture, census management, clinical documentation integrity, and coding. Each capability must operate reliably across fast-paced, unpredictable workflows, often at the same time.

This growth has been driven by deep customer partnerships, particularly with large staffing organizations representing a significant share of emergency room physicians in the United States. As Cleo scaled across hundreds of hospitals and began supporting more than three million patients per year, the core challenge became clear. Cleo needed to deliver AI that performs consistently in the most demanding clinical environments, at scale, with no room for failure.

Powering AI for the most demanding moments

Operating in healthcare means trust is foundational. Cleo’s customers expect security, compliance, reliability, and auditability from day one. At the same time, Cleo runs many AI models and experiments simultaneously to provide real-time feedback that helps clinicians avoid missed details and reduce cognitive load.

“We run a lot of AI experiments at once,” says Rosand. “Datadog gives us the visibility we need to understand how our models behave in production and build our AI with confidence.”

As a bootstrapped company, Cleo intentionally keeps its team small to preserve speed and accountability. That makes fast detection and resolution of issues essential. “We want people empowered to move fast,” Rosand adds. “But we also need confidence that nothing is breaking underneath.” The Datadog API has also enabled rapid iteration of Cleo’s internal AI support agents, allowing the team to programmatically query observability data and continuously improve how issues are investigated and resolved.

Building with Datadog for Startups

Cleo adopted Datadog early through the Datadog for Startups (DDFS) program. As a bootstrapped company, cost was a real constraint. The program gave Cleo access to the full Datadog platform during its earliest stage, enabling the team to instrument deeply from the beginning.

“The fact that the Datadog for Startups program was free early on was huge for us,” Rosand says. “It gave us comprehensive visibility when we otherwise could not have afforded it.”

With prior experience using Datadog, the team relied on it from day one to understand system behavior across their Azure environment, spanning applications, infrastructure, networking, databases, and real user experience. Datadog quickly became central to how Cleo builds and operates.

As Cleo scaled, Datadog evolved from a debugging tool into a system of record. It is the first place teams look when a user reports an issue. Engineers use APM, RUM, logs, dashboards, and alerting to understand exactly what is happening and who is impacted. “Datadog is where everything comes together for us,” says Rosand. “It is the source of truth we trust.” RUM, in particular, enables Cleo to immediately trace reported issues back to specific user sessions, helping teams reproduce bugs and diagnose latency problems in real time.

Datadog data also feeds Cleo’s internal AI tooling. Long-running AI agents programmatically query logs, APM traces, RUM session data, and database metrics, dynamically choosing which signals to search based on the issue at hand, and combine the results with data from other internal systems to answer operational and customer questions quickly. For example, when a bug is reported, an agent can look up the affected user, identify the relevant features involved, analyze traces and logs to determine exactly what happened, and even review RUM sessions to understand front-end latency or interaction issues.
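To make the first step of that workflow concrete, here is a minimal sketch of how an agent might assemble a log search for one affected user against the Datadog Logs Search API (v2). It is an illustration under stated assumptions, not Cleo’s implementation: the function name, the service name, and the `@usr.id` attribute in the query are hypothetical, and the API host depends on the Datadog site an account uses. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Datadog Logs Search endpoint (v2). The host varies by Datadog site/region.
LOGS_SEARCH_URL = "https://api.datadoghq.com/api/v2/logs/events/search"


def build_log_search_request(user_id, service, api_key, app_key,
                             window="now-1h", limit=50):
    """Build (but do not send) a search for one user's recent error logs.

    The function name and the query attributes are illustrative assumptions;
    real attribute names depend on how an application tags its logs.
    """
    # Hypothetical query: error logs for one service, scoped to one user.
    query = f"service:{service} status:error @usr.id:{user_id}"
    payload = {
        "filter": {"query": query, "from": window, "to": "now"},
        "sort": "-timestamp",  # newest events first
        "page": {"limit": limit},
    }
    return urllib.request.Request(
        LOGS_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the JSON response carries matching log events in a `data` list, which an agent could then correlate with traces or RUM sessions for the same user. Keeping request construction separate from sending makes the query logic easy to inspect and test without credentials.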

Datadog is also vital for maintaining HIPAA compliance. Centralized log retention and management are key components of Cleo’s compliance posture. All logs are routed through Datadog to control, monitor, and aggregate data before being sent to deep storage, ensuring auditability and strong governance across environments. Protected Health Information (PHI) is intentionally excluded from these logs.

Clear results for teams and customers

With Datadog serving as the centralized observability platform behind its internal AI agent, Cleo reduced root cause analysis time for some issues from hours to minutes, a 90 percent drop. Faster investigation empowers support teams to respond independently and strengthens trust with enterprise healthcare customers. “Fast RCA changes everything,” Rosand notes. “It directly impacts how responsive we can be to our customers.”

Datadog also played a key role during major milestones, including system migrations and early database load challenges. Database Monitoring helped diagnose performance issues before they affected clinicians, reinforcing reliability as Cleo expanded across hundreds of hospitals.

That reliability has supported meaningful growth. This year, Cleo will serve over seven million patients across more than 400 hospitals. Cleo’s customers represent roughly 40% of Emergency Medicine clinicians and 15% of Hospital Medicine clinicians in the United States. The company ended 2025 with a Net Promoter Score (NPS) of 72, reflecting strong enterprise customer satisfaction driven by demonstrated outcomes. Notably, one partner health system achieved $243 in recovered revenue per inpatient stay via Cleo’s revenue cycle product line.

At the clinician level, the impact is equally measurable. After adopting Cleo, 94% of clinicians report reduced documentation burden, saving an average of 54 minutes per shift. Case studies have found that 90% of clinicians improve in patient experience evaluations after implementing Cleo’s real-time AI feedback.

Together, these results demonstrate how a strong observability foundation enables Cleo to scale responsibly while delivering measurable improvements for both health systems and frontline clinicians.

As Cleo continues to scale, the team plans to consolidate more tools into Datadog, expand logging and tracing, and deepen integration between internal AI systems and Datadog APIs.