The Monitor

Highlights from AWS re:Invent 2025: Making sense of applied AI, trust, and going faster

Andrew Krug

After four days of AWS re:Invent, a 65,000-step marathon alongside 60,000 attendees spread across five Las Vegas campuses, we emerged from the latest installment of this 13-year-old cloud pilgrimage a little dehydrated but significantly wiser. The volume of announcements felt less like a single flood and more like a river branching into three powerful currents, and making sense of this massive technological convergence requires zooming out.

Datadog at re:Invent 2025.

This post is your retroactive field guide to connecting the dots that define the future of the cloud. Below, we’ll dive into the biggest announcements from three crucial domains:

Applied intelligence, and the promise of agentic AI

Everyone is building AI agents, but how do you debug a nondeterministic black box? If one thing was clear in re:Invent’s opening keynote, it’s that builders have more capability than ever to train models, build agents, and provide value by using technologies like AWS Trainium 3 and Amazon Bedrock AgentCore. This evolving technology lowers the barrier to entry for almost anyone to build faster than ever before. But how do you hold these platforms accountable when it comes to reliability, safety, and security?

Operationalizing accountability

That open question about accountability, combined with the staggering MIT study from August 2025 showing that most business generative AI pilots are failing, makes increased observability and security controls a necessity. Last year at DASH, we launched LLM Observability, a way to continuously observe, secure, and act on findings from the LLMs you use. Just in time for re:Invent, we launched two features in LLM Observability that help builders see the full picture of their stack: support for Strands Agents and for Amazon Bedrock Agents, both generally available. And to cover both safety and security, we released a set of detection rules that help drive down risk in the configuration of Bedrock instances.

The AWS commitment to empowering their customers and partners to build safely on their AI platform was also echoed in the release of Bedrock AgentCore Policies, enabling teams to have fine-grained control of identity and access throughout their agentic workflows.

Experimenting at scale

When it comes to AI, builders now have more choice than ever. AWS dropped 18 fully managed open weight models for Bedrock—the largest simultaneous model expansion to date—spanning Gemma, Mistral, Qwen, NVIDIA Nemotron, and more. Pair that with the new serverless model customization capability in SageMaker AI, which lets teams fine-tune with reinforcement learning and direct preference optimization without managing infrastructure, and suddenly the number of variables explodes. This raises a new challenge: How do I tune all of these models for the best results?

LLM Observability now includes LLM Experiments and Playground, where you can experiment and optimize your LLM applications before pushing to production. In addition to model tuning, we released Prompt Tracking to help monitor performance, latency, and drift across prompt versions.
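
Under the hood, a prompt experiment is a simple loop: run each prompt version against the same evaluation set, score the outputs, and compare. A toy harness in plain Python (the stubbed `fake_model`, the prompt templates, and the exact-match scorer are illustrative stand-ins, not the LLM Experiments API):

```python
def score(output: str, expected: str) -> float:
    """Crude exact-match scoring; real experiments use richer evaluators."""
    return 1.0 if expected.lower() in output.lower() else 0.0

def run_experiment(model, prompts: dict, eval_set) -> dict:
    """Score every prompt version over the same eval set so results are comparable."""
    results = {}
    for version, template in prompts.items():
        scores = [score(model(template.format(q=q)), a) for q, a in eval_set]
        results[version] = sum(scores) / len(scores)
    return results

def fake_model(prompt: str) -> str:
    """Deterministic stub standing in for a real LLM call."""
    return "Paris" if "capital of France" in prompt else "unsure"

eval_set = [("capital of France?", "Paris"), ("capital of Atlantis?", "Atlantis")]
prompts = {
    "v1": "Answer briefly: {q}",
    "v2": "You are a geography expert. Answer briefly: {q}",
}
print(run_experiment(fake_model, prompts, eval_set))  # {'v1': 0.5, 'v2': 0.5}
```

The value of a managed platform is everything around this loop: versioning the prompts, tracking latency and drift over time, and running against real models at scale.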

Trust, and the necessary governance and security to scale safely

According to this year’s State of Cloud Security, over 84% of customers use more than one AWS account, and three in four AWS accounts are part of an AWS Organization. At this complexity and scale, security isn’t just about firewalls; it’s about identity, governance, and sprawl. Trust is the foundation that enables scale.

This year’s re:Invent was all about making security accessible and enabling interoperability through standards. We led a session on detection engineering at scale, highlighting the challenges of maintaining threat detections by hand and the benefits of Detection as Code (DaC). Before the conference even began, two major announcements landed in the days leading up to re:Invent, or as we call it, pre:Invent:

  • AWS IAM Policy Autopilot: An open source Model Context Protocol (MCP) server and command-line tool that helps your AI coding assistants quickly create baseline IAM policies that you can refine as your application evolves. Policy Autopilot supports all the major languages and IDEs, including Kiro.
  • AWS Security Agent: Launched in preview, this frontier agent is a step toward integrating more agentic application security into build workflows. Applying AI to triage findings as close to the IDE as possible will save AppSec teams a ton of time.

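To make “baseline IAM policies that you can refine” concrete, here is the general shape of a scoped-down policy, built as data in Python. The bucket name and action list are illustrative; this is not Policy Autopilot’s actual output:

```python
import json

def baseline_s3_read_policy(bucket: str) -> dict:
    """A least-privilege starting point: read-only access to a single bucket.
    A tool like IAM Policy Autopilot generates a baseline in this shape from
    your application code; you then tighten or extend it as the app evolves."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",       # bucket-level (ListBucket)
                    f"arn:aws:s3:::{bucket}/*",     # object-level (GetObject)
                ],
            }
        ],
    }

policy = baseline_s3_read_policy("example-app-data")
print(json.dumps(policy, indent=2))
```

The point of a baseline like this is that it starts narrow: one bucket, two actions, no wildcards on the action list, which is exactly what you want an AI assistant to hand you for refinement.
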
Prioritizing threats, not just alerts

IAM Policy Autopilot and the AWS Security Agent help teams build securely. Once you’re in production, you need visibility into how resources connect and which identities carry the most risk.

We launched two capabilities that help you understand your larger cloud footprint and prioritize risks using observability context. Security Graph visualizes critical attack paths, privilege escalation routes, and blast radii across your cloud infrastructure. A misconfigured Amazon S3 bucket becomes more than a compliance checkbox; you can see what data it exposes and who can reach it. And Cloud SIEM Risk Insights consolidates multiple data sources, such as SIEM threats and Cloud Security misconfigurations, to identify the riskiest IAM identities and prioritize fixes.
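
The S3 example comes down to a simple question: is any part of the bucket’s public-access configuration switched off? A simplified sketch of that check in plain Python, keyed to the four flags in S3’s Public Access Block settings (real attack-path analysis also weighs bucket policies, ACLs, and network reachability, which this ignores):

```python
# The four settings in an S3 PublicAccessBlockConfiguration.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)

def is_publicly_exposed(public_access_block: dict) -> bool:
    """Flag the bucket if any public-access-block setting is disabled or missing."""
    return not all(public_access_block.get(f, False) for f in REQUIRED_FLAGS)

locked_down = {f: True for f in REQUIRED_FLAGS}
leaky = dict(locked_down, BlockPublicPolicy=False)

print(is_publicly_exposed(locked_down))  # False
print(is_publicly_exposed(leaky))        # True
```

A graph-based view goes further than this boolean: it connects the exposed bucket to the identities and data behind it, which is what turns a checkbox into a prioritized finding.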

Enterprise velocity, and keeping pace with the agentic era

The agentic gold rush is changing both what we build and the speed at which we build it. When AI can generate code, summarize logs, and draft incident responses, the bottleneck shifts. It’s no longer the speed of writing but the speed of everything around the writing: reviewing code, deploying safely, catching vulnerabilities, and staying compliant. AI accelerates what’s possible, but your workflows decide what actually gets delivered.

Context without the context switch

A wave of re:Invent announcements centered on Kiro, AWS’s agentic IDE. Kiro focuses on moving from prototype to production, with AI agents that understand context across large codebases.

We launched three integrations to extend Kiro’s reach into production, bringing errors, deployments, traces, and incident context directly into the code editor:

  • Datadog MCP Server support in Kiro surfaces errors, deployments, and traces directly in your editor, shortening the feedback loop between code and production.
  • Our Datadog Kiro powers allow you to specialize your Kiro agents for observability use cases, enabling developers to customize their workflow for both speed and code quality.
  • Datadog MCP Server integration with AWS DevOps Agent allows the AWS DevOps Agent to learn your resources and relationships while correlating data from both AWS services and Datadog.

Deploy safely at speed

The Infrastructure Innovations keynote at re:Invent emphasized how developers are moving faster than ever. As AI-assisted coding accelerates what’s possible, the ability to experiment, iterate, and fail fast is essential.

But increased development velocity without guardrails is just risk with momentum. How do you ensure code is safe, deployments are solid, and the cycle keeps improving?

At re:Invent, we released new AI capabilities for Code Security to help developers catch vulnerabilities and fix them without slowing down. AI-driven detection and remediation detects code vulnerabilities, intelligently filters out false positives, and helps developers remediate at scale, while Secret Scanning detects and blocks exposed credentials leaked in code.

With your code secured, the next question is: How fast can you get it to production? AWS announced several capabilities that remove friction across the deployment life cycle:

  • Amazon ECS Express Mode lets you deploy containerized applications with a single command—load balancers, autoscaling, networking, and domains are provisioned automatically.
  • AWS Lambda durable functions bring automatic checkpointing and retries to long-running workflows, with execution that can pause for up to a year and resume exactly where it left off.
  • AWS Lambda Managed Instances removes the tradeoff between serverless simplicity and Amazon EC2 flexibility, allowing you to run Lambda functions on EC2 compute while AWS handles instance life cycle, patching, and scaling.
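
Durable functions are a managed take on a well-established pattern: checkpoint each completed step so a restarted workflow resumes where it left off instead of re-running everything. A minimal sketch of the pattern in plain Python (this illustrates the concept only and uses none of the Lambda API):

```python
import json
import pathlib
import tempfile

class DurableWorkflow:
    """Record each completed step in a checkpoint file; after a restart,
    completed steps are replayed from the checkpoint instead of re-run."""

    def __init__(self, checkpoint_path):
        self.path = pathlib.Path(checkpoint_path)
        self.done = json.loads(self.path.read_text()) if self.path.exists() else {}

    def step(self, name, fn):
        if name in self.done:                        # finished before the restart
            return self.done[name]
        result = fn()
        self.done[name] = result
        self.path.write_text(json.dumps(self.done))  # durable checkpoint
        return result

calls = []
def run(workflow):
    workflow.step("fetch", lambda: calls.append("fetch") or "data")
    return workflow.step("transform", lambda: calls.append("transform") or "result")

ckpt = pathlib.Path(tempfile.mkdtemp()) / "state.json"
run(DurableWorkflow(ckpt))        # first run executes both steps
out = run(DurableWorkflow(ckpt))  # "restart": both steps replayed from checkpoint
print(calls, out)                 # each step actually ran only once
```

The managed version handles what this sketch cannot: checkpoints that survive for up to a year, retries, and pausing across invocations, all without you owning the state store.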

As these new deployment models gain traction, observability has to keep pace. We shipped support for these releases so you can maintain visibility across the full life cycle.

When everything gets faster, measurement matters more

AWS spent the week removing friction between idea and production, with agentic IDEs that understand your codebase, container deploys that provision themselves, serverless workflows that can pause for a year and pick up where they left off, and more. That’s great for velocity, but it creates a challenge for platform teams: more paths to production means more surface area to standardize.

Our Internal Developer Portal gives platform teams a way to keep up. Developers get self-service access to provision infrastructure and spin up new services, while Scorecards make sure what they ship meets production-readiness standards before it goes out the door. The new options from AWS become paths you can actually govern instead of sprawl you have to chase.

And once everything’s flowing, DORA Metrics and CI Visibility tell you whether it’s actually working. Are deployments getting more frequent? Is the lead time for changes improving? Where are pipelines getting stuck? That’s how you turn a week of announcements into measurable progress.
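
Two of those DORA metrics fall straight out of your deploy records. A minimal computation over (commit time, deploy time) pairs, with synthetic data rather than anything pulled from an API:

```python
from datetime import datetime

# Synthetic deploy records: (commit authored, deployed to production)
deploys = [
    (datetime(2025, 12, 1, 9, 0), datetime(2025, 12, 1, 15, 0)),
    (datetime(2025, 12, 2, 10, 0), datetime(2025, 12, 2, 12, 0)),
    (datetime(2025, 12, 4, 8, 0), datetime(2025, 12, 4, 9, 30)),
]

def median_lead_time_hours(records):
    """DORA lead time for changes: commit to running in production."""
    hours = sorted((deployed - committed).total_seconds() / 3600
                   for committed, deployed in records)
    mid = len(hours) // 2
    return hours[mid] if len(hours) % 2 else (hours[mid - 1] + hours[mid]) / 2

def deploys_per_day(records):
    """DORA deployment frequency over the window spanned by the records."""
    days = (records[-1][1] - records[0][1]).days or 1
    return len(records) / days

print(median_lead_time_hours(deploys))  # 2.0
print(deploys_per_day(deploys))         # 1.5
```

The hard part isn’t the arithmetic; it’s collecting clean commit-to-deploy records across every pipeline, which is where CI Visibility earns its keep.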

On-site experience: Mandatory stopovers and the swag score

What really shapes the re:Invent attendee experience isn’t the scale but the mix of learning formats available throughout the week. This year, Builder Sessions let people get hands-on and build something in real time, while Chalk Talks created relaxed, whiteboard-based conversations where builders exchanged ideas and worked through real challenges together. Hands-on learning experiences like AWS GameDay provide the opportunity to solve real cloud engineering challenges in live environments, and this year, Datadog hosted a Cloud Architecture GameDay where brave players used observability to solve multiple challenges across a messy cloud environment.

Nick Hefty from Zendesk giving a builder session.

Datadog also had the opportunity to play a part in the overall experience with a couple of sessions that drew strong interest from builders. Nick Hefty from Zendesk shared how his team keeps large multi-tenant systems running smoothly by constantly testing performance, watching for noisy neighbors, and staying ahead of potential bottlenecks. On the AI side, Kunal Batra led a session on building observable AI agents with AWS Strands and Bedrock AgentCore, breaking down the key challenges of running these agents at scale and showing how to operationalize the GenAI Lens of the Well-Architected Framework to address them.

Of course, no re:Invent journey is complete without swag. Between the expo and AWS shop, we left Las Vegas with heavier luggage and a refreshed wardrobe.

See you next year

If there’s one takeaway from this year’s re:Invent, it’s that the era of siloed approaches is over. Agentic AI, security and governance, and developer velocity aren’t three separate initiatives. The future of cloud success belongs to unified platforms that treat these domains as a single operational challenge.

This was just a quick summary of everything that excited us at re:Invent 2025. Check out all of Datadog’s launches at re:Invent and read about even more announcements on the AWS News Blog. We hope to see you next year.
