Thanks to all who attended our second annual Dash conference! We hope that you enjoyed your time with us at New York City’s Chelsea Piers, and that you were able to learn about building and scaling systems and teams in our breakout sessions and workshops. For those of you who were unable to attend, we hope to see you next year. Check out some of the highlights from our two-day conference below.
Datadog co-founder and CTO Alexis Lê-Quôc kicked off the keynotes by explaining the driving forces behind Datadog’s new features: the desire to help users eliminate blind spots, get helpful context for troubleshooting, and seamlessly surface insights that would otherwise go unnoticed.
Alexis also reflected on the accelerated development and adoption of new technologies, likening this rapid evolution to a Cambrian explosion. He noted that as environments grow in scale, volatility, and complexity, observability also becomes more difficult—and yet, more crucial for keeping that complexity under control.
To address these observability challenges, Datadog partnered with several customers to design and develop expansions to the monitoring platform. During the keynotes, Datadog product managers announced new features and invited several customers—including Capital One, Hulu, and Thomson Reuters—to share their experiences as development partners or early beta users.
For the full roundup of the new Datadog products and features announced this year at Dash, please see our blog post here. Here are some highlights:
- Metrics without Limits™ and Tracing without Limits™: These two features allow customers to send all their metrics and traces to Datadog (and query them in real time)—yet only pay to index and retain their most important data. Using Metrics without Limits, customers can now send, aggregate, and manage custom metrics with unlimited cardinality tags, and choose to index only the metrics that they need. For Tracing without Limits, Datadog makes tail-based decisions at the end of the trace lifecycle, ensuring that teams have access to complete traces. Customers can decide which business-critical traces to index and retain for troubleshooting. For more details (and to enroll in the beta), please see our blog post.
Network Performance Monitoring: Inspired by our Service Map, Network Performance Monitoring provides multi-cloud visibility into the flow of network traffic across your environment. Filter by tags, from high-level indicators (such as availability zones) down to more granular criteria (like individual containers and processes). To learn more (and sign up for the beta), check out our blog post.
Metrics from Logs and Log Rehydration™: These two capabilities expand and enhance Datadog’s Logging without Limits™ feature set. The Metrics from Logs functionality allows users to build aggregated views of log data by creating a single metric to track log trends over time, while Log Rehydration™ enables you to quickly search for and retrieve archived logs. Take a look at our blog post for more information and to access the beta.
SLO Manager: An addition to our monitor uptime functionality, the SLO Manager allows you to see the status and error budget for all of your SLOs in one place. Plus, you can filter SLOs by various criteria, such as the team in charge, service, or time window. For more information—or to sign up for the beta—read our blog post.
Datadog had the privilege of hosting speakers from a wide range of organizations, including the BBC, Starbucks, and Comcast. Talks were divided into tracks such as Transformations, Performance, and Teams. Below, we’ll recap a handful of these talks, but you can check out the full list of videos here.
In the Transformations track, Betterment’s Sophia Russell (video) sketched out how her organization balanced innovation and compatibility by building advanced tools and systems, while maintaining support for legacy applications. Forrest Brazeal of Trek10 (video) conducted a deep dive into the human factor of serverless migrations, touched on the debate between refactoring versus rearchitecting, and shared effective strategies for getting buy-in on critical migration decisions from both colleagues and supervisors.
The Performance track included a talk from Datadog’s own Joel Barciauskas (video), who provided a behind-the-scenes view of how our metrics aggregation team captures and stores metrics quickly at scale. Zach McCormick described how Braze (video) used feature flags to ship new products and functionalities without accidentally breaking existing systems.
Bonnie Rhee from Flatiron Health (video) spoke about how her team overcame critical challenges to build AutoCrane, a Flask app for autoscaling their internal systems. Shopify’s Jason Hiltz-Laforge (video) shared lessons learned from running distributed development teams, and highlighted the importance of documenting everything, no matter how small it may seem.
At Dash, we hosted interactive workshops that gave attendees the chance to gain hands-on experience across multiple disciplines, from containers to serverless. These workshops included:
- Getting up and Running with Serverless: While serverless technologies allow organizations to reduce costs and latency while efficiently delivering new features, they also introduce new challenges for observability and operations. To get around these obstacles, Datadog’s Stephen Pinkerton walked attendees through the process of building a Lambda-based application on Node.js, demonstrating how to automate deployments and visualize serverless architectures with Datadog.
Ensuring Reliability with SLOs: In this workshop, Datadog and Google engineers demonstrated how to properly define SLOs to get deeper visibility into the end-user experience and to ensure reliability of dynamic infrastructure and applications. Participants conducted relevant exercises, including monitoring the right SLIs and using error budgets to respond to simulated chaos in a sample application.
Kubernetes Deep Dive: Catching and Preventing Failures: In this session, Datadog’s Compute Engineering team shared hard-won lessons from running production platforms and systems on Kubernetes. With the help of Datadog engineers, users instrumented a sandbox Kubernetes cluster, set up audit logging, and used Datadog log management to extract actionable insights from raw data.
Datadog 101: Led by the engineers who build, maintain, and support Datadog, this training session helped newcomers master key aspects of the Datadog platform by building useful dashboards, monitoring containers with Autodiscovery, and creating targeted alerts.
We were incredibly excited to meet members of the Datadog community at Dash 2019—both newcomers and returnees alike. Thanks to all of our speakers and attendees for joining us and for sharing your hard-earned knowledge and experiences along the way. We hope that you came away with practical skills to use in your own environments—and that you’ll join us for even more exciting speakers and experiences at next year’s Dash.