At the Datadog Summit, you will meet and learn from fellow community members, contributors, and Datadog staff. You'll get the latest product updates, learn how fellow community members are using Datadog to build cultures of observability, and participate in open discussions and guided training sessions. In the hands-on training, we'll walk through best practices for building better dashboards and alerts, developing custom agent integrations, and monitoring and instrumenting your containerized applications.
Datadog Welcome (Ilan Rabinovitch)
Keynote (Alexis Lê-Quôc)
Next Generation Datadog Agent (Greg Meyer)
First Look: Datadog Logs! (Renaud Boutet)
Predictive Alerting (Homin Lee)
From Monolith to Microservices at Caviar (Walter King)
Workshop: Datadog 101
Workshop: Hands on with APM and Distributed Tracing
Monitoring as Code at Atlassian (Brendan Shaklovitz)
Monitoring and managing Apache Cassandra (Joaquin Casares)
We're going to need a bigger barge: Advances in Container Monitoring (Michael Gerstenhaber)
Managing Docker and Mesos at HomeAway (Alan Scherger)
Workshop: Containers and Kubernetes Monitoring Best Practices
Showcasing Observability to Supervisors at Castlight Health (Cædman Oakley)
Closing (Ilan Rabinovitch)
Alan is a senior janitor (aka Cloud Platform Engineer) at HomeAway, where he’s built out an internal, multicloud platform as a service. He’s spent most of this year learning Golang and practicing hate-driven development with small contributions to open source.
Managing Docker and Mesos at HomeAway
Alexis is CTO and co-founder at Datadog, where he brings a strong focus on technical elegance and operational efficiency. Prior to founding Datadog, Alexis Lê-Quôc served as the Director of Operations for Wireless Generation where he built the team and infrastructure that served more than 4 million students in 49 states. As a member of the original 'devops' movement, Alexis spent several years as a software engineer at IBM Research, Neomeo and Orange.
Ilan Rabinovitch is director of product management and technical community at Datadog. Previously, Ilan spent a number of years leading infrastructure and reliability engineering teams at organizations such as Ooyala and Edmunds.com. He’s active in the open source and DevOps communities, where he is a co-organizer of events such as SCALE, Texas Linux Fest, and various DevOps Days events.
Homin Lee is lead data scientist for Datadog, where he writes algorithms that process trillions of data points a day. Prior to Datadog, Homin built large-scale machine-learning systems at several startups. Homin has a PhD from Columbia University in computational learning theory and was a Computing Innovation Fellow at the University of Texas at Austin.
Session: Automated alerts are essential to monitoring. In this session, hear how you can use Datadog's algorithmic monitors to alert on problems that require immediate attention. We'll cover cases where thresholds are breached, where metrics behave abnormally, and even where serious symptoms are predicted to develop soon.
Michael Gerstenhaber is a product manager at Datadog, building products for Process and Container monitoring, and working with our IoT partners to design solutions for monitoring smart devices. Michael was previously an engineer at Cisco Systems where he contributed to a series of network and data center management tools, and later made a stopover in the video game industry, serving as director of product at Happy Cloud.
Advances in Container Monitoring
Matt Williams is a Technical Evangelist on Datadog's community team. He is passionate about the power of monitoring and metrics to make large-scale systems stable and manageable. So he tours the country speaking and writing about monitoring with Datadog. When he's not on the road, he's coding. You can find Matt on Twitter at @Technovangelist.
Workshop: Datadog 101
Jason is a technical writer and evangelist at Datadog, where he works to inspire developers and ops engineers with the power of metrics and monitoring. He’s also a co-organizer of DevOpsDays Portland. When he’s not speaking at conferences or helping organize them, he likes to spend time on planes “travel hacking” and hunting for interesting, regional whiskey.
Brendan Shaklovitz has spent the last 3 years breaking things (on purpose). As a site reliability engineer focusing on improving service resilience, he has worked to simulate failures, automate incident processes, and implement new ways for teams to monitor and report on their services. He is currently working for Atlassian where he helps improve and maintain collaborative tools used by companies of all sizes. When he's not finding new ways to break services, he enjoys finding and fixing security bugs.
Session: Time is money and money is power. Why would you spend all your time, money, and power creating dashboards and monitors for your services, fixing them when someone else breaks them, and updating them at scale? You wouldn't. You'd use Terraform instead.
Monitoring as Code at Atlassian
Joaquin Casares started his career as the 9th employee of DataStax. For 2 years he handled all 200+ customers' support tickets as a Support Engineer. He trained a team of 5 as he moved into a Software Engineer-in-Test position to test and validate the Java, Python, and C# DataStax Drivers for Apache Cassandra. On-ramping a team of 11, he moved into a Demo Engineer position to help bridge the gap between Engineering, Marketing, and Sales. After DataStax, Joaquin moved to Umbel to overhaul a 5-year Cassandra project as part of a 3-person Architecture team. Within 4 months, workloads were running at 30% of their original time with 50% fewer workers. Now at The Last Pickle, a Cassandra consulting company, Joaquin continues to take on Operational Support duties for geo-distributed clients including a Tier 1 telco, global SaaS infrastructure, and sites with 100s of millions of users.
Session: Monitoring and managing distributed systems and datastores such as Cassandra bring a different set of challenges than monitoring traditional systems. We no longer have the luxury of maintaining and grooming unique individual snowflakes, but instead must ensure consistency and cull outliers from our clusters. In this session we will dive into a new set of monitoring best practices for Cassandra with Datadog, developed by The Last Pickle team. We will share lessons and tooling that will enable you to successfully scale C*. Though this session is specific to Cassandra, users of other distributed systems should find valuable lessons and best practices as well.
Monitoring and managing Apache Cassandra
I revel in solving the soft problems in the tech space: motivating teams and bringing DevOps culture to light the dark corners of software delivery. Currently I am an Engineering Leader using DevOps methods and collaboration to generate cross-functional productivity. My passion is unblocking teams and keeping them motivated to develop new and stable product on time and on budget. I have over 20 years of industry experience, including Quality Assurance, Development, Release Management, and Application Architecture.
Session: Monitoring and metrics are often the focus of IT teams as we look to identify issues and troubleshoot incidents. But how can we expand our efforts and bring this data-driven approach to our wider organizations? In this session the Castlight Health team will discuss their transition from monitoring to observability. Learn how their journey towards observability has allowed teams to move from reactive to proactive, reduce costs, and drive data-based conversations across traditional business and IT silos.
Showcasing Observability to Supervisors
From Monolith to Microservices
Renaud is a product director at Datadog focused on log management. Prior to joining Datadog, Renaud was co-founder & Chief Product Officer at Logmatic.io (acquired by Datadog). Prior to Logmatic.io, Renaud led development of high-performance Business Intelligence solutions for financial institutions.
Session: Get a first look at the upcoming release of Datadog Logs and our wider vision for a unified monitoring platform. Attendees will have an opportunity to dive into how the new logs offering integrates with the wider Datadog ecosystem, from metrics to traces and events.
First Look: Datadog Logs!
Greg is a Software Engineer at Datadog. He works on the Agent and Integrations. He lives in Brooklyn. And he likes using Linux on his laptops, because he's a masochist.
Next Generation Datadog Agent
Software Engineer and Open Source lover, Emanuele works on Datadog’s Application Performance Monitoring team. While playing with highly scalable systems, he improves and simplifies how a distributed system is traced across different languages. Without losing correctness, he likes writing efficient, readable and maintainable Python and Go code because elegance is not optional.
Hands on with APM and Distributed Tracing