Highlights from KubeCon + CloudNativeCon 2022

Author: Ara Pulido

Published: May 26, 2022

After two years of virtual editions, KubeCon + CloudNativeCon Europe returned as a hybrid event, with its in-person portion held in Valencia, Spain, from May 16-20.

As platinum sponsors of this year’s conference, Datadog held a booth where we showcased the latest updates to our Kubernetes monitoring solution, including the new Kubernetes resources overview, improved OpenTelemetry support, and the latest version of the Datadog Operator for Kubernetes.

Aside from the booth, some of our engineers delivered talks in the different tracks. Laurent Bernaille shared our experience adopting Cilium in the Maintainer track, and in the Networking track, Laurent and Elijah Andrews shared their findings on a difficult-to-debug incident at Datadog that felt like a DNS problem but turned out to be something else. Finally, Tabitha Sable gave an update on SIG Security as part of the Maintainer track.

But the best part of being back at KubeCon Europe in person was the opportunity to attend other talks, visit other booths, and, most importantly, talk with people across the community to share knowledge and understand where the ecosystem is heading. In this post, we’ll share some of our favorite speaker highlights from this past week at KubeCon + CloudNativeCon Europe 2022.

Software supply chain security

One theme we heard over and over again during the week wasn’t primarily about Kubernetes or even specific to cloud-native technologies. Instead, it was about modern software delivery and, more broadly, software supply chain security. The SolarWinds and Codecov supply chain attacks, along with the more recent Log4j vulnerability, have raised awareness of the need for better strategies and software solutions to mitigate the risks that come with increasing software architecture complexity.

During his keynote, infrastructure security engineer Shane Lawrence from Shopify made a compelling case for how important software supply chain security is for modern businesses. In the same way that the physical supply chain crisis doesn’t mean we should build factories for every component ourselves (we lack the expertise and knowledge to do so), Lawrence argued that we shouldn’t stop using third-party software to build our services; rather, we should understand the risks involved and how to mitigate them.

Outside the keynotes, several sessions covered strategies and tooling for properly signing images, obtaining a software bill of materials (SBOM), and performing frequent vulnerability scans. As an industry, we need to acknowledge the issue of software supply chain security and provide solutions and best practices. You can learn more about how Datadog signs the Datadog Agent and all its integrations in our blog.
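To make those practices concrete, here is an illustrative command-line sketch using three open source tools discussed at the conference: cosign (signing and verification), syft (SBOM generation), and grype (vulnerability scanning). The image name and key paths are placeholders, not a real registry.

```shell
# Sign an image with a local key pair (created with `cosign generate-key-pair`):
cosign sign --key cosign.key registry.example.com/myapp:1.0.0

# Verify the signature before deploying:
cosign verify --key cosign.pub registry.example.com/myapp:1.0.0

# Generate an SBOM with syft, then scan it for known vulnerabilities with grype:
syft registry.example.com/myapp:1.0.0 -o spdx-json > sbom.spdx.json
grype sbom:./sbom.spdx.json
```

Keyless signing (backed by an OIDC identity and a transparency log) is also available in cosign and removes the need to manage key files.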

Platform engineering

Another big topic during KubeCon was the concept of platform engineering. As software gets more complex and Kubernetes enters the enterprise, the idea of Kubernetes being a “platform for platforms” is becoming more and more ubiquitous.

To improve security and reliability, more teams are “shifting left,” making developers more aware of—and responsible for—those concerns earlier in the development process. However, developers need to continue to be productive and ship code, which brings us to the concept of platform engineering. As developers take responsibility for more pieces of the stack, traditional DevOps and SRE teams are starting to focus on building internal platforms on top of Kubernetes. In many cases, these internal platforms are known as Golden Paths: platform engineers make opinionated decisions on tooling and workflows, and then build a developer self-service platform around them. This enables development teams to deliver new features faster while ensuring that systems are secure and observable, and that they run on infrastructure the platform team has tested for their use case. Development teams retain the flexibility to step outside these Golden Paths if needed, but they then lose the support of the platform engineers.

Ben Hale, technical lead with VMware Tanzu, gave a keynote session explaining the concepts of a successful PlatformOps team, and Jens Erat, Peter Mueller, and Sabine Wolz from Mercedes-Benz explained their journey from traditional operations to creating an internal development platform on top of Kubernetes.

Outside the keynotes, Daniel Bryant, head of developer relations at Ambassador Labs, gave a full talk on platform engineering, reinforcing the idea that in order to shift responsibilities to the left, developers need guidance and support from this type of platform team, and that internal platforms need to be treated as products.

It seems that Kubernetes has unlocked the next step in the DevOps revolution, where traditional operations teams are now platform teams delivering an internal product: a developer user experience on top of Kubernetes.

Autoscaling, bin packing, and FinOps

Now that many companies have moved their workloads to Kubernetes, the next questions in the room are: how do I make the most of my compute resources? And how do I properly account for my cloud costs when running on Kubernetes?

There were several talks at KubeCon related to autoscaling in Kubernetes, both around horizontal pod autoscaling and cluster autoscaling. But the most interesting ones were on how to improve resource usage efficiency with better resource allocation, and how that relates to cloud cost management. Vincent Sevel, technical architect from Lombard Odier SA, shared how they used the Vertical Pod Autoscaler to improve resource allocation for their pods. On the FinOps side, Vanessa Kantner and Manuela Latz from Liquid Reply gave some practical tips on how engineers can manage cloud cost accountability when running on Kubernetes clusters.
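As a concrete illustration of the Vertical Pod Autoscaler approach, the sketch below is a hypothetical VPA manifest (the VPA is a separate project from the core Kubernetes distribution, under the `autoscaling.k8s.io` API group). The Deployment name and resource bounds are assumptions for the example.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # assumed Deployment name
  updatePolicy:
    updateMode: "Auto"   # "Off" only records recommendations without applying them
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```

Running a VPA in `"Off"` mode first is a common way to observe its recommendations before letting it resize pods automatically.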

Datadog helps customers reduce their Kubernetes cloud costs by providing the needed metrics to make bin packing and autoscaling decisions. You can read more about autoscaling your Kubernetes workloads with Datadog metrics in our blog post.

Unconventional use cases

Outside traditional stateless workloads, Kubernetes adoption is growing in verticals where workloads require specific hardware and more complex scheduling algorithms, like machine learning (ML), Big Data, or network functions virtualization (NFV).

These use cases may fall outside of what the default Kubernetes scheduler covers, and may require replacing it with a custom scheduler. Several custom schedulers were presented during KubeCon, including the Telemetry Aware Scheduler, a custom open source scheduler that uses telemetry data to drive scheduling decisions, presented by Madalina Lazar and Denisio Togashi from Intel; and KubeFlux, a high-performance computing (HPC) scheduler, presented by Claudia Misale from IBM and Daniel Milroy from the Lawrence Livermore National Laboratory.
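Whichever custom scheduler a cluster runs, Kubernetes lets individual workloads opt in via the pod spec’s `schedulerName` field. The sketch below is a hypothetical pod spec; the scheduler name and image are assumptions and must match what the custom scheduler actually registers itself under in your cluster.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-pod
spec:
  # Assumed name; pods without this field use "default-scheduler".
  schedulerName: telemetry-aware-scheduler
  containers:
    - name: trainer
      image: registry.example.com/ml-trainer:latest  # placeholder image
      resources:
        requests:
          cpu: "4"
          memory: 8Gi
```

Custom and default schedulers can run side by side this way, so only the workloads that need specialized placement opt in.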

Telecommunications is another vertical that Kubernetes is entering. These environments are traditionally not cloud native, and teams are building networking solutions to bridge the gaps. Christopher Dziomba and Marcel Fest from Deutsche Telekom shared how they are building a platform on top of Kubernetes for their 5G workloads.

Finally, Big Data and ML are other types of unconventional workloads that are now being deployed to Kubernetes. Bowen Li and Huichao Zhao from Apple shared (virtually) how they are running Apache Spark on Kubernetes, including details about how they autoscale their Spark clusters. Holden Karau, open source engineer from Netflix, explained how Apache Spark was connected to Kubeflow to diagnose COVID-19 based on CT scans.

A vibrant ecosystem and the road ahead

Although this list of key topics at KubeCon + CloudNativeCon Europe 2022 is not exhaustive, it shows that the Kubernetes and cloud-native ecosystem is as healthy as ever, and that it continues to evolve as more verticals and big enterprises make the transition. Datadog will continue to participate in KubeCon + CloudNativeCon and its ecosystem to ensure that we remain the best tool for monitoring your Kubernetes clusters—both on-prem and in the cloud—and their increasingly diverse workloads.