Skai Turns to Datadog to Track Mission-Critical Phases of Its AWS Migration
CASE STUDY

About Skai

Skai (formerly Kenshoo) is a leading omnichannel marketing platform that uniquely connects data and performance media for informed decisions, high efficiencies, and optimal returns.

Software Development

500–1,000 Employees

Tel Aviv, Israel

“Datadog’s UI and ecosystem enabled our platform and development teams to be more data-driven and create dashboards, use integrations, alert on issues, and show nice visuals of different aspects of applications.”

Erez Lotan
Chief Data Officer
Skai
Why Datadog?
  • Extensive AWS monitoring capabilities for cloud migration, including breadth and depth of integrations with core AWS services
  • Provides comprehensive visibility into on-prem and AWS environments, ensuring key applications that were being migrated maintained high availability 
  • Brings relevant metrics into a single, unified view without additional development effort, complemented by alerting features such as anomaly detection and change monitors
Challenge

Skai was migrating its on-prem applications to the cloud and found its previous monitoring tools insufficient. It was looking for a reliable solution that would track performance metrics and immediately alert teams to any issues.

Use case

AWS Integrations

Alerts

Application Performance Monitoring

Custom Metrics

Log Management

Error Tracking

Key Results
100% of cloud migration completed

All monolithic applications moved to the cloud

$800K annual savings

$500K from immediate error alerts and $300K from reducing throughput peaks

↑40% faster developer velocity

Increasing frequency of code releases

4x more attribution events processed

By moving to cloud-native Amazon Keyspaces (for Apache Cassandra)

Improving visibility during a cloud migration

Skai manages marketing channels for some of the world’s largest and most well-known brands. Doing so requires the company to process vast amounts of data across multiple systems. As the company has grown over the years, its IT complexity has also grown exponentially. Today, Skai runs at Internet scale, with more than $7 billion of ads run on its platform annually.

The company previously used five different observability tools, a setup that proved complicated and cumbersome. As a result, Skai’s development teams took too long to confirm that systems were running correctly. They were also in constant firefighting mode, which was bad for both clients and the business. The company’s Production Engineering and Infrastructure team, which consists of Genadi Tsvik, director of Production Engineering; Erez Lotan, chief data officer; and Danny Zalkind, senior director of Infrastructure Engineering, wanted to work together to tackle these challenges.

At the same time, the team sought to migrate to the cloud to enable modernization, lower total cost of ownership, and improve efficiency and cost optimization while hitting SLA goals. Migrating to the cloud would also enable its R&D organization to move fast and develop more features. But moving from on-prem to the cloud would be risky for the company, and any performance degradation was a deal breaker. To ensure a successful migration, the Skai team wanted to use multiple dimensions of metrics to monitor performance changes during the process and catch any issues.

Maintaining a high level of availability during migration

The Skai team selected Datadog to replace its existing observability tools. The team valued Datadog’s unified view, clean user interface, and easy-to-use dashboards. “We are a data company, and we like to be data driven,” says Lotan. “Datadog’s UI and ecosystem enabled our platform and development teams to be more data driven and create dashboards, use integrations, create alerting, and show nice visuals of different aspects of applications.”

The Skai team also found Datadog’s extensive AWS monitoring capabilities for its cloud migration initiative especially useful, including its integrations with core AWS services. A big part of Skai’s cloud migration process involved moving workloads to AWS-managed services like Amazon RDS, Amazon ElastiCache for Redis, and Amazon Keyspaces (for Apache Cassandra).

Thanks to Datadog’s AWS integrations, the Skai team had comprehensive visibility into both their on-prem and AWS environments, making sure they maintained service availability. “Datadog allows you to find what you’re looking for with little training and high efficiency, and have the confidence that everything is being monitored,” says Lotan. “If something is not working correctly, you will be notified and you have the tools to investigate.”

Reducing throughput peaks leads to savings

Operating in the cloud provides opportunities for efficiency and cost savings. Skai realized significant savings by using Datadog to track mission-critical phases of its migration and modernization efforts. For example, when the team ran Apache Cassandra on-prem, the service was not sensitive to read and write throughput at all. After moving to cloud-native Amazon Keyspaces, the team could select one of two throughput capacity modes for reads and writes: on-demand or provisioned.

By using Datadog to monitor the performance and resource usage of their application as they shifted to AWS, the Skai team was able to accurately measure their actual capacity needs and make data-driven design choices. “We initially opted for on-demand capacity mode with a pricing goal of achieving an 80 percent savings from the initial on-demand cost,” says Tsvik. “The Datadog Amazon Keyspaces integration effortlessly displays relevant metrics, requiring no additional development effort. With metrics and alerting features like anomaly detection and change alerts monitors, we confidently fine-tuned the cluster and swiftly transitioned to provisioned capacity mode. Datadog's Application Performance Monitoring (APM) of Amazon Keyspaces clients reduced throughput peaks, contributing to an annual cost savings of $300,000.”
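The trade-off between the two capacity modes can be illustrated with a minimal sketch. This is not Skai's or Datadog's actual method, and it does not reproduce Amazon Keyspaces pricing: the function names, the `headroom_pct` parameter, and all prices are hypothetical, chosen only to show why flattening throughput peaks lowers the cost of provisioned capacity while leaving pay-per-request cost unchanged.

```python
from math import ceil

def on_demand_cost(hourly_rps, price_per_million):
    """Pay-per-request model: cost scales with the total requests served.

    hourly_rps is a list of average requests per second, one entry per hour.
    price_per_million is a hypothetical price per million requests.
    """
    total_requests = sum(rps * 3600 for rps in hourly_rps)
    return total_requests / 1_000_000 * price_per_million

def provisioned_cost(hourly_rps, price_per_unit_hour, headroom_pct=20):
    """Fixed-capacity model: cost scales with the peak that must be covered.

    Capacity is sized to the highest observed rate plus a safety margin
    (headroom_pct, a hypothetical default) and is paid for every hour.
    """
    capacity_units = ceil(max(hourly_rps) * (100 + headroom_pct) / 100)
    return capacity_units * price_per_unit_hour * len(hourly_rps)

def cap_peaks(hourly_rps, ceiling):
    """Model client-side smoothing: work above the ceiling is deferred.

    Returns the smoothed series and any backlog still unserved at the end.
    """
    smoothed, backlog = [], 0.0
    for rps in hourly_rps:
        demand = rps + backlog
        served = min(demand, ceiling)
        backlog = demand - served
        smoothed.append(served)
    return smoothed, backlog

if __name__ == "__main__":
    # Hypothetical traffic with a single sharp peak.
    raw = [100, 100, 900, 100, 100, 100]
    smoothed, backlog = cap_peaks(raw, ceiling=300)
    print("smoothed series:", smoothed, "backlog:", backlog)
    print("provisioned, raw:     ", provisioned_cost(raw, price_per_unit_hour=1.0))
    print("provisioned, smoothed:", provisioned_cost(smoothed, price_per_unit_hour=1.0))
    print("on-demand, raw:       ", on_demand_cost(raw, price_per_million=1.0))
    print("on-demand, smoothed:  ", on_demand_cost(smoothed, price_per_million=1.0))
```

In this toy model the same total work is served in both cases, so the on-demand cost does not change, but the provisioned capacity only has to cover the lower, flattened peak. Measuring real peaks and averages is what makes such a sizing decision possible in practice.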

Additionally, the Skai team can now monitor parts of the business they weren’t able to before. For example, Datadog Error Tracking groups similar errors into issues, tracks those issues over time, provides comprehensive context for troubleshooting, and lets teams set monitors on error tracking events. “Immediate error alerts from Datadog, integrated with our 24/7 Network Operations Center team, have proven invaluable, helping us catch critical issues and preventing budgetary problems for our customers,” says Zalkind. “We estimated potential refund savings of $500,000 in the last year thanks to this Datadog feature.”

Skai’s cloud migration for its monolithic applications was successful—leading to 40 percent faster development velocity and four times more attribution events being processed. Datadog’s AWS monitoring capabilities have also given the Skai engineering team confidence that they have clear visibility into every aspect of their services. The company is now moving its main real-time analytics database, SingleStore, to the cloud using built-in Datadog integrations and dashboards to monitor its progress.

In the future, Skai will also use AWS and Datadog to address bottlenecks in its platform, using Datadog's APM module to identify hotspots in its microservices and backend services. Skai also plans to enhance the developer experience and improve troubleshooting with Datadog's Dynamic Instrumentation capabilities, which will allow R&D to debug at runtime and save developers a considerable amount of time.
