Google Cloud Monitoring

Introduction

Enterprises and startups face mounting pressure to deliver fast, reliable, and differentiated user experiences. Google Cloud provides a powerful foundation for these modern applications, but the complexity of cloud-native environments can bring new challenges: failed migrations, blind spots in monitoring, slow incident response, security risks, and escalating costs. Enter Datadog, the industry-leading observability and security platform that helps teams build, migrate, and modernize applications on Google Cloud.

Monitor Google Cloud, Hybrid, and Multi-Cloud Environments End to End

Organizations running multi-project Google Cloud, hybrid, and multi-cloud environments often struggle to get complete visibility across their environment and end up accepting either blind spots or tool sprawl. Datadog solves this with unified, end-to-end monitoring for Google Cloud, other public clouds, and on-prem infrastructure through its open source Agent, 1,000+ vendor-backed integrations (including 35+ for Google Cloud), and OpenTelemetry support. This data powers Datadog’s native support for Google Cloud services like GCE, GKE, Cloud Run, Vertex AI, BigQuery, and Cloud SQL across its extensive product portfolio, giving teams the tools they need to troubleshoot, optimize, and secure their entire stack in one place.

A billion-dollar travel technology provider consolidated five monitoring tools into one with Datadog, cutting observability spend by $5.2M.

De-Risk and Accelerate Migrations to Google Cloud

Google Cloud offers significant benefits, but organizations migrating to the cloud often find themselves hindered by unforeseen errors, security issues, cost overruns, and an inability to prove migration success, leading to significant delays and, in some cases, abandoned migrations. Datadog gives organizations the tools they need at each stage of cloud migration to ensure migrations run smoothly and meet expectations. Compare performance of pre- and post-migration workloads, identify previously unknown dependencies during planning, and manage security, cost, and compliance throughout the migration process to stay aligned with organizational requirements.

A developer tools platform that helps improve applications for billions used Datadog to migrate to Google Cloud without disrupting its internal teams or its end customers.

Troubleshoot and Resolve Issues Faster

Datadog simplifies incident response with extensive, correlated telemetry spanning infrastructure, applications, databases, frontend, LLMs, and more. This comprehensive context accelerates investigations, reduces mean time to resolution, and improves reliability. Fifteen months of high-resolution metrics mean teams have the data they need to perform seasonal benchmarking and historical analysis.

When incidents occur, On-Call pages the teams that need to begin troubleshooting, and configurable alerting criteria reduce alert fatigue and ensure teams are only notified when needed. In-platform Incident Response tools simplify investigations, while automation tools detect anomalous behavior, perform root cause analysis, and automate remediation.

project44 used Datadog to cut the number of incidents in its Google Cloud environment and reduce MTTR by ~60%.

Monitor network throughput and CPU utilization for all hosts across availability zones.

Optimize Performance and Costs

Cloud migrations and replatforming often lead to rising costs without clear accountability. Datadog correlates cost and performance data together in one platform, giving teams visibility into cost drivers and optimization opportunities. Cloud Cost Management (CCM) uses this data to help teams make informed cost optimization decisions that don’t unduly impact performance, while Kubernetes Autoscaling allows teams to act on optimization recommendations directly within Datadog. Additional capabilities like Continuous Profiler, Database Monitoring, and Data Observability identify inefficient code, SQL queries, and BigQuery queries for further performance and cost optimization.

Datadog also gives teams the tools to optimize their observability costs with flexible data processing options like Flex Logs and the ability to analyze Datadog costs for free in CCM. A Private Service Connect integration and the use of Google’s Active Metrics APIs (which have reduced API costs associated with the Google Cloud integration by ~75%) further optimize costs for Google Cloud customers.

Salling Group used Datadog Cloud Cost Management to save $250k+ in annual cloud spend across its multi-cloud infrastructure (including Google Cloud).

Secure Your Full Stack and Manage Compliance

Tool sprawl, alert fatigue, and unclear ownership across DevSecOps teams limit organizations’ ability to secure their environments effectively. Datadog brings security and observability together in one platform, enabling teams to quickly detect, prioritize, and resolve security and compliance issues across Google Cloud, hybrid, and multi-cloud environments. Users can proactively detect security issues from code to cloud, and investigate and respond to active threats with enriched security findings from Datadog and Google Cloud via integrations with Security Command Center and Cloud Armor. Datadog is also ready to meet enterprise-grade compliance requirements, maintaining compliance with key standards, offering international hosting options, and giving teams the tools to manage Datadog access and track usage.

A $100M HCIT company uses Datadog Cloud SIEM to detect and prioritize threats in Google Cloud with out-of-the-box detection rules and visualizations.

Operationalize Observability and Security at Scale

As organizations grow, manual work can become untenable and best practices harder to maintain, which risks slowing teams down and yielding lower quality user experiences. Datadog scales alongside its users, giving them resources to reduce manual burden, complete their work faster, and standardize best practices at scale. Vendor-backed integrations, dashboards, and monitors simplify setup and maintenance, minimizing time spent on tool upkeep. The Internal Developer Portal captures runbooks, scorecards, and SLOs, enabling users to onboard new developers faster and maintain consistent best practices. And Workflow Automation and Bits AI reduce investigation overhead by codifying playbooks, automating repetitive tasks, and providing root cause analysis, freeing teams to spend time elsewhere.

Forbes set up Datadog 4x faster than its previous tool using vendor-backed integrations, helping it manage the demand of millions of readers, cut GKE spend by 33%, and improve homepage load time by 37%.

Deploy Gen AI Capabilities with Confidence

As organizations bring AI applications into production, they face new risks around performance, cost, and reliability. With Datadog, users can track the cost, latency, and quality of Gemini models and LLMs deployed on Vertex AI with LLM Observability, and experiment with prompts and models to optimize the performance of their LLM applications. GPU Monitoring provides insight into GPU utilization and bottlenecks, and 50+ vendor-backed AI/ML integrations (including Vertex AI and Cloud TPU) round out Datadog’s comprehensive visibility across the AI stack, helping teams identify and resolve issues wherever they are.

An international home decor retailer uses Datadog LLM Observability to monitor Gemini-powered AI search and chatbot capabilities.
