Lack of visibility limits scalability
Tyme, a Singapore-based multi-country digital banking group, focuses on serving consumers in emerging markets. TymeX, the product and technology development hub of Tyme, offers a scalable and highly replicable model for designing, building, and commercializing digital banks. In just a few years, TymeX has grown to support 14 million customers in South Africa and the Philippines, with plans to roll out its services in additional countries. 
To scale effectively, Minh Le, general director of TymeX, needed a reliable way to maintain system performance and availability for the company’s rapidly growing customer base and globally dispersed engineering teams. “Prior to moving to Datadog, we hit a snag of issues impacting bank availability,” says Le. “If your bank is offline, your transactions are offline, which affects our customers.”
TymeX has a complex, distributed banking system that integrates with several third-party vendors, including banking institutions and other software providers. Because some vendors are “black boxes” that do not provide access to their log data, Le’s team faced gaps in visibility and needed support from vendors when pinpointing the root cause of issues and remediating them. 
Additionally, TymeX’s platform team had to constantly context switch to investigate incidents. Hai Bui, engineering manager at TymeX, says that remediating issues took hours of manual work across his many teams and vendors. Fragmented views, lack of context, and siloed visibility often allowed failed card transactions to persist. For example, the admin of one team would pass a log ID to another team so that the data could be correlated manually. This led to substantial troubleshooting delays, poor customer experiences, and a significant opportunity cost for each failed transaction, a cost that adds up to $120 billion globally each year.
Delayed troubleshooting also carried a compliance risk. Card transactions rely on ISO 8583, an international standard for the messages exchanged in card transactions with networks like Visa and Mastercard. If an end-to-end transaction took longer than six seconds, Tyme breached industry regulations and risked paying a fine.
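To make the correlation problem concrete, here is a minimal Python sketch of the work the teams were doing by hand: grouping log events from different services by transaction ID and flagging any transaction whose end-to-end time exceeds the six-second limit. The field names (`transaction_id`, `service`, `timestamp`), the sample events, and the function name are assumptions for illustration only. This is not TymeX's code or a Datadog API.

```python
from collections import defaultdict
from datetime import datetime

# Six-second end-to-end limit described in the case study.
LATENCY_BUDGET_SECONDS = 6.0


def find_slow_transactions(log_events, budget=LATENCY_BUDGET_SECONDS):
    """Group events by transaction ID and return {id: elapsed_seconds}
    for transactions whose first-to-last event span exceeds the budget."""
    timestamps = defaultdict(list)
    for event in log_events:
        parsed = datetime.fromisoformat(event["timestamp"])
        timestamps[event["transaction_id"]].append(parsed)

    slow = {}
    for txn_id, times in timestamps.items():
        elapsed = (max(times) - min(times)).total_seconds()
        if elapsed > budget:
            slow[txn_id] = elapsed
    return slow


# Hypothetical events emitted by two different services.
events = [
    {"transaction_id": "txn-001", "service": "gateway",
     "timestamp": "2024-01-01T10:00:00.000"},
    {"transaction_id": "txn-001", "service": "card-vendor",
     "timestamp": "2024-01-01T10:00:07.500"},
    {"transaction_id": "txn-002", "service": "gateway",
     "timestamp": "2024-01-01T10:00:01.000"},
    {"transaction_id": "txn-002", "service": "card-vendor",
     "timestamp": "2024-01-01T10:00:02.200"},
]

slow_transactions = find_slow_transactions(events)
print(slow_transactions)
```

When each team holds only its own slice of the logs, even this simple join requires a manual handoff, which is where the hours of delay came from.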
Unlocking scalability with Datadog
To continue expanding into new markets and meet these challenges, Le needed a scalable, unified observability solution that would help his teams quickly remediate failed transactions and that his growing engineering teams could easily adopt. Datadog was the answer.
Bui and his platform team used templates and automation to quickly set up monitoring with the help of Datadog, enabling fast operating procedures for any new country or service expansion. Advanced querying and analysis capabilities within Datadog Log Management helped them scale while maintaining the performance of Tyme’s backend systems. With Datadog, Tyme’s DevOps teams could further drill into log patterns using Pattern Inspector, which allowed them to see the distribution of values within each pattern associated with the transaction ID.
For extremely time-sensitive incidents, Tyme can also quickly troubleshoot with one click using Log Anomaly Detection powered by Watchdog AI, which automatically detects changes in log patterns.
In addition to correlating logs, traces, and metrics, Bui and his team also benefit from the intuitive navigation and advanced querying capabilities of Datadog without having to be log experts. With one click in Datadog Log Explorer, Bui can see groupings of log events based on their transaction ID and drill into more detail. “I really like the Patterns feature—which allows me to see groups of log events based on similar messages—because it lets me prioritize and prevent potential issues in the future,” says Bui.
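The idea behind grouping log events by similar messages can be illustrated with a short Python sketch. It replaces the variable parts of each message (here, just runs of digits) with a placeholder, so that messages differing only in those values collapse into one pattern. This is a deliberately simplified assumption for illustration, not Datadog's actual pattern algorithm, and the sample messages are hypothetical.

```python
import re
from collections import Counter


def to_pattern(message):
    """Replace runs of digits with a placeholder so that messages that
    differ only in numeric values map to the same pattern string."""
    return re.sub(r"\d+", "<num>", message)


def group_by_pattern(messages):
    """Count how many messages fall under each pattern."""
    return Counter(to_pattern(m) for m in messages)


# Hypothetical log messages.
messages = [
    "card authorization failed for transaction 1001 after 6200 ms",
    "card authorization failed for transaction 1002 after 7100 ms",
    "vendor timeout on request 42",
]

patterns = group_by_pattern(messages)
for pattern, count in patterns.most_common():
    print(count, pattern)
```

Sorting patterns by count surfaces the most frequent kinds of message first, which is what makes this kind of view useful for prioritizing issues.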
Furthermore, Bui and his team have built custom dashboards broken out by error type, transaction ID, and transaction type, which they use not only to monitor the health of their system but also to proactively identify and prioritize issues.
Accelerating digital payments in a global market
Today, Datadog helps Bui and his teams easily integrate and collect data from third-party vendors, giving them better visibility into different parts of TymeX’s distributed system. For example, some vendors supply unformatted raw data, which the Datadog Agent automatically collects and converts into logs. With a single unified platform for faster troubleshooting, TymeX has reduced its mean time to identify (MTTI) from one hour to one minute.
Bui can now easily pivot from logs to traces within Datadog based solely on the transaction ID. While it previously took hours of manual work to understand the root cause of failed transactions, Bui and his team can now troubleshoot in minutes, resulting in better systems performance and a 35 percent drop in customer service call volume.
Datadog has now been adopted by over 400 engineers across multiple teams within TymeX, enabling the organization to expand its footprint into more countries and grow the company while building strong customer loyalty. “With Datadog’s help, we’re saving time by sharing our learnings with other teams,” says Le. “We’re rolling out to more countries, and it’s just a lift and shift with minimal change.”