Mark Chodos, SRE at OpenX, noted that "once we completed the move we started working on making sure we had good operational visibility into the cloud environment and there was a lot of focus on optimization of workloads from a performance and cost perspective. One of the things that we certainly had in the on-premise infrastructure was significant instrumentation of the network traffic. However, in the public cloud with the network abstracted from us, that visibility went away and we need to find ways to regain it."
Network reliability, network performance, and bandwidth costs emerged as business-critical issues in the new environment. Mark noted that "if there is a latency increase or connection failures or retransmissions to an advertising partner, it can really impact revenue. It's critical that we can troubleshoot issues like this." Beyond debugging problems, OpenX also discovered that one of its largest costs was network egress to different advertising partners. The team needed a means of measuring, analyzing, and optimizing that cost to improve profitability.
Flowmill integrated seamlessly into OpenX's Kubernetes environment. Mark noted that the setup process has "been incredibly easy. Download a helm chart and deploy it. In a matter of minutes, we had Flowmill up and running." The results were equally impressive. Shortly after deployment, Mark and his team identified "misconfigurations causing a high rate of DNS errors between services, a misconfiguration we traced back to our transition to GCP and did not have visibility into."
OpenX used Flowmill to "drill down to connections between our service and an advertising partner to look at what is going on, including connection failures, packet loss, changes in round trip time." Flowmill also allowed OpenX "to analyze network traffic on a per demand-side partner [DSP] basis to compare what we are spending with what we are making on the relationship."
Today, Flowmill monitors well over 1,000 cloud instances in six public cloud regions. The SRE team and support organization use it to troubleshoot issues, and the finance team uses it to analyze cloud computing costs and advertising partner profitability.