Monitoring Kubernetes Networking with eBPF

Posted April 15, 2020


Anyone who has operated a Kubernetes cluster knows there are many places where things can and will go wrong. While a lot of attention is paid to failures between services or within nodes, the network, one of the most critical ingredients in a Kubernetes cluster, is also one of the most opaque and frequently overlooked. An unreliable network or DNS subsystem can manifest as a wide range of ostensibly random application-level failures that affect numerous services. Spikes in network latency can slow down API calls, and unintended traffic patterns can drive up operational costs.

SREs operating Kubernetes clusters need answers to two questions: Is something wrong in the cluster’s network? Which services are affected by it?

Since they capture application-layer data, the existing approaches to observability (metrics, logs, and traces) really don’t give us a sufficient set of tools to understand the network. Luckily, the Linux kernel has a ton of per-container network data, and with eBPF, the extended Berkeley Packet Filter, it’s possible to efficiently collect and analyze it. For those unfamiliar with it, eBPF offers an interface to run native, just-in-time-compiled code that can access a subset of kernel functions and memory, with the kernel providing a secure execution environment. With well-written eBPF programs, it’s possible to efficiently collect network information on every connection from every container and join it with data from Kubernetes. Not only does this create a picture of how services and the network beneath them behave, it can also be done completely transparently. This means it requires no changes to the container runtimes or application code, no new CNI plugin, and no additional sidecar proxies. It also means it is a true depiction of the behavior seen by your application rather than one measured from an external location.
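To make this concrete, here is a minimal sketch of the kind of eBPF program involved. It assumes the BCC Python toolkit (just one of several ways to write eBPF programs, and not necessarily the one described in this post): it attaches a kprobe to the kernel’s tcp_v4_connect function and counts outbound TCP connections per process, the sort of raw kernel data that can later be joined with Kubernetes metadata.

```python
#!/usr/bin/env python3
# Minimal sketch (assumes BCC is installed): count outbound IPv4 TCP connections
# per process using a kprobe. Illustrative only, not the tooling from this post.
import time
from bcc import BPF

bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>

BPF_HASH(connect_count, u32, u64);   // PID -> number of connect() attempts

int trace_connect(struct pt_regs *ctx, struct sock *sk) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    connect_count.increment(pid);
    return 0;
}
"""

b = BPF(text=bpf_text)
# tcp_v4_connect is the kernel entry point for outbound IPv4 TCP connections.
b.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect")

print("Counting outbound TCP connects per PID... Ctrl-C to stop.")
try:
    while True:
        time.sleep(5)
        for pid, count in sorted(b["connect_count"].items(),
                                 key=lambda kv: -kv[1].value):
            print(f"pid={pid.value:<8} connects={count.value}")
        print("---")
except KeyboardInterrupt:
    pass
```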

Let’s take a look at how this works. From Linux, it’s possible to collect data on every socket created by every pod and associate it with a process and IP address (and sometimes a DNS address as well). After accounting for some complexities such as Network Address Translation (used between services), this information can be joined with Kubernetes metadata about pods, images, and tags, as well as location data such as zone (or rack). In the example below, a view of traffic between the “frontend” and “checkout” services is created by merging Kubernetes metadata with data from the Linux network stack about a connection between these pods.

[Figure: traffic between the “frontend” and “checkout” services, combining Kubernetes metadata with Linux connection data]
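As a rough, hypothetical illustration of that join, the sketch below uses the official Kubernetes Python client to build a pod-IP-to-metadata index and attach it to a raw connection record. The record format and values are made up, and a real collector would also handle NAT and look up zone labels from nodes.

```python
#!/usr/bin/env python3
# Hypothetical sketch: index pods by IP so per-connection data from the kernel
# can be labeled with namespace, pod name, and node. Assumes the official
# `kubernetes` Python client and access to a cluster.
from kubernetes import client, config

def build_pod_index():
    config.load_kube_config()   # use config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    index = {}
    for pod in v1.list_pod_for_all_namespaces().items:
        if pod.status.pod_ip:
            index[pod.status.pod_ip] = {
                "namespace": pod.metadata.namespace,
                "pod": pod.metadata.name,
                "labels": pod.metadata.labels or {},
                # Zone would come from the node's topology.kubernetes.io/zone label.
                "node": pod.spec.node_name,
            }
    return index

def label_connection(conn, pod_index):
    """Attach Kubernetes metadata to a raw (src_ip, dst_ip, bytes) record."""
    return {
        **conn,
        "src": pod_index.get(conn["src_ip"], {"pod": "unknown"}),
        "dst": pod_index.get(conn["dst_ip"], {"pod": "unknown"}),
    }

if __name__ == "__main__":
    pods = build_pod_index()
    # Example record as it might come out of an eBPF collector (values invented).
    sample = {"src_ip": "10.1.2.3", "dst_ip": "10.1.4.5", "bytes": 12345}
    print(label_connection(sample, pods))
```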

So, where can this approach help you improve network observability in a Kubernetes environment?

Identifying network reliability and latency problems affecting services

One of the first questions that arises when troubleshooting elevated API error rates or performance issues is whether the network could be at fault. If the network is the problem after all, why send the development team on a wild goose chase looking for a problem in the code that doesn’t exist? Critically, by measuring network behavior from within the operating system with eBPF, it’s possible to tell how individual services and containers are impacted rather than just getting coarse-grained metrics per Kubernetes node. [Demo]
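As one hedged example of what such a measurement can look like, the BCC-based sketch below counts TCP retransmissions per destination address by attaching a kprobe to tcp_retransmit_skb. Retransmits are only one reliability signal; a production collector would also track latency and map addresses back to pods and services.

```python
#!/usr/bin/env python3
# Minimal sketch (assumes BCC): count TCP retransmissions per destination IPv4
# address, a rough signal of network unreliability as seen from inside the node.
import socket
import struct
import time
from bcc import BPF

bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>

BPF_HASH(retrans, u32, u64);   // destination IPv4 address -> retransmit count

int trace_retransmit(struct pt_regs *ctx, struct sock *sk) {
    u32 daddr = sk->__sk_common.skc_daddr;
    retrans.increment(daddr);
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="tcp_retransmit_skb", fn_name="trace_retransmit")

print("Counting TCP retransmits per destination... Ctrl-C to stop.")
try:
    while True:
        time.sleep(10)
        for daddr, count in b["retrans"].items():
            ip = socket.inet_ntoa(struct.pack("I", daddr.value))
            print(f"{ip:<16} retransmits={count.value}")
        print("---")
except KeyboardInterrupt:
    pass
```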

Measuring network bandwidth costs

Depending on how pods are distributed across Kubernetes nodes, services may become large and expensive consumers of network bandwidth. This can be a significant problem in cloud environments, where bandwidth directly translates to network costs. In AWS, for example, data transfer across availability zones costs $0.01 / GB (charged both in and out) while transfer out to the Internet costs $0.09 / GB. These prices vary across regions and cloud providers but generally follow this pattern. Only by knowing which services generate the most traffic is it possible to optimize the cluster. [Demo]
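As a back-of-the-envelope illustration (the per-service traffic figures below are invented; the rates are the AWS examples above), here is how per-service byte counts translate into a monthly cross-AZ bill:

```python
# Rough cost sketch: monthly cross-AZ transfer cost from per-service traffic.
# Rates are the AWS examples quoted above; the traffic numbers are made up.
CROSS_AZ_PER_GB = 0.01       # charged in each direction, so 2x per transferred GB
INTERNET_EGRESS_PER_GB = 0.09

# Hypothetical monthly cross-AZ traffic per service path, in GB.
monthly_cross_az_gb = {
    "frontend -> checkout": 5_000,
    "checkout -> payments": 1_200,
}

for path, gb in monthly_cross_az_gb.items():
    # Both sides of a cross-AZ transfer are billed, hence the factor of 2.
    cost = gb * CROSS_AZ_PER_GB * 2
    print(f"{path:<24} {gb:>7} GB/month  ~${cost:,.2f}/month")
```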

Assessing the reliability of cluster DNS

In a Kubernetes cluster, DNS is an absolutely critical core service, and it has numerous points of failure (application libraries, kube-proxy, CoreDNS, upstream DNS). DNS features prominently in many Kubernetes failure stories; most recently, the folks at Robinhood explicitly called out “failure of our DNS system” in their postmortem. It’s critical to measure DNS reliability from the perspective of your applications so you know exactly where and when problems are occurring. With eBPF, it’s possible to gather this data from Linux and know which pods and services are affected by DNS errors or increased DNS latency. [Demo]
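As an illustrative sketch (again assuming the BCC toolkit), the snippet below times libc getaddrinfo() calls with uprobes, much like BCC’s gethostlatency tool, to get a per-process view of DNS lookup latency. Applications that bypass libc for DNS resolution (for example, some Go programs) would not be captured this way.

```python
#!/usr/bin/env python3
# Minimal sketch (assumes BCC): measure DNS lookup latency per process by timing
# libc getaddrinfo() with a uprobe/uretprobe pair. Illustrative only.
from bcc import BPF

bpf_text = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);   // PID -> timestamp at getaddrinfo() entry

int on_enter(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 ts = bpf_ktime_get_ns();
    start.update(&pid, &ts);
    return 0;
}

int on_return(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 *tsp = start.lookup(&pid);
    if (tsp) {
        u64 delta_us = (bpf_ktime_get_ns() - *tsp) / 1000;
        bpf_trace_printk("dns lookup pid=%d latency=%llu us\\n", pid, delta_us);
        start.delete(&pid);
    }
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_uprobe(name="c", sym="getaddrinfo", fn_name="on_enter")
b.attach_uretprobe(name="c", sym="getaddrinfo", fn_name="on_return")

print("Timing getaddrinfo() calls... Ctrl-C to stop.")
try:
    b.trace_print()
except KeyboardInterrupt:
    pass
```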

Hopefully, these examples give you some perspective on why it’s so critical to improve network visibility into Kubernetes clusters and how an eBPF-based approach can help. If you are interested in learning more, we’d love to hear from you.

[And credits to my co-founder Jonathan Perry, whose KubeCon talk inspired this post.]