[Tech Blog] History and Development of eBPF: The Story So Far
Blog | Date: 2024-03-19

The Extended Berkeley Packet Filter (eBPF) Story So Far: A Remarkable Evolution into a Versatile Workhorse

Overview🍃

Since we started working with groundcover last year, we have been intrigued by just how it is able to gather such deep insights into application behaviour with such low overhead. This is something we’ve covered previously, but we didn’t get the chance to explore exactly how eBPF works or where it came from. Today’s blog post chronicles our journey into understanding how eBPF works and how its breakthroughs came to be.

What is eBPF?

eBPF has emerged as a powerful tool for low-overhead, deep observability in containerised and cloud-native environments. More recently, it has become essential to Kubernetes networking and security, reaching parts of the stack that no other mechanism has managed before. Read on to find out how this is possible and how to leverage its remarkable versatility.

The origins of eBPF

Comic by Philipp Meier and Thomas Graf. Source: ebpf.io

Picture this: it’s the late 2000s, and traffic on the web has exploded, driven by developing nations coming online and the nascent smartphone era. A new generation of cloud-native Web 2.0 companies appeared and took advantage of this period of unprecedented connectivity, and in doing so drove a corresponding explosion in data centre and cloud computing demand. Coinciding with these heady early days of the cloud was the mass adoption of Linux-based virtual machines by enterprises, as Linux became supported and stable enough thanks, in part, to the efforts of companies such as Red Hat and VMware. The result: a need to network together a huge number of virtual machines in sprawling data centres, where the tangle of Ethernet-based hardware switching had become untenable.

This led to the adoption of Software Defined Networking (SDN). Whilst not the main focus here, SDN was the precursor to the uses of eBPF we see today. Like many of the transitions the cloud has triggered, it’s a prime example of a hardware problem begetting a software solution. Essentially, how do you move packets around, at speed, between thousands of VMs running on hundreds of hosts in data centres?

The Berkeley Packet Filter (BPF) was a key enabler of SDN. BPF was developed in the early 1990s as a small virtual machine inside the kernel, designed to execute user-defined packet filtering programs. If you have ever used tcpdump on a Linux machine, the filter expression you supply is compiled into BPF bytecode and executed in kernel space; iptables rules are likewise handed down into the kernel’s packet-processing machinery:

iptables -A INPUT -s 192.168.1.0/24 -j DROP     # drop all inbound traffic from the 192.168.1.0/24 subnet

tcpdump -i eth0 "tcp port 80"                   # capture TCP traffic to or from port 80 on eth0
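
To see what that filter becomes, tcpdump can print the classic BPF program it compiles from the expression. This is a quick way to peek at the bytecode mentioned above; the exact instructions printed vary with the interface’s link type and libpcap version:

tcpdump -d "tcp port 80"        # dump the compiled (classic) BPF filter as readable instructions

tcpdump -ddd "tcp port 80"      # dump the same program as raw decimal opcodes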

It makes sense for these to execute at the kernel level rather than the application level, because that is much closer to where the packets they act on actually flow.

Manipulating networking from the command line like this has long been a mainstay of Linux operators in fits of debugging frustration. However, the classic BPF instruction set was always limited to basic packet filtering and inspection.

Fast forward to the mid-2010s, and there is a building tension between the deliberate move to make applications independent from each other (using containers and pods) and the need for a common platform to instrument them all. This is where the work of Alexei Starovoitov shines, with the idea of expanding the capabilities of BPF significantly. eBPF introduced a new instruction set to the Linux kernel, supported a wider range of data structures, and allowed programs to be attached to kernel events and hooks. These enhancements laid the foundation for eBPF’s application to observability. In effect, it enabled programmability in a space that was previously an immutable fortress.

eBPF today

The advent of eBPF has spawned multiple companies and open-source projects intent on bringing its benefits to the Linux-based world at large:

  • groundcover: a cloud-native eBPF observability platform with a strong focus on usability and clear presentation. Crucially, it plugs gaps present when using eBPF alone, drawing on its comprehensive overview to provide distributed tracing.
  • OpsCruise (now part of Virtana): another innovative observability solution using eBPF and OpenTelemetry together, with a key differentiator being automatic dependency discovery and “time travel,” bridging the causal link between actual changes to the deployment and the resultant logs, metrics, and traces.
  • Cilium: an open-source project that leverages eBPF to provide a Container Network Interface (CNI) plugin for Kubernetes, covering networking, security, and observability.
  • Isovalent: a commercial distribution of Cilium with additional features, support, and integration with other cloud-native technologies.
  • Katran: Facebook’s open-source layer 4 load balancer that utilises eBPF to achieve extremely low latency, processing packets in as little as 50 nanoseconds.

eBPF has found novel applications beyond just observability. Take network security, where eBPF is being leveraged to create powerful firewalls and intrusion detection systems. By attaching eBPF programs to network hooks, administrators can define complex filtering rules and inspect packets at a granular level. This enables real-time detection and mitigation of threats, such as DDoS attacks, malware propagation, and unauthorised access attempts. eBPF’s ability to dynamically update filtering rules without modifying the kernel itself enables adaptive security policies that can respond to emerging threats promptly. Tetragon is a key player in this area.
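
To give a flavour of what such a filter looks like, here is a minimal XDP sketch that drops IPv4 packets from the same 192.168.1.0/24 subnet used in the iptables example earlier. It is illustrative only (the program name xdp_drop_subnet is our own) and is not how any of the products mentioned above implement their filtering:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

char LICENSE[] SEC("license") = "GPL";

/* Drop IPv4 packets whose source address is in 192.168.1.0/24; pass everything else. */
SEC("xdp")
int xdp_drop_subnet(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* Bounds checks are mandatory: the verifier rejects any read past the packet end. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(0x0800))                     /* not IPv4 (ETH_P_IP) */
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    if ((bpf_ntohl(ip->saddr) & 0xFFFFFF00) == 0xC0A80100)     /* 192.168.1.0/24 */
        return XDP_DROP;

    return XDP_PASS;
}

Attached to a network interface (for example with iproute2’s ip link command), a program like this runs before the kernel even allocates a socket buffer for the packet, which is where much of XDP’s speed comes from.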

How eBPF works

One of the key mechanisms used by eBPF for observability is kernel probes (kprobes). Kprobes allow eBPF programs to be attached to specific kernel functions or instructions. When the designated kernel code is executed, the eBPF program is triggered, allowing it to collect relevant data and metrics. Consider the following C snippet:

SEC("kprobe/do_unlinkat")
int BPF_KPROBE(do_unlinkat, int dfd, struct filename *name)
{
    pid_t pid;
    const char *filename;

    pid = bpf_get_current_pid_tgid() >> 32;
    filename = BPF_CORE_READ(name, name);
    bpf_printk("KPROBE ENTRY pid = %d, filename = %s\n", pid, filename);
    return 0;
}

This code (adapted from the eunomia-bpf project, 2023) monitors the unlink system call in the Linux kernel, which is used to delete a file. The eBPF program places a hook at the entry point of the do_unlinkat function using a kprobe.

The BPF_KPROBE(do_unlinkat) macro defines the handler that gets triggered when the do_unlinkat function is entered. It receives two parameters: dfd (a directory file descriptor) and name (a pointer to the kernel’s filename structure). In this handler, we retrieve the PID (process identifier) of the current process and then read the filename being deleted. Finally, we use the bpf_printk helper to print the PID and filename to the kernel’s trace buffer.
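
Once the program is compiled and loaded (for example via a libbpf skeleton or the eunomia-bpf toolchain), the bpf_printk output can be watched from the kernel’s trace pipe while a file is deleted in another shell:

sudo cat /sys/kernel/debug/tracing/trace_pipe   # stream bpf_printk output (newer kernels also expose /sys/kernel/tracing/trace_pipe)

touch /tmp/demo && rm /tmp/demo                 # run in another terminal to trigger do_unlinkat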

In addition to probes, eBPF offers other powerful observability mechanisms. eBPF programs can be attached to network sockets, enabling detailed monitoring of network traffic and protocols. They can also be used to trace function calls and measure latency, allowing developers to identify performance bottlenecks and optimise their applications.

eBPF’s observability capabilities are further enhanced by its ability to store and aggregate collected data in efficient data structures called BPF maps. These maps act as a communication channel between eBPF programs running in the kernel and user-space tools. They allow collected metrics and events to be accessed and analysed in real time, providing valuable insights into system behaviour.
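
To make that concrete, here is a sketch of how the kprobe shown earlier could aggregate per-process deletion counts in a BPF map rather than printing each event. The program and map names (count_unlinkat, unlink_counts) are our own, and user space could read the map through libbpf lookups or bpftool’s map dump command:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

/* Hash map keyed by PID, counting unlink() calls per process.
 * This map is the kernel/user-space communication channel described above. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);
    __type(value, u64);
} unlink_counts SEC(".maps");

SEC("kprobe/do_unlinkat")
int BPF_KPROBE(count_unlinkat, int dfd, struct filename *name)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 one = 1;
    u64 *count = bpf_map_lookup_elem(&unlink_counts, &pid);

    if (count)
        __sync_fetch_and_add(count, 1);                            /* seen before: increment */
    else
        bpf_map_update_elem(&unlink_counts, &pid, &one, BPF_ANY);  /* first deletion for this PID */
    return 0;
}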

Conclusion

What began as a simple way to manipulate basic networking in Linux has found a new lease of life in the genius of eBPF. The magic of eBPF lies in its ability to extend the kernel’s capabilities and enable deep observability without introducing significant overhead.

As we have seen, eBPF’s observability mechanisms are highly technical and require a good understanding of kernel and application internals. However, the benefits they bring in visibility and actionable insights make the learning curve worthwhile. Thankfully, a new cadre of products – groundcover, OpsCruise, and Tetragon among them – is opening these benefits up to the wider world.

Written by Ollie Webster, Cloud Solutions Architect, MegazoneCloud Hong Kong