BPFILTER: the next-generation Linux firewall
The Linux community has a continuous drive to enhance the Linux kernel. In the area of network traffic filtering, we moved from ipchains to iptables. More recently we saw the introduction of nftables. Next in line is BPFILTER, part of the development work for the Linux 4.18 kernel.
What is BPFILTER?
BPFILTER is short for BPF-based packet filtering framework. In other words, it is a framework that does packet filtering and is based on BPF. BPF itself is an acronym for Berkeley Packet Filter, so packet filtering is clearly at the core of this feature.
To understand BPFILTER, we need to understand BPF first. The quick introduction to the technique is that it allows user space tools like tcpdump to filter traffic within the kernel. Let’s say that you want to see what traffic is received on port 80 (HTTP). We start the tcpdump tool and give it a port number.
tcpdump port 80
Now BPF in its turn will only return those packets that match this specified filter. Because it only needs to pass a limited subset of data, overhead is reduced and high performance is achieved.
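As a rough illustration of this filtering step, the sketch below interprets a tiny classic BPF program against raw packet bytes, entirely in user space. The opcode values are those of classic BPF; the program itself is hand-written and simplified (IPv4 and TCP only, no fragment handling) and is not the exact code tcpdump generates. The names `run_filter`, `make_packet`, and `PORT_80` are made up for this example. Running `tcpdump -d port 80` prints the real compiled filter.

```python
import struct

# Classic BPF opcodes used by this sketch.
LDH_ABS = 0x28   # A = 16-bit value at packet[k]
LDB_ABS = 0x30   # A = 8-bit value at packet[k]
LDXB_MSH = 0xB1  # X = 4 * (packet[k] & 0x0f), the IPv4 header length
LDH_IND = 0x48   # A = 16-bit value at packet[X + k]
JEQ_K = 0x15     # if A == k skip jt instructions, otherwise skip jf
RET_K = 0x06     # return k (0 = drop, otherwise number of bytes to keep)

# Simplified filter: Ethernet + IPv4 + TCP with source or destination port 80.
# Each entry is (opcode, jt, jf, k); the comment shows the instruction index.
PORT_80 = [
    (LDH_ABS, 0, 0, 12),     # 0: load EtherType
    (JEQ_K, 0, 8, 0x0800),   # 1: not IPv4 -> 10
    (LDB_ABS, 0, 0, 23),     # 2: load IP protocol
    (JEQ_K, 0, 6, 6),        # 3: not TCP -> 10
    (LDXB_MSH, 0, 0, 14),    # 4: X = IPv4 header length
    (LDH_IND, 0, 0, 14),     # 5: load source port
    (JEQ_K, 2, 0, 80),       # 6: match -> 9
    (LDH_IND, 0, 0, 16),     # 7: load destination port
    (JEQ_K, 0, 1, 80),       # 8: no match -> 10
    (RET_K, 0, 0, 262144),   # 9: accept, keep up to 262144 bytes
    (RET_K, 0, 0, 0),        # 10: reject
]


def run_filter(program, packet):
    """Run a classic-BPF-style program over packet bytes; return its verdict."""
    a = x = pc = 0
    while True:
        code, jt, jf, k = program[pc]
        pc += 1
        try:
            if code == LDH_ABS:
                (a,) = struct.unpack_from("!H", packet, k)
            elif code == LDB_ABS:
                a = packet[k]
            elif code == LDXB_MSH:
                x = 4 * (packet[k] & 0x0F)
            elif code == LDH_IND:
                (a,) = struct.unpack_from("!H", packet, x + k)
            elif code == JEQ_K:
                pc += jt if a == k else jf
            elif code == RET_K:
                return k
            else:
                raise ValueError(f"unsupported opcode {code:#x}")
        except (IndexError, struct.error):
            return 0  # a load outside the packet rejects the packet


def make_packet(src_port, dst_port, protocol=6):
    """Build a minimal Ethernet/IPv4/TCP frame with zeroed addresses."""
    ethernet = bytes(12) + struct.pack("!H", 0x0800)
    ipv4 = bytes([0x45]) + bytes(8) + bytes([protocol]) + bytes(10)
    tcp = struct.pack("!HH", src_port, dst_port) + bytes(16)
    return ethernet + ipv4 + tcp


if __name__ == "__main__":
    for ports in [(40000, 80), (80, 40000), (40000, 443)]:
        verdict = run_filter(PORT_80, make_packet(*ports))
        print(ports, "accepted" if verdict else "dropped")
```

Only packets for which the program returns a non-zero value would be handed to the user space tool; everything else is discarded before it is copied.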
So how does BPF work?
Instead of giving user space tools direct access to raw network traffic, BPF uses a pseudo-device that acts as a controlled staging area. If permitted, a tool such as tcpdump can retrieve the data that arrives in this staging area.
eBPF: Linux BPF implementation
With BPF originating from the BSD platform, it might not be surprising that Linux has a slightly different implementation. It uses eBPF, which stands for extended BPF. Since kernel 3.18, this implementation can also be used for non-networking activities like profiling, which is useful for taking measurements and debugging processes. The 3.19 kernel (2015) added support for attaching to sockets.
It is the Linux 4.x series that added interesting new features when it comes to network traffic filtering. For example, kernel version 4.1 (2015) provides ingress and egress filters. This allows us to influence incoming and outgoing traffic. Kernel 4.15 (2018) allows eBPF hooks for Linux Security Modules (LSM).
So in short, eBPF is multi-purpose and has become a powerful toolkit used by the Linux developers. No wonder that others have built great tooling around it for performance measurement and troubleshooting. A good example is the work of Brendan Gregg, who works for Netflix. Brendan has contributed a lot to BCC (BPF Compiler Collection), a toolkit that can retrieve data via eBPF. It helps to answer many questions, like:
- Which TCP connections are active?
- What are the latencies of requests to disk?
- Which MySQL queries are slower than the specified threshold?
- Which security capabilities are checked?
- What are the slowest EXT4 calls?
- Which NFS calls are made?
- And many more…
With this introduction to BPF and eBPF on Linux, we can see the potential it has for network traffic filtering. Let’s move on and dive into BPFILTER.
Current status of BPFILTER
The development is currently at an early stage. Much of the work is done by Alexei Starovoitov, Daniel Borkmann, and David S. Miller. They work on the network layer and maintain eBPF, so it is no surprise that they are closely involved with the work on BPFILTER. Some of the recent code can be found in Alexei's bpfilter branch.
Right now, BPFILTER works as follows: it converts netfilter rules used by iptables into BPF programs. These are small programs that can be attached to parts of the kernel, like the networking stack. The conversion itself is a so-called dynamic translation, also known as just-in-time (JIT) compilation or run-time compilation. This means it happens in user space and is executed when it is needed, instead of ahead of time.
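To make the idea of rule translation more concrete, here is a minimal sketch that turns one simplified rule into a listing of classic-BPF-style pseudo-assembly. This is an assumption-laden illustration only: the actual BPFILTER code emits eBPF programs and handles real iptables rule structures. The `Rule` class, the `translate` function, and the `next_rule` label are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical, heavily simplified stand-in for one iptables rule."""
    protocol: int  # IP protocol number, for example 6 for TCP
    dport: int     # destination port to match
    verdict: str   # "ACCEPT" or "DROP"


def translate(rule):
    """Translate a Rule into illustrative pseudo-assembly (not real eBPF).

    Every failed check jumps to the symbolic label next_rule, which stands
    for "this rule does not apply, continue with the following one".
    """
    return [
        "ldh [12]",                          # load EtherType
        "jne #0x800, next_rule",             # only IPv4 is handled here
        "ldb [23]",                          # load IP protocol
        f"jne #{rule.protocol}, next_rule",  # protocol must match the rule
        "ldxb 4*([14]&0xf)",                 # X = IPv4 header length
        "ldh [x+16]",                        # load destination port
        f"jne #{rule.dport}, next_rule",     # port must match the rule
        f"ret {rule.verdict}",               # all checks passed
    ]


if __name__ == "__main__":
    # Roughly what "-p tcp --dport 80 -j DROP" expresses.
    for line in translate(Rule(protocol=6, dport=80, verdict="DROP")):
        print(line)
```

The point of the sketch is the division of labor: the rule is translated once, when it is loaded, and the resulting program is what runs for every packet.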
Benefits of BPFILTER
Due to the JIT compilation, most conversion work happens in user space. This simplifies the work needed in the kernel and allows for easier management of the code. Other benefits that are to be expected include hardware offloading, easier migrations from existing netfilter rules, and better performance.