{
	"id": "50342e73-9b37-4336-bfad-81acd9fdc5ce",
	"created_at": "2026-04-06T00:12:26.28191Z",
	"updated_at": "2026-04-10T13:11:50.362492Z",
	"deleted_at": null,
	"sha1_hash": "2e83b260a6d2a7500d3b670a03cac212e27ce6b1",
	"title": "BPF Memory Forensics with Volatility 3",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 2368597,
	"plain_text": "BPF Memory Forensics with Volatility 3\r\nPublished: 2023-12-21 · Archived: 2026-04-02 12:29:23 UTC\r\nBPF Memory Forensics with Volatility 3\r\nIntroduction and Motivation\r\nHave you ever wondered how an eBPF rootkit looks like? Well, here’s one, have a good look:\r\nUpon receiving a command and control (C2) request, this specimen can execute arbitrary commands on the\r\ninfected machine, exfiltrate sensitive files, perform passive and active network discovery scans (like nmap ), or\r\nprovide a privilege escalation backdoor to a local shell. Of course, it’s also trying its best to hide itself from\r\nsystem administrators hunting it with different command line tools such as ps , lsof , tcpdump an others or\r\neven try tools like rkhunter or chkrootkit .\r\nWell, you say, rootkits have been doing that for more than 20 years now, so what’s the news here? The news aren’t\r\nthat much the features, but rather how they are implemented. Everything is realized using a relatively new and\r\nrapidly evolving kernel feature: eBPF. Even though it has been in the kernel for almost 10 years now, we’re\r\nhttps://lolcads.github.io/posts/2023/12/bpf_memory_forensics_with_volatility3/\r\nPage 1 of 25\n\nregularly surprised by how many experienced Linux professionals are still unaware of its existence, not even to\r\nmention its potential for abuse.\r\nThe above picture was generated from the memory image of a system infected with ebpfkit , an open-source\r\nPoC rootkit from 2021, using a plugin for the Volatility 3 memory forensics framework. In this blog post, we will\r\npresent a total of seven plugins that, taken together, facilitate an in depth analysis of the state of the BPF\r\nsubsystem.\r\nWe structured this post as follows: The next section provides an introduction to the BPF subsystem, while the third\r\nsection highlights its potential for (ab)use by malware. 
In section four, we will introduce seven Volatility 3 plugins\r\nthat facilitate the examination of BPF malware. Section five presents a case study, followed by a section\r\ndescribing our testing and evaluation of the plugins on various Linux distributions. In the last section, we conclude\r\nwith a discussion of the steps that are necessary to integrate our work into the upstream Volatility project, other\r\nchallenges we encountered, and open research questions.\r\nNote: The words “eBPF” and “BPF” will be used interchangeably throughout this post.\r\nThe BPF Subsystem\r\nBefore delving into the complexities of memory forensics, it is necessary to establish some basics about the BPF\r\nsubsystem. Readers who are already familiar with the topic can safely skip this section.\r\nTo us, BPF is first of all an instruction set architecture (ISA). It has ten general purpose registers, which are 64\r\nbits wide, and it offers all of the basic operations that you would expect a modern ISA to have. Its creator, Alexei\r\nStarovoitov, once described it as a kind of simplified x86-64 and would probably never have imagined that the\r\nISA he cooked up back in 2014 would one day enter a standardization process at the IETF. The interested reader can\r\nfind the current proposed standard here. Of course, there are all the other things that you would expect to come\r\nwith an ISA, like an ABI that defines the calling convention, and a binary encoding that maps instructions to\r\nsequences of eight or sixteen bytes.\r\nThe BPF ISA is used as a compilation target (currently by clang - gcc support is on the way) for programs written\r\nin high-level languages (currently C and Rust). However, it is not meant to be implemented in hardware.\r\nTherefore, it is conceptually more similar to WebAssembly or Java Bytecode than x86-64 or arm64, i.e., BPF\r\nprograms are meant to be executed by a runtime that implements the BPF virtual machine (VM). 
Several BPF\r\nruntimes exist, but the “reference implementation” is in the Linux kernel.\r\nRuntimes are, of course, free to choose how they implement the BPF VM. The instruction set was defined in a\r\nway that makes it easy to implement a one-to-one just-in-time (JIT) compiler for many CPU architectures. In fact,\r\nin the Linux kernel, even non-mainstream architectures like powerpc, sparc or s390 have BPF JITs. However, the\r\nkernel also has an interpreter to run BPF programs on architectures that do not yet support JIT compilation.\r\nAside: The BPF platform is what some call a “verified target”. This means that in order for a program to be valid\r\nit has to have some “non-local” properties. Among them: programs must not contain unbounded loops, registers and\r\nmemory may only be read after they have been written to, the stack depth may not exceed a hard limit, and many\r\nmore. The interested reader can find a more exhaustive description here. In practice, runtime implementations\r\ninclude an up-front static verification stage and refuse to execute programs that cannot be proven to meet these\r\nrequirements (some runtime checks may be inserted to account for the known shortcomings of static analysis).\r\nThis static verification approach is at the heart of BPF’s sandboxing model for untrusted code.\r\nRoughly speaking, the BPF subsystem includes, besides the implementation of the BPF VM, a user and kernel\r\nspace interface for managing the program life cycle as well as infrastructure for transitioning the kernel control\r\nflow in and out of programs running inside the VM. Other subsystems can be made “programmable” by\r\nintegrating the BPF VM in places where they want to allow the calling of user-defined functions, e.g., for decision\r\nmaking based on their return value. 
The networking subsystem, for example, supports handing all incoming and\r\noutgoing packets on an interface to a BPF program. Those programs can freely rewrite the packet buffer or even\r\ndecide to drop the packet altogether. Another example is the tracing subsystem that supports transitioning control\r\ninto BPF programs at essentially any instruction via one of the various ways it has to hook into the kernel and user\r\nspace execution. The final example here is the Linux Security Module (LSM) subsystem that supports calling out\r\nto BPF programs at any of its security hooks placed at handpicked choke points in the kernel. There are many\r\nmore examples of BPF usage in the kernel and even more in academic research papers and patches on the mailing\r\nlist, but we guess we have conveyed the general idea.\r\nBPF programs can interact with the world outside of the VM via so-called helpers or kfuncs, i.e., native kernel\r\nfunctions that can be called by BPF programs. Services provided by these functions range from getting a\r\ntimestamp to sending a signal to the current task or reading arbitrary memory. Which functions a program can call\r\ndepends on the program type that was selected when loading it into the VM. When reversing BPF programs,\r\nlooking for calls to interesting kernel functions is a good place to start.\r\nThe second ingredient you need in order to get any real work done with a BPF program is maps. While\r\nprograms can store data during their execution using stack memory or by allocating objects on the heap, the only\r\nway to persist data across executions of the same program is to use maps. 
Maps are mutable, persistent key-value stores\r\nthat can be accessed by BPF programs and user space alike. As such, they can be used for user-to-BPF, BPF-to-user, or BPF-to-BPF communication, where in the last case the communicating programs may be different or the\r\nsame program at different times.\r\nAnother relevant aspect of the BPF ecosystem is the promise of compile once, run everywhere (CORE), i.e., a\r\n(compiled) BPF program can be run inside of a wide range of Linux kernels that might have different\r\nconfigurations, versions, compilers, and even CPU architectures. This is achieved by having the compiler emit\r\nspecial relocation entries that are processed by a user-space loader prior to loading a program into the kernel’s\r\nBPF VM. The key ingredient that enables this approach is a self-description of the running kernel in the form of\r\nBPF Type Format (BTF) information, which is made available in special files under /sys/kernel/btf/. For\r\nexample, BPF source code might do something like current-\u003ecomm to access the name of the process in whose\r\ncontext the program is running. This might generate an assembly instruction that adds the offset of the comm field\r\nto a pointer to the task descriptor that is stored in a register, i.e., ADD R5, IMM. However, the immediate offset\r\nmight vary due to kernel version, configuration, structure layout randomization or CPU architecture. Thus, the\r\ncompiler would emit a relocation entry that tells the user-space loader running on the target system to check the\r\nkernel’s BTF information in order to overwrite the placeholder with the correct offset. 
Together with other kinds\r\nof relocations, which address things like existence of types and enum variants or their sizes, the loader can be used to\r\nrun the same BPF program on a considerable number of kernels.\r\nAside: A problem with the CORE implementation described above is that signatures over BPF programs are\r\nmeaningless as the program text will be altered by relocations before loading. To allow for a meaningful ahead of\r\ntime signature there is another approach in which a loader program is generated for the actual program. The\r\nloader program is portable without relocations and is signed and loaded together with the un-relocated bytecode\r\nof the actual program. Thus, the problem is solved as all text relocations happen in the kernel, i.e., after signatures\r\nhave been verified.\r\nHowever, there are of course limits to the portability of BPF programs. As we all know, the kernel takes great care\r\nto never break user space. Within kernel land, on the other hand, there are no stability guarantees at all. BPF\r\nprograms are not considered to be part of user space and thus there are no forward or backward compatibility\r\nguarantees. In practice, that means that APIs exposed to BPF could be removed or changed, attachment points\r\ncould vanish or change their signature, or programs that are currently accepted by the static verifier could be\r\nrejected in the future. Furthermore, changes in kernel configuration could remove structure fields, functions, or\r\nkernel APIs that programs rely on. In that sense, BPF programs are in a position similar to out-of-tree kernel\r\nmodules. That being said, due to CORE, there is no need to have the headers of the target kernel available at\r\ncompile time and thus a lot less knowledge about the target is needed to be confident that the program will be able\r\nto run successfully. 
Furthermore, in the worst case the program will be rejected by the kernel, but attempting to load it has no\r\nnegative implications for system stability.\r\nFinally, we should mention that BPF is an entirely privileged interface. There are multiple BPF-related capabilities\r\nthat a process can have, which open up various parts of the subsystem. This has not always been the case. A few\r\nyears ago, unprivileged users were able to load certain types of BPF programs. However, access to the BPF VM\r\ncomes with two potential security problems. First, the security entirely relies on the correctness of the static\r\nverification stage, which is notoriously complex and must keep up with the ever-expanding feature set. It has been\r\ndemonstrated that errors in the verification process can be exploited for local privilege escalation, e.g., CVE-2020-8835\r\nor CVE-2021-3490. Second, even within the boundaries set by the verifier, the far-reaching control over the\r\nCPU instructions that get executed in kernel mode opens up the door for Spectre attacks, cf. Jann Horn’s writeup\r\nor the original Spectre paper. For those reasons, the kernel community has decided to remove unprivileged access\r\nto BPF by default.\r\nBPF Malware\r\nTo better understand the implications the addition of the BPF VM has for the Linux malware landscape, we would\r\nlike to start with a quote from “BPF inventor” Alexei Starovoitov: “If in the past the whole kernel would maybe\r\nbe [a] hundred of programmers across the world, now a hundred thousand people around the world can program\r\nthe kernel thanks to BPF.”, i.e., BPF significantly lowers the entry barrier to kernel programming and shipping\r\napplications that include kernel-level code. 
While the majority of new kernel programmers are well-intentioned\r\nand aim to develop innovative and useful applications, experience has shown that there will be some actors who\r\nseek to use new kernel features for malicious purposes.\r\nFrom a malware author’s perspective, one of the first questions is probably how likely it is that a target system\r\nwill support the loading of malicious BPF programs. According to our personal experience it is safe to say that\r\nmost general-purpose desktop and server distributions enable BPF. The feature is also enabled in the android-base.config as BPF plays a significant role in the Android OS, i.e., essentially every Android device should\r\nsupport BPF - from your fridge to your phone. Concerning the custom kernels used by big tech companies let us\r\nquote Brendan Gregg, another early BPF advocate: “As companies use more and more eBPF also, it becomes\r\nharder for your operating system to not have eBPF because you are no longer eligible to run workloads at Netflix\r\nor at Meta or at other companies.”. What is more, Google relies on BPF (through cilium) in its Kubernetes\r\nengine and Facebook uses it for its layer 4 load balancer katran. For a more comprehensive survey of BPF\r\nusage in cloud environments we recommend section 5 of Cross Container Attacks: The Bewildered eBPF on\r\nClouds by Yi He et al. Thus, most of the machines that constitute “the cloud” are likely to support BPF. This is\r\nparticularly interesting as signature verification for BPF programs is still not available, making BPF the only way to\r\nrun kernel code on locked-down systems that restrict the use of kernel modules.\r\nHowever, enabling the BPF subsystem, i.e., CONFIG_BPF, is only the beginning of the story. 
There are many\r\ncompile-time or run-time configuration choices that affect the capabilities granted to BPF programs, and thus the\r\nways in which they can be used to subvert the security of a system. Giving a full overview of all the available\r\nswitches and their effect would exceed the scope of this post. However, we will mention some knobs that can be\r\nturned to stop the abuses mentioned below.\r\nIf you search for the term “BPF malware” these days, you will find rather sensational articles with titles like\r\n“eBPF: A new frontier for malware”, “How BPF-Enabled Malware Works”, “eBPF Offensive Capabilities – Get\r\nReady for Next-gen Malware”, “Nothing is Safe Anymore - Beware of the “eBPF Trojan Horse” or “HOW DOES\r\nEBPF MALWARE PERFORM AGAINST STAR LAB’S KEVLAR EMBEDDED SECURITY?”. Needless to\r\nsay, they contain hardly any useful information. The truth is that we are not aware of any reports of in-the-wild malware using BPF. Nevertheless, there is no shortage of open source PoC BPF malware on GitHub. The\r\ntwo biggest ones are probably ebpfkit and TripleCross; however, there are many smaller projects like nysm,\r\nsshd_backdoor, boopkit, pamspy, or bad bpf as well as snippet collections like nccgroup’s bpf tools or Offensive-BPF. Researchers also used malicious BPF programs to escape container isolation in multiple real-world cloud\r\nenvironments.\r\nThere are a couple of core shenanigans that these projects are constructed around, three of which we will briefly\r\ndescribe here.\r\nIt is possible to transparently (for user space) skip the execution of any system call or to manipulate just the return\r\nvalue after it was executed. This is possible because BPF can be used for the purpose of error injection. To be precise, any\r\nfunction that is annotated with the ALLOW_ERROR_INJECTION macro can be manipulated in this way, and every\r\nsystem call is automatically annotated via the macro that defines it. 
One would hope that the corresponding\r\nconfigurations CONFIG_BPF_KPROBE_OVERRIDE and CONFIG_FUNCTION_ERROR_INJECTION would not be enabled in kernels\r\nshipped to end users, but they are. There are many things that one can do by lying to user space in this way. One\r\nexample would be to block the sending of all signals to a specific process, e.g., to protect it from being killed.\r\nInterestingly, the same helper is also used by BPF-based security solutions like tetragon, which are deployed in\r\nproduction cloud environments.\r\nAnother common primitive is to write to memory of the current process, which gives attackers the power to\r\nperform all sorts of interesting memory corruptions. One of the more original ideas is to inject code into a process\r\nby writing a ROP chain onto its stack. The chain sets up everything to load a shared library and cleanly resumes\r\nthe process afterwards. More generally, the helper bpf_probe_write_user is involved in many techniques to hide\r\nobjects, e.g., sockets or BPF programs, from user space or when manipulating apparent file and directory contents,\r\ne.g., /proc, /etc/sudoers or ~/.ssh/authorized_keys. In particular, those apparent modifications cannot be\r\ncaught with file system forensics as they are only happening in the memory of the process that attempts to access\r\nthe resource, e.g., see textreplace for an example that allows arbitrary apparent modifications of file contents.\r\nWhile there are in fact a couple of legitimate programs (like the Datadog-agent) using this function, it is probably\r\nwise to enable CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY before compilation.\r\nA rather peculiar aspect of BPF malware is how it communicates over the network. 
BPF programs are not able to\r\ninitiate network connections by themselves, but as one of the main applications of BPF is in the networking\r\nsubsystem, they have far-reaching capabilities when it comes to managing existing traffic. For example, XDP\r\nprograms get their hands on packets very early in the receive path, long before mechanisms like netfilter, which is\r\nmuch further up the network stack, get a chance to see them. In fact, there are high-end NICs that support running\r\nBPF programs on the device’s processor rather than the host CPU. Furthermore, programs that handle packets can\r\nusually modify, reroute, or drop them. In combination, this is often used to receive C2 commands while at the\r\nsame time hiding the corresponding packets from the rest of the kernel by modifying or dropping them. In\r\naddition, BPF’s easy programmability makes it simple to implement complex, stateful triggers. To exfiltrate data\r\nfrom the system, the contents, and potentially also the recipient data, of outgoing packets are modified, for\r\nexample by traffic control (tc) hooks. For unreliable transport protocols higher layers will deal with the induced\r\npacket loss, while for TCP the retransmission mechanism ensures that applications will not be impacted. Turn off\r\nCONFIG_NET_CLS_BPF and CONFIG_NET_ACT_BPF to disable tc BPF programs.\r\nWhile the currently charted BPF malware landscape is limited to hobby projects by security researchers and other\r\ninterested individuals, it would unfortunately not be unheard of for the same projects to eventually be discovered\r\nduring real-world incidents. Authors of advanced Linux malware, on the other hand, will most likely choose to implement\r\ntheir own BPF programs when they believe that it is beneficial for their cause, for instance to avoid detection by\r\nusing a mechanism that is not yet well known to the forensic community. 
Some excerpts from the recent talk by\r\nKris Nova at DevOpsDays Kyiv give an interesting insight into the concerns that the Ukrainian computer security\r\ncommunity had, and still has, regarding the use of BPF in Russian attacks on their systems.\r\nIt would be dishonest to claim that there is a general schema that you can follow while analyzing an incident to\r\ndiscover all malicious BPF programs. As so often, the boundaries between monitoring software, live patches,\r\nsecurity solutions and malware are not clearly defined, e.g., in addition to bpf_override_return tetragon also\r\nuses bpf_send_signal. The first step could be to obtain a baseline of expected BPF-related activity, and carefully\r\nanalyze any deviations or anomalies. Additionally, a look at the kernel configuration can help to decide which\r\nkinds of malicious activity are fundamentally possible. Furthermore, programs that make use of possibly\r\nmalicious helper functions, like bpf_probe_write_user, bpf_send_signal, bpf_override_return, or\r\nbpf_skb_store_bytes should be reverse engineered with particular scrutiny. In addition, there are some clear\r\nindicators of malicious activity, like the hiding of programs, which we will discuss in more detail below. Finally,\r\nonce program signatures are upstreamed, it is highly recommended to enable and enforce them to lock down this\r\nattack surface.\r\nFrom now on, we will shift gears and focus on the main topic of this post, hunting BPF malware in main memory\r\nimages.\r\nAside: The bvp47, Symbiote and BPFdoor rootkits are often said to be examples of BPF malware. However, they\r\nare using only what is now known as classic BPF, i.e., the old-school packet filtering programs used by programs\r\nlike tcpdump.\r\nVolatility Plugins\r\nVolatility is a memory forensics framework that can be used to analyze physical memory images. 
It uses\r\ninformation about symbols and types of the operating system that was running on the imaged system to recover\r\nhigh-level information, like the list of running processes or open files, from the raw memory image.\r\nIndividual analyses are implemented as plugins that make use of the framework library as well as other plugins.\r\nSome of those plugins are closely modeled after core Unix utilities, like the ps utility for listing processes, the\r\nss utility for listing network connections or the lsmod utility for listing kernel modules. Other plugins\r\nimplement checks that search for common traces of kernel rootkit activity, like the replacement of function\r\npointers or inline hooks.\r\nThere may be multiple ways to obtain the same piece of information, and thus multiple plugins that, at first sight,\r\nserve the same purpose. Inconsistencies between the methods, however, could indicate malicious activity that\r\ntries to hide its presence or just be artifacts of imperfections in the acquisition process. In any case, inconsistencies\r\nare something an investigator should look into.\r\nIn this section we present seven Volatility plugins that we have developed to enable analysis of the BPF\r\nsubsystem. Three of these are modelled after subcommands of the bpftool utility and provide basic\r\nfunctionality. We then present three plugins that retrieve similar information from other sources and can thus be\r\nused to detect inconsistencies. Finally, we present a plugin that aggregates information from four other plugins to\r\nmake it easier to interpret.\r\nNote: We published the source code for all of our plugins on GitHub. We would love to see your contributions\r\nthere! :)\r\nListing Programs, Maps \u0026 Links\r\nArguably the most basic task that you could think of is simply listing the programs that have been loaded into the\r\nBPF VM. 
We will start by doing this on a live system. Feel free to follow along in order to discover what your\r\ndistribution or additional packages that you installed have already loaded.\r\nLive System\r\nThe bpftool user-space utility allows admins to interact with the BPF subsystem. One of the most basic tasks it\r\nsupports is the listing of all loaded BPF programs, maps, BTF sections, or links. We are sometimes going to refer\r\nto these things collectively as BPF objects. Roughly speaking, links are a mechanism to connect a loaded program\r\nto a point where it is being invoked, and BTF is a condensed form of DWARF debug information.\r\nLet’s start with an example to get familiar with the information that is displayed (run bpftool as root):\r\n# bpftool prog list\r\n[...]\r\n22: lsm name restrict_filesystems tag 713a545fe0530ce7 gpl\r\nloaded_at 2023-11-26T10:31:42+0100 uid 0\r\nxlated 560B jited 305B memlock 4096B map_ids 13\r\nbtf_id 53\r\n[...]\r\nFrom left-to-right and top-to-bottom we have: ID used as an identifier for user-space, program type, program\r\nname, tag that is a SHA1 hash over the bytecode, license, program load timestamp, uid of process that loaded it,\r\nsize of the bytecode, size of the jited code, memory blocked by the program, IDs of the maps that the program is\r\nusing, ID of the BTF information for the program.\r\nWe can also inspect the bytecode\r\n# bpftool prog dump xlated id 22\r\nint restrict_filesystems(unsigned long long * ctx):\r\n; int BPF_PROG(restrict_filesystems, struct file *file, int ret)\r\n 0: (79) r3 = *(u64 *)(r1 +0)\r\n 1: (79) r0 = *(u64 *)(r1 +8)\r\n 2: (b7) r1 = 0\r\n[...]\r\nwhere each line is the pseudocode of a BPF assembly instruction and we even have line info, which is also stored\r\nin the attached BTF information. 
We can also dump the jited version and confirm that it is essentially a one-to-one\r\ntranslation to x86_64 machine code (depending on the architecture your kernel runs on):\r\n# bpftool prog dump jited id 22\r\nint restrict_filesystems(unsigned long long * ctx):\r\nbpf_prog_713a545fe0530ce7_restrict_filesystems:\r\n; int BPF_PROG(restrict_filesystems, struct file *file, int ret)\r\n 0: endbr64\r\n 4: nopl (%rax,%rax)\r\n 9: nop\r\n b: pushq %rbp\r\n c: movq %rsp, %rbp\r\n f: endbr64\r\n 13: subq $24, %rsp\r\n 1a: pushq %rbx\r\n 1b: pushq %r13\r\n 1d: movq (%rdi), %rdx\r\n 21: movq 8(%rdi), %rax\r\n 25: xorl %edi, %edi\r\n[...]\r\nFurthermore, we can display basic information about the maps used by the program\r\n# bpftool map list id 13\r\n13: hash_of_maps name cgroup_hash flags 0x0\r\nkey 8B value 4B max_entries 2048 memlock 165920B\r\nas well as their contents (which are quite boring in this case).\r\n# bpftool map dump id 13\r\nFound 0 elements\r\nWe can also get information about the variables and types (BTF) defined in the program. 
This is somewhat\r\ncomparable to the DWARF debug information that comes with some binaries - just that it is harder to strip since\r\nit’s needed by the BPF VM.\r\n# bpftool btf dump id 53\r\n[1] PTR '(anon)' type_id=3\r\n[2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED\r\n[3] ARRAY '(anon)' type_id=2 index_type_id=4 nr_elems=13\r\n[4] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none)\r\n[5] PTR '(anon)' type_id=6\r\n[6] TYPEDEF 'uint64_t' type_id=7\r\n[7] TYPEDEF '__uint64_t' type_id=8\r\n[8] INT 'unsigned long' size=8 bits_offset=0 nr_bits=64 encoding=(none)\r\n[9] PTR '(anon)' type_id=10\r\n[10] TYPEDEF 'uint32_t' type_id=11\r\n[11] TYPEDEF '__uint32_t' type_id=12\r\n[12] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)\r\n[13] STRUCT '(anon)' size=24 vlen=3\r\n'type' type_id=1 bits_offset=0\r\n'key' type_id=5 bits_offset=64\r\n'value' type_id=9 bits_offset=128\r\n[...]\r\nAs we said earlier, links are what connects a loaded program to a point that invokes it.\r\n# bpftool link list\r\n[...]\r\n3: tracing prog 22\r\nprog_type lsm attach_type lsm_mac\r\ntarget_obj_id 1 target_btf_id 82856\r\nAgain, from left-to-right and top-to-bottom we have: ID, type, attached program’s ID, program’s load type, type\r\nthat program was attached with, ID of the BTF object that the following field refers to, ID of the type that the\r\nprogram is attached to (functions can also have BTF entries). Note that everything but the first line depends on the\r\ntype of link that is examined. 
To find the point where the program is called by the kernel we can inspect the\r\nrelevant BTF object (the kernel’s in this case).\r\n# bpftool btf dump id 1 | rg 82856\r\n[82856] FUNC 'bpf_lsm_file_open' type_id=16712 linkage=static\r\nThus we can conclude that the program is invoked early in the do_dentry_open function via the\r\nsecurity_file_open LSM hook and that its return value decides whether the process will be allowed to open the\r\nfile (we’re skipping some steps here, see our earlier article for the full story).\r\nWe performed this little “live investigation” on a laptop running Arch Linux with kernel 6.6.2-arch1-1 and the\r\nprogram wasn’t malware but rather loaded by systemd on boot. You can find the commit that introduced the\r\nfeature here . Again, you can see that in the future there will be more legitimate BPF programs running on your\r\nsystems (servers, desktops and mobiles) than you might think!\r\nMemory Image\r\nAs a first step towards BPF memory forensics it would be nice to be able to perform the above investigation on a\r\nmemory image. We will now introduce three plugins that aim to make this possible.\r\nWe already saw that all sorts of BPF objects are identified by an ID. Internally, these IDs are allocated using the\r\nIDR mechanism , a core kernel API. For that purpose, three variables are defined at the top of\r\n/kernel/bpf/syscall.c .\r\n[...]\r\nstatic DEFINE_IDR(prog_idr);\r\nstatic DEFINE_SPINLOCK(prog_idr_lock);\r\nstatic DEFINE_IDR(map_idr);\r\nstatic DEFINE_SPINLOCK(map_idr_lock);\r\nstatic DEFINE_IDR(link_idr);\r\nstatic DEFINE_SPINLOCK(link_idr_lock);\r\n[...]\r\nUnder the hood, the ID allocation mechanism uses an extensible array (xarray) , a tree-like data structure that is\r\nrooted in the idr_rt member of the structure that is defined by the macro. The ID of a new object is simply an\r\nunused index into the array, and the value stored at this index is a pointer to a structure that describes it. 
Thus, we\r\ncan re-create the listing capabilities of bpftool by simply iterating the array. You can find the code that does so\r\nin the XArray class.\r\nDereferencing the array entries leads us to structures that hold most of the information displayed by bpftool\r\nearlier.\r\nEntries of the prog_idr point to objects of type bpf_prog. The aux member of this type points to a structure\r\nthat holds additional information about the program. We can see how the information bpftool displays is\r\ngenerated from these structures in the bpf_prog_get_info_by_fd function by filling a bpf_prog_info struct.\r\nThe plugin bpf_listprogs re-implements some of the logic of this function and displays the following pieces of\r\ninformation.\r\ncolumns: list[tuple[str, type]] = [\r\n (\"OFFSET (V)\", str),\r\n (\"ID\", int),\r\n (\"TYPE\", str),\r\n (\"NAME\", str),\r\n (\"TAG\", str),\r\n (\"LOADED AT\", int),\r\n (\"MAP IDs\", str),\r\n (\"BTF ID\", int),\r\n (\"HELPERS\", str),\r\n]\r\nSome comments are in order:\r\nOFFSET (V) are the low 6 bytes of the bpf_prog structure’s virtual address. This is useful as a unique\r\nidentifier of the structure.\r\nLOADED AT is the number of nanoseconds since boot when the program was loaded. Converting it to an\r\nabsolute timestamp requires parsing additional kernel time-keeping structures and is not in scope for this\r\nplugin. There exist Volatility patches that add this functionality but they are not upstream yet. Once they\r\nare, it should be trivial to convert this field to match the bpftool output.\r\nHELPERS is a field that is not reported by bpftool. 
It displays a list of all the kernel functions that are\r\ncalled by the BPF program, i.e., BPF helpers and kfuncs, and is helpful to quickly identify programs that\r\nuse possibly malicious or non-standard helpers.\r\nThe reporting of memory utilization is omitted as we consider it to be less important for forensic\r\ninvestigations; however, it would be easy to add.\r\nThe second bpftool functionality the plugin supports is the dumping of programs in bytecode and jited forms.\r\nTo dump the machine code of the program, we follow the bpf_func pointer in the bpf_prog structure, which\r\npoints to the entrypoint of the jited BPF program. The length of the machine code is stored in the jited_len\r\nfield of the same structure. While we support dumping the raw bytes to a file, their analysis is tedious due to\r\nmissing symbol information. Thus, we also support disassembling the program and annotating all occurring\r\naddresses with the corresponding symbol, which makes the programs much easier to analyze.\r\nDumping the BPF bytecode is straightforward as well. The flexible insnsi array member of the bpf_prog\r\nstructure holds the bytecode instructions and the len field holds their number. Here, we also support dumping\r\nthe raw and disassembled bytecode. However, the additional symbol annotations are not implemented. As the\r\nbytecode is not “what actually runs”, we consider this information more susceptible to anti-forensic tampering and\r\nthus focused on the machine code, which is what is executed when invoking the program.\r\nNote: We use Capstone for disassembling the BPF bytecode. Unfortunately, Capstone’s BPF architecture support is\r\noutdated and thus bytecode is sometimes not disassembled entirely. As a workaround, you can dump the raw bytes\r\nand use another tool to disassemble them.\r\nEntries of the map_idr point to bpf_map objects. 
The bpf_map_info structure parsed by bpftool is filled in\r\nbpf_map_get_info_by_fd and the plugin bpf_listmaps simply copies this logic to display the following\r\npieces of information.\r\ncolumns: list[tuple[str, Any]] = [\r\n (\"OFFSET (V)\", str),\r\n (\"ID\", int),\r\n (\"TYPE\", str),\r\n (\"NAME\", str),\r\n (\"KEY SIZE\", int),\r\n (\"VALUE SIZE\", int),\r\n (\"MAX ENTRIES\", int),\r\n]\r\nDumping the contents of maps is hard due to the diversity in map types. Each map type requires its own handling,\r\nbeginning with manually downcasting the bpf_map object to the correct container type. One approach to avoid\r\nimplementing each lookup mechanism separately would be to emulate the map_get_next_key and\r\nbpf_map_copy_value kernel functions, where the former is a function pointer found in the map’s operations\r\nstructure. However, this is not in scope for the current plugin.\r\nFurthermore, the dumping could be enhanced by utilizing the BTF information that is optionally attached to the\r\nmap to properly display keys and values, similar to the bpf_snprintf_btf helper that can be used to pretty-print\r\nobjects using their BTF information.\r\nWe implemented the dumping for the most straightforward map type - arrays - but the plugin does not support\r\ndumping other types of maps.\r\nEntries of the link_idr point to objects of type bpf_link . Again, there is an informational structure,\r\nbpf_link_info , which is this time filled in the bpf_link_get_info_by_fd function. 
Based on an analysis of this function,\r\nwe wrote the bpf_listlinks plugin that retrieves the following pieces of information.\r\ncolumns: list[tuple[str, Any]] = [\r\n (\"OFFSET (V)\", str),\r\n (\"ID\", int),\r\n (\"TYPE\", str),\r\n (\"PROG\", int),\r\n (\"ATTACH\", str),\r\n]\r\nHere, the last column is obtained by mimicking the virtual call to link-\u003eops-\u003efill_link_info that adds link-type specific information about the associated attachment point, e.g., for tracing links it adds the BTF object and\r\ntype IDs we saw earlier.\r\nLSM Hooks\r\nOur three listing plugins have one conceptual weakness in common: they rely entirely on information obtained by\r\nparsing the (prog|map|link)_idr s. However, the entire ID mechanism is in the user-facing part of the BPF\r\nsubsystem; it is simply a means for user space to refer to BPF objects in syscalls. Thus, our plugins are susceptible\r\nto trivial anti-forensic tampering.\r\nIn our research, we prototyped two anti-forensic methods that remove BPF objects from these structures while still\r\nkeeping the corresponding program active in the kernel. The first and more straightforward way is to simply write a\r\nkernel module that uses standard APIs to remove IDs from the IDRs. The second one is based on the observation\r\nthat the lifecycle of BPF objects is managed via reference counts. Thus, if we artificially increment the reference\r\ncount of an object that (indirectly) holds references to all other objects that are required to operate a BPF program,\r\ne.g., a link, we can prevent the program’s destruction when all “regular” references are dropped.\r\nOne approach to counter these anti-forensic measures is to “approach from the other side”. Instead of relying on\r\ninformation from sources that are far detached from the actual program execution, we go to the very places and\r\nmechanisms that invoke the program. 
The obvious downside is that this low-level code is much more program-type and architecture specific; the results, on the other hand, are more robust.\r\nIn a previous blog post we described in great detail the low-level steps that lead up to the execution of BPF LSM\r\nprograms. Based on this knowledge, we developed the bpf_lsm plugin that can discover hidden BPF programs\r\nattached to security hooks. In short, the plugin checks the places where the kernel control flow may be diverted\r\ninto the BPF VM for the presence of inline hooks. If they are found, it cross-checks with the links IDR to see if\r\nthere is a corresponding link, the absence of which is a strong indication of tampering. Additionally, the plugin is\r\nalso valuable in the absence of tampering, as it shows you the exact program attachment point without the need to\r\nmanually resolve BTF IDs. In particular, the plugin displays the number of attached programs and their IDs along\r\nwith the name of the LSM hook where they are attached.\r\ncolumns: list[tuple[str, type]] = [\r\n (\"LSM HOOK\", str),\r\n (\"Nr. PROGS\", int),\r\n (\"IDs\", str),\r\n]\r\nNetworking Hooks\r\nAs we described above, traffic control (tc) programs are especially useful for exfiltrating information from\r\ninfected machines, e.g., by hijacking existing TCP connections. Thus, the second plugin that obtains its\r\ninformation from more tamper resistant sources targets tc BPF programs. It only relies on the mini_Qdisc\r\nstructure that is used on the transmission and receive fast paths to look up queuing disciplines (qdisc) attached to a\r\nnetwork device.\r\nWe use the ifconfig plugin by Ofek Shaked and Amir Sheffer to obtain a list of all network devices. Then, we\r\nfind the above-mentioned structure and use it to collect all BPF programs that are involved in qdiscs on this\r\ndevice. 
With kernel 6.3 the process of locating the mini_Qdisc from the network interface changed slightly due\r\nto the introduction of link-based attachment of tc programs; however, the plugin recognizes and handles both\r\ncases. Finally, the bpf_netdev plugin displays the following information about each interface where at least one\r\nBPF program was found,\r\ncolumns: list[tuple[str, type]] = [\r\n (\"NAME\", str),\r\n (\"MAC ADDR\", str),\r\n (\"EGRESS\", str),\r\n (\"INGRESS\", str),\r\n]\r\nwhere the EGRESS and INGRESS columns hold the IDs of the programs that process packets flowing in the respective\r\ndirection.\r\nFinding Processes\r\nYet another way to discover BPF objects is through the processes that hold on to them. As with many other\r\nresources, programs, links, maps, and BTF objects are represented to processes as file descriptors. They can be used to act\r\non the object, retrieve information about it, and serve as a mechanism to clean up after processes that did not exit\r\ngracefully. Furthermore, an investigator might want to find out which process holds on to a specific BPF object in\r\norder to investigate this process further.\r\nThus, the bpf_listprocs plugin displays the following pieces of information for every process that holds on to\r\nat least one BPF object via a file descriptor.\r\ncolumns: list[tuple[str, type]] = [\r\n (\"PID\", int),\r\n (\"COMM\", str),\r\n (\"PROGS\", str),\r\n (\"MAPS\", str),\r\n (\"LINKS\", str),\r\n]\r\nHere, the PROGS , MAPS , and LINKS columns display the IDs of the respective objects. This list is generated by\r\niterating over all file descriptors and the associated file structures. BPF objects are identified by checking the\r\nfile operations f_op pointer, and the corresponding bpf_(prog|map|link) structures are found by following the\r\npointer stored in the private member.\r\nNot every BPF object must be reachable from the process list, however. 
They can, for example, also be\r\nrepresented as files under the special bpf filesystem, which is usually mounted at /sys/fs/bpf . Processes\r\ncan also close their file descriptors, and the object will remain alive as long as there are other references to it.\r\nConnecting the Dots\r\nFinally, we would like to present the bpf_graph plugin, a meta analysis that we have built on top of the four\r\nlisting plugins. As its name suggests, its goal is to visualize the state of the BPF subsystem as a graph.\r\nThere are four types of nodes in this graph: programs, maps, links and processes. Different node types are\r\ndistinguished by shape. Within a node type, the different program/map/link types are distinguished by color and\r\nprocess nodes are colored based on their process ID (PID). Furthermore, map and program nodes are labeled with\r\nthe ID and name of the object, link nodes are labeled with the ID and attachment information of the link, and\r\nprocess nodes receive the PID and comm (name of the user-space program binary) of their process as labels.\r\nThere are three types of edges to establish relationships between nodes: file descriptor, link, and map. File\r\ndescriptor edges are dotted and connect processes to BPF objects that they have an open fd for. Link edges are\r\ndashed and connect BPF links to the program they reference. Finally, map edges are drawn solid and connect\r\nmaps to all of the programs that use them.\r\nEspecially for large applications with hundreds or even thousands of objects, it is essential to be able to filter the\r\ngraph to make it useful. We have therefore implemented two additional options that can be passed to the plugin.\r\nFirst, you can pass a list of node types to include in the output. 
Second, you can pass a list of nodes, and only the\r\nconnected components that contain at least one of those nodes will be drawn.\r\nThe idea of this plugin is to make the information of the four listing plugins more accessible to investigators by\r\ncombining it into a single picture. This is especially useful for complex applications with possibly hundreds of\r\nprograms and maps, or on busy systems where many different processes have loaded BPF programs.\r\nPlugin output comes in two forms: a dot-format encoding of the graph, where each BPF object node has metadata\r\ncontaining all of the plugin columns, and a picture of the graph, drawn with a default layout algorithm. The\r\nlatter should suffice for most users, but the former allows advanced users to do further processing.\r\nNote: We provide standalone documentation for all plugins in our project on GitHub.\r\nCase Study\r\nIn this section we will use the plugins to examine the memory image of a system with a high level of BPF activity.\r\nTo get a diverse set of small BPF applications we launched the example programs that come with libbpf-bootstrap\r\nand some of the kernel self-tests. You can download the memory image and symbols to follow along. If you prefer\r\nto analyze a single, large application, have a look at the krie example in our plugin documentation .\r\nA good first step is to use the graph plugin to get an overview of the subsystem ( # vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_graph ).\r\nAs we can see, there are several components corresponding to different processes, each of which holds a number\r\nof BPF resources. 
Let us begin by examining the “Hello, World” example of BPF, the minimal program:\r\n// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause\r\n/* Copyright (c) 2020 Facebook */\r\n#include \u003clinux/bpf.h\u003e\r\n#include \u003cbpf/bpf_helpers.h\u003e\r\nchar LICENSE[] SEC(\"license\") = \"Dual BSD/GPL\";\r\nint my_pid = 0;\r\nSEC(\"tp/syscalls/sys_enter_write\")\r\nint handle_tp(void *ctx)\r\n{\r\n    int pid = bpf_get_current_pid_tgid() \u003e\u003e 32;\r\n    if (pid != my_pid)\r\n        return 0;\r\n    bpf_printk(\"BPF triggered from PID %d.\\n\", pid);\r\n    return 0;\r\n}\r\nThe above source code is compiled with clang to produce an ELF relocatable object file. It contains the BPF\r\nbytecode along with additional information, like BTF sections, CORE relocations, programs as well as their\r\nattachment mechanisms and points, maps that are used and so on. This ELF is then embedded into a user space\r\nprogram that statically links against libbpf. At runtime, it passes the ELF to libbpf, which takes care of all the\r\nrelocations and kernel interactions required to wire up the program to the BPF VM.\r\nWith the above C code in the back of our minds, we can now have a look at the relevant component of the\r\nsystem’s BPF object graph. To limit the output of the plugin to the connected components that contain certain\r\nnodes, we can add the --components flag to the invocation and give it a list of nodes (the format is\r\n\u003cnode_type\u003e-\u003cid\u003e where node_type is in {map,link,prog,proc} and id is the BPF object ID or PID).\r\nAs we can see, the ELF has caused libbpf to create a program, two maps and a link while loading. We can now use\r\nour plugins to gather more information about each object. 
Let’s start with the program itself.\r\n# vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_listprogs --id 98 --dump-jited --dump-xlate\r\nVolatility 3 Framework 2.5.0\r\nProgress: 100.00 Stacking attempts finished\r\nOFFSET (V) ID TYPE NAME TAG LOADED AT MAP IDs BTF ID HELPERS\r\n0xbce500673000 98 TRACEPOINT handle_tp 6a5dcef153b1001e 1417821088492 40,45 196\r\nBy looking at the last column we can see that it is indeed using two kernel helper functions, where the apparent\r\ncall to bpf_printk turns out to be a macro that expands to bpf_trace_printk . If we look at the program bytecode\r\nand the machine code side by side, we can discover a few things.\r\n# cat .prog_0xbce500673000_98_bdisasm\r\n0x0: 85 00 00 00 10 b2 02 00 call 0x2b210\r\n0x8: 77 00 00 00 20 00 00 00 rsh64 r0, 0x20\r\n0x10: 18 01 00 00 00 a0 49 00 00 00 00 00 e5 bc ff ff lddw r1, 0xffffbce50049a000\r\n0x20: 61 11 00 00 00 00 00 00 ldxw r1, [r1]\r\n0x28: 5d 01 05 00 00 00 00 00 jne r1, r0, +0x5\r\n0x30: 18 01 00 00 10 83 83 f5 00 00 00 00 7b 9b ff ff lddw r1, 0xffff9b7bf5838310\r\n0x40: b7 02 00 00 1c 00 00 00 mov64 r2, 0x1c\r\n0x48: bf 03 00 00 00 00 00 00 mov64 r3, r0\r\n0x50: 85 00 00 00 80 0c ff ff call 0xffff0c80\r\n0x58: b7 00 00 00 00 00 00 00 mov64 r0, 0x0\r\n0x60: 95 00 00 00 00 00 00 00 exit\r\n# cat .prog_0xbce500673000_98_mdisasm\r\nhandle_tp:\r\n 0xffffc03772a0: 0f 1f 44 00 00 nop dword ptr [rax + rax]\r\n 0xffffc03772a5: 66 90 nop\r\n 0xffffc03772a7: 55 push rbp\r\n 0xffffc03772a8: 48 89 e5 mov rbp, rsp\r\n 0xffffc03772ab: e8 d0 fc aa f1 call 0xffffb1e26f80 # bpf_get_current_pid_tg\r\n 0xffffc03772b0: 48 c1 e8 20 shr rax, 0x20\r\n 0xffffc03772b4: 48 bf 00 a0 49 00 e5 bc ff ff movabs rdi, 0xffffbce50049a000 # minimal_.bss +\r\n 0xffffc03772be: 8b 7f 00 mov edi, dword ptr [rdi]\r\n 0xffffc03772c1: 48 39 c7 cmp rdi, rax\r\n 0xffffc03772c4: 75 17 jne 0xffffc03772dd # handle_tp + 
0x3d\r\n 0xffffc03772c6: 48 bf 10 83 83 f5 7b 9b ff ff movabs rdi, 0xffff9b7bf5838310 # minimal_.rodat\r\n 0xffffc03772d0: be 1c 00 00 00 mov esi, 0x1c\r\n 0xffffc03772d5: 48 89 c2 mov rdx, rax\r\n 0xffffc03772d8: e8 13 57 a7 f1 call 0xffffb1dec9f0 # bpf_trace_printk\r\n 0xffffc03772dd: 31 c0 xor eax, eax\r\n 0xffffc03772df: c9 leave\r\n 0xffffc03772e0: c3 ret\r\n 0xffffc03772e1: cc int3\r\nThe first lesson here is probably that symbol annotations are useful :). As expected, when ignoring the prologue\r\nand epilogue inserted by the JIT-compiler, the translation between BPF and x86_64 is essentially one-to-one.\r\nFurthermore, uses of global C variables like my_pid or the format string result in direct references to kernel\r\nmemory, where the closest preceding symbols are the minimal_.bss ’s and minimal_.rodata ’s bpf_map\r\nstructures, respectively. For simple array maps, the bpf_map structure resides at the beginning of a buffer that\r\nalso holds the array data; 0x110 is simply the offset at which the map’s payload data starts. 
More generally,\r\nlibbpf will automatically create maps to hold the variables living in the .data , .rodata , and .bss sections.\r\nDumping the map contents confirms that the .bss map holds the minimal process’s PID while the .rodata\r\nmap contains the format string.\r\n# vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_listmaps --id 45 40 --dump\r\nVolatility 3 Framework 2.5.0\r\nProgress: 100.00 Stacking attempts finished\r\nOFFSET (V) ID TYPE NAME KEY SIZE VALUE SIZE MAX ENTRIES\r\n0xbce500499ef0 40 ARRAY minimal_.bss 4 4 1\r\n0x9b7bf5838200 45 ARRAY minimal_.rodata 4 28 1\r\n# cat .map_0xbce500499ef0_40\r\n{\"0\": \"section (.bss) = {\\n (my_pid) (int) b'\\\\xb7\\\\x02\\\\x00\\\\x00'\\n\"}\r\n# cat .map_0x9b7bf5838200_45\r\n{\"0\": \"section (.rodata) = {\\n (handle_tp.____fmt) b'BPF triggered from PID %d.\\\\n\\\\x00'\\n\"}\r\nIn the source code we saw the directive SEC(\"tp/syscalls/sys_enter_write\") , which instructs the compiler to\r\nplace the handle_tp function’s BPF bytecode in an ELF section called \"tp/syscalls/sys_enter_write\" . While\r\nloading, libbpf picks this up and creates a link that attaches the program to a perf event that is activated by the\r\nsys_enter_write tracepoint. We can inspect the link, but getting more information about the corresponding\r\ntracepoint is not yet implemented. Contributions are always highly welcome :)\r\n# vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_listlinks --id 11\r\nVolatility 3 Framework 2.5.0\r\nProgress: 100.00 Stacking attempts finished\r\nOFFSET (V) ID TYPE PROG ATTACH\r\n0x9b7bc2c09ae0 11 PERF_EVENT 98\r\nDissecting the “Hello, World” program was useful to get an impression of what a BPF application looks like at\r\nruntime. 
Before concluding this section, we will have a look at a less minimalist example, the process with PID\r\n687.\r\nThis process is one of the kernel self-tests. It tests a BPF feature that allows loading, at runtime, new function\r\npointer tables used for dynamic dispatch (so-called structure operations), where the individual operations are\r\nimplemented as BPF programs. The programs that implement the new operations can be recognized by their type\r\nSTRUCT_OPS .\r\n# vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_listprogs --id 37 39 40 42 43 44 45\r\nVolatility 3 Framework 2.5.0\r\nProgress: 100.00 Stacking attempts finished\r\nOFFSET (V) ID TYPE NAME TAG LOADED AT MAP IDs BTF ID HELPERS\r\n0xbce5003b7000 37 STRUCT_OPS dctcp_init 562160e42a59841c 1417427431243 9,10,7 124\r\n0xbce50046b000 39 STRUCT_OPS dctcp_ssthresh cddbf7f9cf9b52d7 1417427590219 9 124\r\n0xbce500473000 40 STRUCT_OPS dctcp_update_alpha 6e84698df8007e42 1417427647277 9\r\n0xbce500487000 42 STRUCT_OPS dctcp_state dc878de7981c438b 1417427777414 9 124\r\n0xbce500493000 43 STRUCT_OPS dctcp_cwnd_event 70cbe888b7ece66f 1417427888091 9\r\n0xbce5004e5000 44 STRUCT_OPS dctcp_cwnd_undo 78b977678332d89f 1417428066805 9 124\r\n0xbce5004eb000 45 STRUCT_OPS dctcp_cong_avoid 20ff0d9ab24c8843 1417428109672 9\r\nThe mapping between the programs and the function pointer table they implement is realized through a special\r\nmap of type STRUCT_OPS created by the process.\r\n# vol -f /io/dumps/debian-bookworm-6.1.0-13-amd64_all.raw linux.bpf_listmaps --id 11 12\r\nVolatility 3 Framework 2.5.0\r\nProgress: 100.00 Stacking attempts finished\r\nOFFSET (V) ID TYPE NAME KEY SIZE VALUE SIZE MAX ENTRIES\r\n0x9b7bc3c41000 11 STRUCT_OPS dctcp_nouse 4 256 1\r\n0x9b7bc3c43400 12 STRUCT_OPS dctcp 4 256 
1\r\nUnfortunately, the current implementation does not parse the contents of the map, so it cannot determine the name\r\nof the kernel structure being implemented and the mapping between its member functions and the BPF programs.\r\nAs always, contributions are highly welcome :). If it did, we would find out that the map implements\r\ntcp_congestion_ops to load a new TCP congestion control algorithm on the fly.\r\nThere is a lot more to explore in this memory image, so feel free to have a closer look at the other processes. You\r\nmight also want to check out the krie example in our documentation to get an impression of a larger BPF\r\napplication.\r\nTesting\r\nWe tested the plugins on memory images acquired from virtual machines running on QEMU/KVM that were\r\nsuspended for the duration of the acquisition process. To ensure the correctness of all plugin results, we have\r\ncross-checked them by debugging the guest kernel as well as comparing them with bpftool running on the\r\nguest.\r\nBelow is a list of the distributions and releases that we used for manual testing:\r\nDebian\r\n12.2.0-14, Linux 6.1.0-13\r\nUbuntu\r\n22.04.2, Linux 5.15.0-89-generic\r\n20.04, Linux 5.4.0-26-generic\r\nCustom\r\nLinux 6.0.12, various configurations\r\nLinux 6.2.12, various configurations\r\nFor each of these kernels, we tested all plugins at least on an image taken during the execution of the libbpf-bootstrap example programs.\r\nIn addition to manually testing the above-mentioned kernels, we also developed an evaluation framework (the code is not public).\r\nThe framework is based on Vagrant and libvirt /KVM . First we create and update all VMs. After that we run\r\nprograms from libbpf-bootstrap with nohup so that we can leave the VM and dump the memory from\r\noutside. To dump the memory we use virsh with virsh dump \u003cname of VM\u003e --memory-only . 
virsh dump\r\npauses the VM for a clean acquisition of the main memory. We also install debug symbols for all the Linux\r\ndistributions under investigation so that we can gather the debug kernels ( vmlinux with DWARF debugging\r\ninformation) and the System.map file. We then use both files with dwarf2json to generate the ISF information\r\nthat Volatility 3 needs. So far, we have tested the following Linux distributions with their respective kernels:\r\nAlma Linux 9 - Linux kernel 5.14.0-362.8.1.el9_3.x86_64 ✅\r\nFedora 38 - Linux kernel 6.6.6-100.fc38.x86_64 ✅\r\nFedora 39 - Linux kernel 6.6.6-200.fc39.x86_64 ✅\r\nCentOS Stream 9 - Linux kernel 5.14.0-391.el9.x86_64 ✅\r\nRocky Linux 8 - Linux kernel 4.18.0-513.9.1.el8_9.x86_64 ✅\r\nRocky Linux 9 - 🪲 kernel-debuginfo-common package is missing so the kernel debugging symbols\r\ncannot be installed (list of packages )\r\nDebian 11 - Linux kernel 5.10.0-26-amd64 ✅\r\nDebian 12 - Linux kernel 6.1.0-13-amd64 ✅\r\nUbuntu 22.04 - Linux kernel 5.15.0-88-generic ✅\r\nUbuntu 23.10 - Linux kernel 6.5.0-10-generic ✅ (works partially, but process listing is broken due to this\r\ndwarf2json GitHub Issue )\r\nArch Linux - Linux kernel 6.6.7-arch1-1 ✅ (works partially, but breaks probably due to the same issue as\r\nvolatility3/dwarf2json GitHub Issue )\r\nopenSUSE Tumbleweed - ❓ it seems that the debug kernel that is provided by openSUSE does contain\r\ndebugging symbols but other sections such as .rodata are removed (zeroed out) so that dwarf2json is\r\nnot able to find the banner (further analyses cannot be carried out without this information) - we will\r\nfurther investigate this issue\r\nWe will check if the problems get resolved and re-evaluate our plugins. Generally, our framework is designed to\r\nsupport more distributions as well and we will try to evaluate the plugins on a wider variety of them.\r\nDuring our automated analysis we encountered an interesting problem. 
To collect the kernels with debugging\r\nsymbols from the VMs we need to copy them to the host. When the kernel executable file is copied, it is read\r\ninto main memory by the kernel’s page-cache mechanism. This implies that parts of the kernel file (vmlinux) and\r\nthe kernel itself (the running kernel not the file) may be present in the dump. This can lead to the problem of the\r\nVolatility 3 function find_aslr (source code ) first finding matches in the page-cached kernel file (vmlinux) and\r\nnot in the running kernel. An issue has been opened here .\r\nRelated Work\r\nThere are several articles on BPF that cover different security-related aspects of the subsystem. In this section, we\r\nwill briefly discuss the ones that are most relevant to the presented work.\r\nMemory Forensics: The crash utility, which is used to analyze live systems or kernel core dumps, has a bpf\r\nsubcommand that can be used to display information about BPF maps and programs. However, as it is not a\r\nforensics tool it relies solely on the information obtained via the prog_idr and map_idr . Similarly, the drgn\r\nprogrammable debugger comes with a script to list BPF programs and maps but suffers from the same problems\r\nwhen it comes to anti-forensic techniques. Furthermore, drgn and crash are primarily known as debugging\r\ntools for systems developers and as such not necessarily well-established in the digital forensics and incident\r\nresponse (DFIR) community. In contrast, we implemented our analyses as plugins for the popular Volatility\r\nframework, which is well known in the DFIR community. Finally, A. Case and G. Richard presented Volatility plugins for\r\ninvestigating the Linux tracing infrastructure in their BlackHat US 2021 paper . 
Apart from a plugin that lists\r\nprograms by parsing the prog_idr , they have also implemented several plugins that can find BPF programs by\r\nanalyzing the data structures of the attachment mechanisms they use, such as kprobes, tracepoints or perf events.\r\nThus, their plugins are also able to discover inconsistencies that could reveal anti-forensic tampering. However,\r\nthey have never publicly released their plugins and despite several attempts we have been unable to contact the\r\nauthors to obtain a copy of the source code. Volatility already supports detecting BPF programs attached to\r\nsockets in its sockstat plugin. The displayed information is limited to names and IDs.\r\nReverse Engineering: Reverse engineering BPF programs is a key step while triaging the findings of our plugins.\r\nRecently, the Ghidra software reverse engineering (SRE) suite gained support for the BPF architecture , which\r\nmeans that its powerful decompiler can be used to analyze BPF bytecode extracted from kernel memory or user-space programs. Furthermore, BPF bytecode is oftentimes embedded into user-space programs that use framework\r\nlibraries to load it into the kernel at runtime. For programs written in the Go programming language, ebpfkit-monitor can parse the binary format of these embedded files to list the defined programs and maps as well as their\r\ninteractions. It uses this information to generate graphs that are similar to those of our bpf_graph plugin.\r\nAlthough the utility of these graphs has inspired our plugin, it is fundamentally different in that it displays\r\ninformation about the state of the kernel’s BPF subsystem extracted from a memory image. 
Consequently, it is\r\ninherently agnostic to the user-space framework that was used for compiling and loading the programs.\r\nAdditionally, it displays the actual state of the BPF subsystem instead of the BPF objects that might be created by\r\nan executable at runtime.\r\nRuntime Protection and Monitoring: Important aspects of countering BPF malware are preventing attackers\r\nfrom loading malicious BPF programs and logging suspicious events for later review. krie and ebpfkit-monitor are\r\ntools that can be used to log BPF-related events as well as to deny processes access to the BPF system call.\r\nSimply blocking access on a per-process basis is too coarse-grained for many applications and thus multiple\r\napproaches were proposed to implement a more fine-grained access control model for the BPF subsystem to\r\nfacilitate the realization of least privilege policies. Among those, one can further distinguish between proposals\r\nthat implement access control in user space, kernel space, or a hypervisor.\r\nbpfman (formerly known as bpfd) is a privileged user space daemon that acts as a proxy for loading BPF programs\r\nand can be used to implement different access control policies. A combination of a privileged user-space daemon\r\nand kernel changes is used in the proposed BPF token approach that allows delegation of access to specific parts\r\nof the BPF subsystem to container processes by a privileged daemon.\r\nA fine-grained in-kernel access control is offered by the CapBits proposed by Yi He et al. Here, two bitfields are\r\nadded to the task_struct , where one defines the access that a process has to the BPF subsystem, e.g., allowed\r\nprogram types and helpers, and the other restricts the access that BPF programs can have on the process, e.g., to\r\nprevent it from being traced by kprobe programs. Namespaces are already used in many areas of the Linux kernel\r\nto virtualize global resources like PIDs or network devices. Thus, Y. 
Shao proposed introducing BPF namespaces\r\nto limit the scope of loaded programs to processes inside of the namespace. Finally, signatures over programs are a\r\nmechanism that allows the kernel to verify their provenance, which can be used analogously to module signatures\r\nthat prevent attackers from loading malicious kernel modules.\r\nLastly, Y. Wang et al. proposed moving large parts of the BPF VM from the kernel into a hypervisor, where they\r\nimplement a multi-step verification process that includes enforcing a security policy, checking signatures, and\r\nscanning for known malicious programs. In the security policy, allowed programs can be specified as a set of\r\ndeterministic finite automata, which allows for accepting dynamically generated programs without allowing for\r\narbitrary code to be loaded.\r\nAll these approaches are complementary to our plugins as they focus on reducing the chance that an attacker can\r\nsuccessfully load a malicious program, while we assume that this step has already happened and aim to detect\r\nthe program’s presence.\r\nConclusion\r\nIn this post, we gave an introduction to the Linux BPF subsystem and discussed its potential for abuse. We then\r\npresented seven Volatility plugins that allow investigators to detect BPF malware in memory images and evaluated\r\nthem on multiple versions of popular Linux distributions. To conclude the post, we will briefly discuss related\r\nprojects we are working on and plans for future work.\r\nThis project grew out of the preparation of a workshop on BPF rootkits at the DFRWS EU 2023 annual\r\nconference (materials ). We began working on this topic because we believe that the forensic community needs to\r\nexpand its toolbox in response to the rise of BPF in the Linux world to fill blind spots in existing analysis\r\nmethods. 
Additionally, investigators who may encounter BPF in their work should be made aware of the potential\r\nrelevance of the subsystem to their investigation.\r\nWhile the workshop, our plugins, and this post are an important step towards this goal, much work remains to be\r\ndone. First, in order for the present work to be useful in the real world, our next goal must be to upstream most of\r\nit into the Volatility 3 project. Only this will ensure that investigators all around the world will be able to easily\r\nfind and use it. This will require:\r\nRefactoring of our utility code to use Volatility 3’s extension class mechanism\r\nThe bpf_graph plugin relies on networkx, which is not yet a dependency of Volatility 3. If the\r\nintroduction of a new dependency into the upstream project is not feasible, one could make it optional by\r\nchecking for the presence of the package within the plugin.\r\nAdditional testing on older kernel versions and kernels with diverse configurations to meet Volatility’s high\r\nstandards regarding compatibility\r\nWe will be happy to work with upstream developers to make the integration happen.\r\nFurthermore, there remains the problem of dealing with the wide variety of map types when extracting their\r\ncontents, as well as the related problem of pretty-printing them using BTF information. Here, we consider a\r\nmanual implementation approach to be impractical and would explore the possibility of using emulation of the\r\nrelevant functions.\r\nRegarding the advanced analysis aimed at countering anti-forensics, we have also implemented consistency\r\nchecks against the lists of kprobes and tracepoints, but these require further work to be ready for publication. 
We\r\nalso described additional analyses in our workshop that still need to be implemented.\r\nFinally, an interesting side effect of the introduction of BPF into the Linux kernel is that most of its functionality\r\nrequires BTF information for the kernel and modules to be available. This provides an easy solution to the\r\nproblem of obtaining type information from a raw memory image, a step that is central to automatic profile\r\ngeneration. We have already shown that it is possible to reliably extract BTF sections from memory images by\r\nimplementing a plugin for that. We have also explored the possibility of combining this with existing approaches\r\nfor extracting symbol information in order to obtain working profiles from a dump. While the results are\r\npromising, further work is needed to have a usable solution.\r\nAppendix\r\nA: Kernel Configuration\r\nThis section provides a list of compile-time kernel configuration options that can be adjusted to restrict the\r\ncapabilities of BPF programs. In general, it is recommended to disable unused features in order to reduce the\r\nattack surface of a system.\r\nBPF_SYSCALL=n : Disables the BPF system call. Probably breaks most systemd-based systems.\r\nDEBUG_INFO_BTF=n : Disables generation of BTF debug information, i.e., CO-RE no longer works on this\r\nsystem. Forces attackers to compile on/for the system they want to compromise.\r\nBPF_LSM=n : BPF programs cannot be attached to LSM hooks.\r\nLOCK_DOWN_KERNEL_FORCE_INTEGRITY=y : Prohibits the use of bpf_probe_write_user.\r\nNET_CLS_BPF=n and NET_ACT_BPF=n : BPF programs cannot be used as TC classifiers or actions. Stops some\r\ndata exfiltration techniques.\r\nFUNCTION_ERROR_INJECTION=n : Disables the function error injection framework, i.e., BPF programs can\r\nno longer use bpf_override_return.\r\nNETFILTER_XT_MATCH_BPF=n : Disables the option to use BPF programs in nftables rules. 
Could be used to\r\nimplement malicious firewall rules.\r\nBPF_EVENTS=n : Removes the option to attach BPF programs to kprobes, uprobes, and tracepoints.\r\nBelow are options that limit features that we consider less likely to be used by malware.\r\nBPFILTER=n : This is an unfinished BPF-based replacement of iptables/nftables (currently not functional).\r\nLWTUNNEL_BPF=n : Disables the use of BPF programs for routing decisions in lightweight tunnels.\r\nCGROUP_BPF=n : Disables the option to attach BPF programs to cgroups. Cgroup programs can monitor\r\nvarious networking-related events of processes in the group. Probably breaks most systemd-based systems.\r\nSource: https://lolcads.github.io/posts/2023/12/bpf_memory_forensics_with_volatility3/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://lolcads.github.io/posts/2023/12/bpf_memory_forensics_with_volatility3/"
	],
	"report_names": [
		"bpf_memory_forensics_with_volatility3"
	],
	"threat_actors": [
		{
			"id": "8386d4af-5cca-40bb-91d7-aca5d1a0ec99",
			"created_at": "2022-10-25T16:07:23.414558Z",
			"updated_at": "2026-04-10T02:00:04.588816Z",
			"deleted_at": null,
			"main_name": "Bookworm",
			"aliases": [],
			"source_name": "ETDA:Bookworm",
			"tools": [
				"Agent.dhwf",
				"Chymine",
				"Darkmoon",
				"Destroy RAT",
				"DestroyRAT",
				"FF-RAT",
				"FormerFirstRAT",
				"Gen:Trojan.Heur.PT",
				"Kaba",
				"Korplug",
				"PlugX",
				"Poison Ivy",
				"RedDelta",
				"SPIVY",
				"Scieron",
				"Sogu",
				"TIGERPLUG",
				"TVT",
				"Thoper",
				"Xamtrav",
				"ffrat",
				"pivy",
				"poisonivy"
			],
			"source_id": "ETDA",
			"reports": null
		}
	],
	"ts_created_at": 1775434346,
	"ts_updated_at": 1775826710,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/2e83b260a6d2a7500d3b670a03cac212e27ce6b1.pdf",
		"text": "https://archive.orkl.eu/2e83b260a6d2a7500d3b670a03cac212e27ce6b1.txt",
		"img": "https://archive.orkl.eu/2e83b260a6d2a7500d3b670a03cac212e27ce6b1.jpg"
	}
}