Packt+ | Advance your knowledge in tech

You're reading from Learning Linux Binary Analysis

Product typeBook

Published inFeb 2016

Reading LevelIntermediate

PublisherPackt

ISBN-139781782167105

Edition1st Edition

Languages

Tools

Linux

Concepts

Data Analysis

Author (1)

Ryan "elfmaster" O'Neill

Chapter 9. Linux /proc/kcore Analysis

So far, we have covered Linux binaries and memory as it pertains to userland. This book won't be complete, however, if we don't spend a chapter on the Linux kernel. This is because it is actually an ELF binary as well. Similar to how a program is loaded into memory, the Linux kernel image, also known as vmlinux, is loaded into memory at boot time. It has a text segment and a data segment, overlaid with many section headers that are very specific to the kernel, and which you won't see in userland executables. We will also briefly cover LKMs in this chapter, as they are ELF files too.

Linux kernel forensics and rootkits

It is important to learn the layout of the Linux kernel image if you want to be a true master of kernel forensics in Linux. Attackers can modify the kernel memory to create very sophisticated kernel rootkits. There are quite a number of techniques out there for infecting a kernel at runtime. To list a few, we have the following:

A sys_call_table infection
Interrupt handler patching
Function trampolines
Debug register rootkits
Exception table infection
Kprobe instrumentation

The techniques listed here are the primary methods that are most commonly used by a kernel rootkit, which usually infects the kernel in the form of an LKM (short for Loadable Kernel Module). Getting an understanding of each technique and knowing where each infection resides within the Linux kernel and where to look in the memory are paramount to being able to detect this insidious class of Linux malware. Firstly, however, let's take a step back and see what we have to work with. Currently...

stock vmlinux has no symbols

Unless you have compiled your own kernel, you will not have a readily accessible vmlinux, which is an ELF executable. Instead, you will have a compressed kernel in /boot, usually named vmlinuz-<kernel_version>. This compressed kernel image can be decompressed, but the result is a kernel executable that has no symbol table. This poses a problem for forensics analysts or kernel debugging with GDB. The solution for most people in this case is to hope that their Linux distribution has a special package with their kernel version having debug symbols. If so, then they can download a copy of their kernel that has symbols from the distribution repository. In many cases, however, this is not possible, or not convenient for one reason or another. Nonetheless, this problem can be remedied with a custom utility that I designed and released in 2014. This tool is called kdress, because it dresses the kernel symbol table.

Actually, it is named after an old tool by Michael...

/proc/kcore and GDB exploration

The /proc/kcore technique is an interface for accessing kernel memory, and is conveniently in the form of an ELF core file that can be easily navigated with GDB.

Using GDB with /proc/kcore is a priceless technique that can be expanded to very in-depth forensics for the skilled analyst. Here is a brief example that shows how to navigate sys_call_table.

An example of navigating sys_call_table

$ sudo gdb -q vmlinux /proc/kcore
Reading symbols from vmlinux...
[New process 1]
Core was generated by `BOOT_IMAGE=/vmlinuz-3.16.0-49-generic root=/dev/mapper/ubuntu--vg-root ro quiet'.
#0  0x0000000000000000 in ?? ()
(gdb) print &sys_call_table
$1 = (<data variable, no debug info> *) 0xffffffff81801460 <sys_call_table>
(gdb) x/gx &sys_call_table
0xffffffff81801460 <sys_call_table>:  0xffffffff811d5260
(gdb) x/5i 0xffffffff811d5260
   0xffffffff811d5260 <sys_read>:  data32 data32 data32 xchg %ax,%ax
   0xffffffff811d5265 <sys_read+5>...

Direct sys_call_table modifications

Traditional kernel rootkits, such as adore and phalanx, worked by overwriting pointers in sys_call_table so that they would point to a replacement function, which would then call the original syscall as needed. This was accomplished by either an LKM or a program that modified the kernel through /dev/kmem or /dev/mem. On today's Linux systems, for security reasons, these writable windows into memory are disabled or are no longer capable of anything but read operations depending on how the kernel is configured. There have been other ways of trying to prevent this type of infection, such as marking sys_call_table as const so that it is stored in the .rodata section of the text segment. This can be bypassed by marking the corresponding PTE (short for Page Table Entry) as writeable, or by disabling the write-protect bit in the cr0 register. Therefore, this type of infection is a very reliable way to make a rootkit even today, but it is also very easily detected...

Kprobe rootkits

This particular type of kernel rootkit was originally conceived and described in great detail in a 2010 Phrack paper that I wrote. The paper can be found at http://phrack.org/issues/67/6.html.

This type of kernel rootkit is one of the more exotic brands in that it uses the Linux kernels Kprobe debugging hooks to set breakpoints on the target kernel function that the rootkit is attempting to modify. This particular technique has its limitations, but it can be quite powerful and stealthy. However, just like any of the other techniques, if the analyst knows what to look for, then the kernel rootkits that use kprobes can be quite easy to detect.

Detecting kprobe rootkits

Detecting the presence of kprobes by analyzing memory is quite easy. When a regular kprobe is set, a breakpoint is placed on either the entry point of a function (see jprobes) or on an arbitrary instruction. This is extremely easy to detect by scanning the entire code segment looking for breakpoints, as there is...

Debug register rootkits – DRR

This type of kernel rootkit uses the Intel Debug registers as a means to hijack the control flow. A great Phrack paper was written by halfdead on this technique. It is available here:

http://phrack.org/issues/65/8.html.

This technique is often hailed as ultra-stealth because it requires no modification of sys_call_table. Once again, however, there are ways of detecting this type of infection as well.

Detecting DRR

In many rootkit implementations, sys_call_table and other common infection points do go unmodified, but the int1 handler does not. The call instruction to the do_debug function gets patched to call an alternative do_debug function, as shown in the phrack paper linked earlier. Therefore, detecting this type of rootkit is often as simple as disassembling the int1 handler and looking at the offset of the call do_debug instruction, as follows:

target_address = address_of_call + offset + 5

If target_address has the same value as the do_debug address found in...

VFS layer rootkits

Another classic and powerful method of infecting the kernel is by infecting the kernel's VFS layer. This technique is wonderful and quite stealthy since it technically modifies the data segment in the memory and not the text segment, where discrepancies are easier to detect. The VFS layer is very object-oriented and contains a variety of structs with function pointers. These function pointers are filesystem operations such as open, read, write, readdir, and so on. If an attacker can patch these function pointers, then they can take control of these operations in any way that they see fit.

Detecting VFS layer rootkits

There are probably several techniques out there for detecting this type of infection. The general idea, however, is to validate the function pointer addresses and confirm that they are pointing to the expected functions. In most cases, these should be pointing to functions within the kernel and not to functions that exist in LKMs. One quick approach to detecting...

Other kernel infection techniques

There are other techniques available for hackers for the purpose of infecting the Linux kernel (we have not discussed these in this chapter), such as hijacking the Linux page fault handler (http://phrack.org/issues/61/7.html). Many of these techniques can be detected by looking for modifications to the text segment, which is a detection approach that we will examine further in the next sections.

vmlinux and .altinstructions patching

In my opinion, the single most effective method of rootkit detection can be summed up by verifying the code integrity of the kernel in the memory—in other words, comparing the code in the kernel memory against the expected code. But what can we compare kernel memory code against? Well, why not vmlinux? This was an approach that I originally explored in 2008. Knowing that an ELF executable's text segment does not change from disk to memory, unless it's some weird self-modifying binary, which the kernel is not… or is it? I quickly ran into trouble and was finding all sorts of code discrepancies between the kernel memory text segment and the vmlinux text segment. This was baffling at first since I had no kernel rootkits installed during these tests. After examining some of the ELF sections in vmlinux, however, I quickly saw some areas that caught my attention:

$ readelf -S vmlinux | grep alt
  [23] .altinstructions  PROGBITS         ffffffff81e64528  01264528...

Using taskverse to see hidden processes

In the Linux kernel, there are a several ways to modify the kernel so that process hiding can work. Since this chapter is not meant to be an exegesis on all kernel rootkits, I will cover only the most commonly used method and then propose a way of detecting it, which is implemented in the taskverse program I made available in 2014.

In Linux, the process IDs are stored as directories within the /proc filesystem; each directory contains a plethora of information about the process. The /bin/ps program does a directory listing in /proc to see which pids are currently running on the system. A directory listing in Linux (such as with ps or ls) uses the sys_getdents64 system call and the filldir64 kernel function. Many kernel rootkits hijack one of these functions (depending on the kernel version) and then insert some code that skips over the directory entry containing the d_name of the hidden process. As a result, the /bin/ps program is unable to find the...

Infected LKMs – kernel drivers

So far, we have covered various types of kernel rootkit infections in memory, but I think that this chapter begs a section dedicated to explaining how kernel drivers can be infected by attackers, and how to go about detecting these infections.

Method 1 for infecting LKM files – symbol hijacking

LKMs are ELF objects. To be more specific, they are ET_REL files (object files). Since they are effectively just relocatable code, the ways to infect them, such as hijacking functions, are more limited. Fortunately, there are some kernel-specific mechanisms that take place during the load time of the ELF kernel object, the process of relocating functions within the LKM, that makes infecting them quite easy. The entire method and reasons for it working are described in this wonderful phrack paper at http://phrack.org/issues/68/11.html, but the general idea is simple:

Inject or link in the parasite code to the kernel module.
Change the symbol value of init_module() to have...

Notes on /dev/kmem and /dev/mem

In the good old days, hackers were able to modify the kernel using the /dev/kmem device file. This file, which gave programmers a raw portal to the kernel memory, was eventually subject to various security patches and removed from many distributions. However, some distros still have it available to read from, which can be a powerful tool for detecting kernel malware, but it is not necessary as long as /proc/kcore is available. Some of the best work ever written on patching the Linux kernel was conceived by Silvio Cesare, which can be seen in his early writings from 1998 and can be found on vxheaven or on this link:

Runtime kernel kmem patching: http://althing.cs.dartmouth.edu/local/vsc07.html

/dev/mem

There have been a number of kernel rootkits that used /dev/mem, namely phalanx and phalanx2, written by Rebel. This device has also undergone a number of security patches. Currently, it is present on all systems for backwards compatibility, but only the first 1 MB of memory is accessible, primarily for legacy tools used by X Windows.

FreeBSD /dev/kmem

On some OSes such as FreeBSD, the /dev/kmem device is still available and is writable by default. There is even an API specifically designed for accessing it, and there's a book called Writing BSD rootkits that demonstrates its abilities.

K-ecfs – kernel ECFS

In the previous chapter, we discussed the ECFS (short for Extended Core File Snapshot) technology. It is worth mentioning near the end of this chapter that I have worked out some code for a kernel-ecfs, which merges vmlinux and /proc/kcore into a kernel-ecfs file. The result is essentially a file similar to /proc/kcore, but one that also has section headers and symbols. In this way, an analyst can easily access any part of the kernel, LKMs, and kernel memory (such as the "vmalloc'd" memory). This code will eventually become publicly available.

A sneak peek of the kernel-ecfs file

Here, we are demonstrating how /proc/kcore has been snapshotted into a file called kcore.img and given a set of ELF section headers:

# ./kcore_ecfs kcore.img

# readelf -S kcore.img
here are 6 section headers, starting at offset 0x60404afc:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...

Kernel hacking goodies

The Linux kernel is a vast topic with regards to forensic analysis and reverse engineering. There are many exciting ways to go about instrumenting the kernel for purposes of hacking, reversing, and debugging, and Linux offers its users many entry points into these areas. I have discussed some files and APIs that are useful throughout this chapter, but I will also give a small, condensed list of things that may be of help in your research.

General reverse engineering and debugging

/proc/kcore
/proc/kallsyms
/boot/System.map
/dev/mem (deprecated)
/dev/kmem (deprecated)
GNU debugger (used with kcore)

Advanced kernel hacking/debugging interfaces

Kprobes
Ftrace

Papers mentioned in this chapter

Kprobe instrumentation: http://phrack.org/issues/67/6.html
Runtime kernel kmem patching: http://althing.cs.dartmouth.edu/local/vsc07.html
LKM infection: http://phrack.org/issues/68/11.html
Special sections in Linux binaries: https://lwn.net/Articles/531148/
Kernel Voodoo: http://www.bitlackeys...

Summary

In this final chapter of this book, we stepped out of userland binaries and took a general look at what types of ELF binaries are used in the kernel, and how to utilize them with GDB and /proc/kcore for memory analysis and forensics purposes. We also explained some of the most common Linux kernel rootkit techniques that are used and what methods can be applied to detect them. This small chapter serves only as a primary resource for understanding the fundamentals, but we just listed some excellent resources so that you can continue to expand your knowledge in this area.

The rest of the chapter is locked

You have been reading a chapter from

Learning Linux Binary Analysis

Published in: Feb 2016Publisher: PacktISBN-13: 9781782167105

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Author (1)

Ryan "elfmaster" O'Neill

Ryan "elfmaster" O'Neill is a computer security researcher and software engineer with a background in reverse engineering, software exploitation, security defense, and forensics technologies. He grew up in the computer hacker subculture, the world of EFnet, BBS systems, and remote buffer overflows on systems with an executable stack. He was introduced to system security, exploitation, and virus writing at a young age. His great passion for computer hacking has evolved into a love for software development and professional security research. Ryan has spoken at various computer security conferences, including DEFCON and RuxCon, and also conducts a 2-day ELF binary hacking workshop. He has an extremely fulfilling career and has worked at great companies such as Pikewerks, Leviathan Security Group, and more recently Backtrace as a software engineer. Ryan has not published any other books, but he is well known for some of his papers published in online journals such as Phrack and VXHeaven. Many of his other publications can be found on his website at http://www.bitlackeys.org.
Read more about Ryan "elfmaster" O'Neill

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages