Kernel Memory Allocation for Module Authors – Part 2

The previous chapter covered the basics (and a lot more!) of using the available APIs for memory allocation via both the page allocator (PA), also known as the Buddy System Allocator (BSA), and the slab allocators within the kernel. In this chapter, we will delve further into this large and interesting topic. We will cover the creation of custom slab caches, the vmalloc interfaces, and, very importantly, given the wealth of choice, which APIs to use in which situation. We shall then explore some key kernel internals regarding memory reclamation, the dreaded Out of Memory (OOM) killer, and demand paging.

These areas tend to be important to understand when working with kernel modules, especially with device drivers. A process on a Linux system suddenly dying with merely a Killed message on the console requires some explanation, yes!? The OOM killer's likely the sweet chap behind this...

Briefly, within this chapter...

Technical requirements

I assume that you have gone through the online chapter, Kernel Workspace Setup, and have appropriately prepared a guest VM (or a native system) running Ubuntu 22.04 LTS (or a later stable release) and installed all the required packages. If not, I highly recommend you do this first.

Also, the last section of this chapter deliberately runs a very memory-intensive app, so intensive that the kernel will take some drastic action (killing off some process(es))! I highly recommend you try out stuff like this on a safe, isolated system, preferably a Linux test VM (with no important data on it).

To get the most out of this book, I strongly recommend you first set up the workspace environment, including cloning this book’s GitHub repository for the code, and work on it in a hands-on fashion. The GitHub repository can be found at https://github.com/PacktPublishing/Linux-Kernel-Programming_2E.

Creating a custom slab cache

As explained in detail in the previous chapter, a key design concept behind slab caches is the powerful idea of object caching. By caching frequently used objects (data structures, really), the memory allocation/free work for those objects is much quicker, and thus overall performance receives a boost.

So, think about this: what if we’re writing a driver and within it, we notice that a certain data structure (an object) is very frequently allocated and freed? Normally, we would use the usual kzalloc() (or kmalloc()) followed by the kfree() APIs to allocate and free this object. Some good news: the Linux kernel sufficiently exposes the slab layer API to us, the module/driver authors, allowing us to create custom slab caches. In this section, you’ll learn how you can leverage this powerful feature.
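
For instance, here is a minimal sketch (not the full demo module developed in the following section) of how a driver might set up and use its own slab cache; the object struct my_ctx, the cache name, and the constructor are purely illustrative:

#include <linux/init.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/types.h>

/* A hypothetical driver object that's allocated and freed very frequently */
struct my_ctx {
	u32 id;
	char name[32];
};

static struct kmem_cache *my_cachep;

/* Optional constructor: run when the cache acquires fresh pages of objects,
 * not on every kmem_cache_alloc() */
static void my_ctx_ctor(void *obj)
{
	memset(obj, 0, sizeof(struct my_ctx));
}

static int __init my_cache_init(void)
{
	my_cachep = kmem_cache_create("my_ctx_cache", sizeof(struct my_ctx),
				      0 /* align */, SLAB_HWCACHE_ALIGN,
				      my_ctx_ctor);
	return my_cachep ? 0 : -ENOMEM;
}

static void my_cache_exit(void)
{
	kmem_cache_destroy(my_cachep);
}

/* In the driver's hot path: allocate and free objects from our custom cache */
static struct my_ctx *get_ctx(void)
{
	return kmem_cache_alloc(my_cachep, GFP_KERNEL);
}

static void put_ctx(struct my_ctx *ctx)
{
	kmem_cache_free(my_cachep, ctx);
}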

Creating and using a custom slab cache within a kernel module

In this section, we will create, use, and subsequently...

Debugging kernel memory issues – a quick mention

Memory corruption is, unfortunately, a very common root cause of bugs, and being able to debug it is a key skill. That said (and with apologies), we don't cover kernel memory debugging in this book, for two primary reasons:

  1. This isn’t a book on kernel debugging.
  2. Topics on kernel debugging have been covered in depth in my recent (up to date as of the 5.10 LTS kernel) Linux Kernel Debugging (LKD) book, including two whole chapters of very detailed coverage on debugging kernel memory issues.

Nevertheless, in this book, I consider it my duty to at least mention the various tools and approaches one typically employs when debugging kernel memory issues. You would do well to gain familiarity with the powerful dynamic (runtime) analysis frameworks/tools mentioned here (a tiny example of the sort of bug these tools catch follows the list):

  • The Sanitizer toolset:
    • KASAN (the Kernel Address Sanitizer): Available for x86_64 and...
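
To give a feel for what these tools catch at runtime, here is a tiny, purely illustrative kernel-side snippet (not from this book's code base) containing the classic out-of-bounds write that KASAN flags, assuming the kernel was built with KASAN enabled:

#include <linux/slab.h>

static void oob_write_demo(void)
{
	char *buf = kmalloc(8, GFP_KERNEL);

	if (!buf)
		return;
	buf[8] = 'x';	/* Bug: index 8 is one byte past the 8-byte request;
			 * with KASAN enabled, this access triggers a
			 * slab-out-of-bounds report at runtime */
	kfree(buf);
}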

Understanding and using the kernel vmalloc() API

As we learned in the previous chapter, ultimately there is just one engine for memory allocation within the kernel – the page (or buddy system) allocator. Layered on top of it is the slab allocator (or slab cache) machinery. In addition to these two layers, there is another, completely virtual, region within the kernel's virtual address space from which virtual pages can be allocated at will: this is called the kernel vmalloc region.
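
As a quick preview of the interface itself, here is a minimal sketch of allocating and freeing memory from this region within a module; the 5 MB size is purely illustrative:

#include <linux/module.h>
#include <linux/vmalloc.h>

static void *vptr;

static int __init vm_demo_init(void)
{
	/* Allocate 5 MB of virtually contiguous (though not necessarily
	 * physically contiguous) memory from the kernel vmalloc region */
	vptr = vmalloc(5 * 1024 * 1024);
	if (!vptr)
		return -ENOMEM;
	pr_info("vmalloc'ed 5 MB starting at %px\n", vptr); /* %px: debug use only */
	return 0;
}

static void __exit vm_demo_exit(void)
{
	vfree(vptr);
}

module_init(vm_demo_init);
module_exit(vm_demo_exit);
MODULE_LICENSE("GPL");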

Within the kernel segment or VAS is the vmalloc address space, aka the vmalloc region, extending from VMALLOC_START to VMALLOC_END-1 (the precise addresses and the space available are arch-dependent; incidentally, we covered all this in some detail in Chapter 7, Memory Management Internals – Essentials, under the Examining the kernel segment section). It's a completely virtual region to begin with, that is, its virtual pages are initially not mapped to any physical...

Memory allocation in the kernel – which APIs to use when

A really quick summary of what we have learned so far: the kernel’s underlying engine for memory allocation (and freeing) is called the page (or buddy system) allocator. Ultimately, every single memory allocation (and subsequent free) goes through this layer. It has its share of problems, though, the chief one being internal fragmentation or wastage (due to its minimum granularity being a page or multiple pages). Thus, we have the slab allocator (or slab cache) layered above it, providing the power of object caching and caching fragments of a page (helping alleviate the page allocator’s wastage issues). Also, don’t forget that you can create your own custom slab caches. Further, the kernel has a vmalloc region and APIs to allocate large virtual memory swathes from within it.
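
Purely as an illustration of how these choices typically differ in code (the detailed decision chart and table follow in this section), here is a hedged sketch; the sizes and use cases are made up for the example:

#include <linux/slab.h>
#include <linux/vmalloc.h>

static void which_api_demo(void)
{
	void *small_buf, *big_buf;

	/* 1. Small, physically contiguous memory (bytes up to a few pages):
	 *    the k[mz]alloc() slab APIs are the usual choice */
	small_buf = kzalloc(4096, GFP_KERNEL);

	/* 2. Large allocations where only virtual contiguity matters
	 *    (several MB, say): vmalloc()/vzalloc() */
	big_buf = vzalloc(8 * 1024 * 1024);

	/* 3. A fixed-size object allocated/freed very frequently: a custom
	 *    slab cache (kmem_cache_create() and friends), as seen earlier */

	kfree(small_buf);	/* kfree(NULL) is a safe no-op */
	vfree(big_buf);		/* as is vfree(NULL) */
}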

With this information in mind, let’s move along. To understand which API to use when, let’s first look at...

Memory reclaim – a key kernel housekeeping task

As you will be aware, the kernel tries, for optimal performance, to keep the working set of memory pages as high up as possible in the memory pyramid (or hierarchy).

The so-called memory pyramid (or memory hierarchy) on a system consists of (in order, from smallest size but fastest speed to largest size but slowest speed): CPU registers, CPU caches (L1, L2, L3, ...), RAM, and swap (raw disk/flash/SSD partition). In the following discussion, we ignore CPU registers as their size is minuscule.

In a modern processor, as code executes and data is worked upon, the processor uses its hardware caches (L1, L2, and so on) to hold the current working set of pages within its multilevel CPU instruction and data caches. But of course, CPU cache memory is very limited, thus it will soon run out, causing the memory to spill over into the next hierarchical level – RAM. On modern systems, even many embedded ones, there...

Stayin’ alive – the OOM killer

Now that we’ve covered background details regarding kernel memory management, particularly the reclaiming of free memory, you’re well placed to understand what the Out of Memory (OOM) killer kernel component is, how to work with it, even how to deliberately invoke it and, to an extent, control it.
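
To set the stage, here is a minimal userspace sketch (not the book's actual test program) of the kind of memory hog that, run on a disposable test VM, will eventually have the OOM killer terminate it with a terse Killed message:

/* hog.c: keep allocating and *touching* memory until the kernel intervenes.
 * Run only on a safe, disposable test VM! Build: gcc -O2 hog.c -o hog */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (64 * 1024 * 1024UL)	/* 64 MB per iteration */

int main(void)
{
	unsigned long total_mb = 0;

	while (1) {
		void *p = malloc(CHUNK);
		if (!p)
			break;		/* with overcommit, malloc() itself rarely fails... */
		memset(p, 0xAB, CHUNK);	/* ...touching the pages is what actually
					 * consumes RAM and then swap */
		total_mb += CHUNK >> 20;
		printf("allocated and touched %lu MB\n", total_mb);
	}
	printf("malloc() finally failed; pausing\n");
	pause();
	return 0;
}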

Let’s revisit a key point, that of memory (RAM and swap) running short. Let’s play devil’s advocate: what if RAM runs low and all this memory reclamation work (which we just covered in the previous section) simply doesn’t help, and memory pressure keeps increasing to the point where the complete memory pyramid is exhausted, where a kernel allocation of even a few pages fails (or infinitely retries, which, frankly, is just as useless, perhaps worse)? In other words, what if all CPU caches, RAM, and swap are (almost completely) full!? Well, most systems just die at this point (actually, they don’...

Summary

In this chapter, we continued where we left off in the previous chapter. We covered, in a good amount of detail, how you can create and use your own custom slab caches (useful when your driver or module very frequently allocates and frees a certain data structure). We then provided an overview of available kernel debug techniques for debugging memory issues. Next, we learned about and used the kernel vmalloc() API (and friends). With the wealth of memory APIs available, how do you select which one to use in a given situation? We covered this important concern with a useful decision chart and table. We then delved into an understanding of the kernel page reclaim procedures; this discussion covered the zone watermarks, the kswapd kernel thread(s), the new MG-LRU lists, and the DAMON data access monitoring technology.

We then went into what exactly the kernel’s dreaded OOM killer (and the systemd-oomd daemon) component is and how to work with it.

As I have mentioned...

Questions

As we conclude, here is a list of questions for you to test your knowledge regarding this chapter’s material: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E/blob/main/questions/ch9_qs_assignments.txt. You will find some of the questions answered in the book’s GitHub repo: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E/tree/master/solutions_to_assgn.

Further reading

To help you delve deeper into the subject with useful materials, we provide a rather detailed list of online references and links (and, at times, even books) in a Further reading document in this book’s GitHub repository. The Further reading document is available here: https://github.com/PacktPublishing/Linux-Kernel-Programming/blob/master/Further_Reading.md.
