Linux Kernel Programming - Second Edition

Product type: Book
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781803232225
Edition: 2nd Edition
Author: Kaiwan N. Billimoria

Kaiwan N. Billimoria taught himself BASIC programming on his dad's IBM PC back in 1983. He was programming in C and Assembly on DOS until he discovered the joys of Unix, and by around 1997, Linux! Kaiwan has worked on many aspects of the Linux system programming stack, including Bash scripting, system programming in C, kernel internals, device drivers, and embedded Linux work. He has actively worked on several commercial/FOSS projects. His contributions include drivers to the mainline Linux OS and many smaller projects hosted on GitHub. His Linux passion feeds well into his passion for teaching these topics to engineers, which he has done for well over two decades now. He's also the author of Hands-On System Programming with Linux, Linux Kernel Programming (and its Part 2 book) and Linux Kernel Debugging. It doesn't hurt that he is a recreational ultrarunner too.

Preface

This book, in its second edition now, has been explicitly written with a view to helping you learn Linux kernel development in a practical, hands-on fashion, along with the necessary theoretical background to give you a well-rounded view of this vast and interesting topic area. It deliberately focuses on kernel development via the powerful Loadable Kernel Module (LKM) framework; this is because the vast majority of real-world/industry kernel projects and products, which includes device driver development, are done in this manner.

The focus is kept on both working hands-on with, and understanding at a sufficiently deep level, the internals of the Linux OS. In this regard, we cover everything from building the Linux kernel from source to understanding and working with complex topics such as synchronization within the kernel.

To guide you on this exciting journey, we divide this book into three sections. The first section covers the basics – setting up an appropriate workspace for kernel development, building the modern kernel from source, and writing your first kernel module.

The next section, a key one, will help you understand essential kernel internals details; its coverage includes the Linux kernel architecture, the task structure, user- and kernel-mode stacks, and memory management. Memory management is a key and interesting topic – we devote three whole chapters to it (covering the internals to a sufficient extent, and importantly, how exactly to efficiently allocate and free kernel memory). The internal working and deeper details of CPU (task) scheduling on the Linux OS round off this section.

The last section of the book deals with the more advanced topic of kernel synchronization – a necessity for professional design and code on the Linux kernel. We devote two whole chapters to covering key topics here.

The book uses the kernel community’s 6.1 Long Term Support (LTS) Linux kernel. It’s a kernel that will be maintained (both bug and security fixes) from December 2022 right through to December 2026. Moreover, the CIP (Civil Infrastructure Platform) has adopted 6.1 as an SLTS (Super LTS) release and plans to maintain it for 10 years, until August 2033! This is a key point, ensuring that this book’s content remains current and valid for years to come!

We very much believe in a hands-on approach: some 40 kernel modules (besides several user apps and shell scripts, double that of the first edition!) in this book’s GitHub repository make the learning come alive, making it fun, practical, interesting, and useful.

We highly recommend you also make use of this book’s companion guide, Linux Kernel Programming Part 2 – Char Device Drivers and Kernel Synchronization: Create user-kernel interfaces, work with peripheral I/O, and handle hardware interrupts. It’s an excellent industry-aligned beginner’s guide to writing misc character drivers, performing I/O on peripheral chip memory, and handling hardware interrupts. You can get this book for free along with your print copy; alternatively, you can find the eBook in the GitHub repository at https://github.com/PacktPublishing/Linux-Kernel-Programming/tree/master/Linux-Kernel-Programming-(Part-2).

We really hope you learn from and enjoy this book. Happy reading!

Who this book is for

This book is primarily for those of you beginning your journey in the vast arena of learning and understanding modern Linux kernel architecture and internals, Linux kernel module development and, to some extent, Linux device driver development. It’s also very much targeted at those of you who have already been working on Linux modules and/or drivers, who wish to gain a much deeper, well-structured understanding of Linux kernel architecture, memory management, task scheduling, cgroups, and synchronization. This level of knowledge about the underlying OS, covered in a properly structured manner, will help you no end when you face difficult-to-debug real-world situations.

What’s been added in the second edition?

A large amount of new material has been added to this, the Second Edition of the Linux Kernel Programming book. Also, being based on the very recent (as of this writing) 6.1 LTS release, its information and even code will remain industry-relevant for many, many years to come.

Here’s a quick chapter-wise summary of what’s new in this second edition:

  • Materials updated for the 6.1 LTS kernel, maintained until December 2026, and until August 2033 via the CIP (6.1 SLTS)!
  • Updated, new, and working code for the 6.1 LTS kernel
  • Several new info-rich sections added to most chapters, many new diagrams, and new code examples to help explain concepts better
  • Chapter 1, Linux Kernel Programming – A Quick Introduction
    • Introduction to the book
  • Chapter 2, Building the 6.x Linux Kernel from Source – Part 1
    • The new LTS kernel lifetime mandate
    • More details on the kernel’s Kconfig+Kbuild system
    • Updated approaches on configuring the kernel
  • Chapter 3, Building the 6.x Linux Kernel from Source – Part 2
    • More details on the initramfs (initrd) image
    • Cross-compiling the kernel on an x86_64 host to an AArch64 target
  • Chapter 4, Writing Your First Kernel Module – Part 1
    • New-ish printk indexing feature covered
    • Powerful kernel dynamic debug feature introduced
    • Rate-limiting macros updated (deprecated ones not used)
  • Chapter 5, Writing Your First Kernel Module – Part 2
    • A better, ‘better’ Makefile (v0.2)
  • Chapter 6, Kernel Internals Essentials – Processes and Threads
    • New linked list demo module
  • Chapter 7, Memory Management Internals – Essentials
    • New coverage on how address translation works (including diagrams)
  • Chapter 8, Kernel Memory Allocation for Module Authors – Part 1
    • Coverage on using the “exact” page allocator API pair
    • FAQs regarding (slab) memory usage and their answers
    • The graphing demo (via gnuplot) is now automated and even saved to an image file, via a helper script
    • Finding internal fragmentation (wastage) within the kernel
  • Chapter 9, Kernel Memory Allocation for Module Authors – Part 2
    • Extracting useful information regarding slab caches
    • A word on slab shrinkers
    • Better coverage on the OOM killer (and systemd-oomd) and how it’s triggered; includes a flowchart depicting demand-paging and possible OOM killer invocation
    • Better coverage on kernel page reclaim, as well as the new MGLRU and DAMON technologies
  • Chapter 10, The CPU Scheduler – Part 1
    • New coverage on CFS scheduling period and timeslice. Coverage on the thread_info structure as well
    • New: the preempt dynamic feature
    • Enhanced coverage on exactly how and when schedule() is invoked
  • Chapter 11, The CPU Scheduler – Part 2
    • Much more depth in the powerful cgroups (v2) coverage plus an interesting script to let you explore its content
    • Leveraging the cgroups v2 CPU controller via both systemd and manually to perform CPU bandwidth allocation
    • A note on Google’s ghOSt OS
  • Chapter 12, Kernel Synchronization – Part 1
    • A new intro to the LKMM (Linux Kernel Memory Model)
    • More on locking plus deadlock avoidance guidelines
  • Chapter 13, Kernel Synchronization – Part 2
    • Expanded coverage on CPU caching and cache effects
    • New coverage on the powerful lock-free RCU synchronization technology
  • Online Chapter, Kernel Workspace Setup
    • Fixed errors in package names and versions
    • Ubuntu-based helper script that auto-installs all required packages

Most (if not all) earlier code errors, typos, and URLs are now fixed, based on prompt feedback, raising Issues/PRs on the book’s GitHub repo, from you, our wonderful readers!

What this book covers

Chapter 1, Linux Kernel Programming – A Quick Introduction, briefs you on the exciting journey through the sections of the book, which cover everything from building the Linux kernel from source to understanding and working with complex topics such as synchronization within the kernel.

Chapter 2, Building the 6.x Linux Kernel from Source – Part 1, is the first part of explaining how to build the modern Linux kernel from scratch with its source code. In this part, you will first be given necessary background information – the kernel version nomenclature, the different source trees, and the layout of the kernel source. Next, you will be shown in detail how exactly to download a stable vanilla Linux kernel source tree onto your Linux VM (Virtual Machine). We shall then learn a little regarding the layout of the kernel source code, getting, in effect, a “10,000-foot view” of the kernel code base. The actual work of extracting and configuring the Linux kernel then follows. Creating and using a custom menu entry for kernel configuration is also explained in detail.

Chapter 3, Building the 6.x Linux Kernel from Source – Part 2, is the second part on performing kernel builds from source code. In this part, you will continue from the previous chapter, now actually building the kernel, installing kernel modules, understanding what exactly the initramfs (initrd) image is and how to generate it, and setting up the bootloader (for the x86_64). Also, as a valuable add-on, this chapter then explains how to cross-compile the kernel for a typical embedded ARM target (using the popular Raspberry Pi 4 64-bit as a target device). Several tips and tricks on kernel builds, and even kernel security (hardening) tips, are detailed.

Chapter 4, Writing Your First Kernel Module – Part 1, is the first of two parts that cover a fundamental aspect of Linux kernel development – the LKM framework and how it is to be understood and used by you, the “module user,” the kernel module or device driver programmer. It covers the basics of the Linux kernel architecture and then, in great detail, every step involved in writing a simple “Hello, world” kernel module, compiling, inserting, checking, and removing it from kernel space.

We also cover kernel logging via the ubiquitous printk API in detail. This edition also covers printk indexing and introduces the powerful dynamic debug feature.

Chapter 5, Writing Your First Kernel Module – Part 2, is the second part that covers the LKM framework. Here, we begin with something critical – learning how to use a “better” Makefile, which will help you generate more robust code. (This so-called “better” Makefile helps by having several targets for code-checking, code-style correction, static analysis, and so on; this edition has a superior version of it.) We then show in detail the steps to successfully cross-compile a kernel module for an alternate architecture, how to emulate “library-like” code in the kernel (via both the linking and module-stacking approaches), and how to pass parameters to your kernel module. Additional topics include how to perform auto-loading of modules at boot, important security guidelines, and some information on the kernel coding style and upstream contribution. Several example kernel modules make the learning more interesting.

Chapter 6, Kernel Internals Essentials – Processes and Threads, delves into some essential kernel internals topics. We begin with what is meant by the execution of kernel code in process and interrupt contexts, and minimal but required coverage of the process user virtual address space (VAS) layout. This sets the stage for you; you’ll then learn about Linux kernel architecture in more depth, focusing on the organization of process/thread task structures and their corresponding stacks – user- and kernel-mode. We then show you more on the kernel task structure (a “root” data structure), how to practically glean information from it (via the powerful ‘current’ macro), and even how to iterate over various (task) lists (there’s sample code too!). Several kernel modules make the topic come alive.
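The list-iteration idea can be tried out in user space before touching kernel code. The following is a minimal sketch, under simplified assumptions, of the intrusive circular linked list pattern: the link node is embedded inside the payload structure, and offset arithmetic recovers the payload from the link. All names here (struct lnode, struct task_info, CONTAINER_OF, sum_pids) are hypothetical stand-ins for illustration; they are not the kernel's own list API or task structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for an embedded list node. */
struct lnode {
    struct lnode *next, *prev;
};

/* Recover a pointer to the enclosing structure from a pointer to its member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical payload; the real task structure is far larger. */
struct task_info {
    int pid;
    struct lnode link;   /* embedded link node */
};

/* An empty circular list has its head pointing at itself. */
static void lnode_init(struct lnode *head)
{
    assert(head != NULL);
    head->next = head;
    head->prev = head;
}

/* Insert a node just before the head, that is, at the tail of the list. */
static void lnode_add_tail(struct lnode *node, struct lnode *head)
{
    assert(node != NULL && head != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the list and add up the pid of every entry. */
static int sum_pids(const struct lnode *head)
{
    int sum = 0;

    for (const struct lnode *p = head->next; p != head; p = p->next)
        sum += CONTAINER_OF(p, struct task_info, link)->pid;
    return sum;
}
```

Initializing a head with lnode_init(), appending three entries with lnode_add_tail(), and calling sum_pids() walks every entry exactly once. Embedding the link in the payload means one set of list routines can serve any structure type.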

Chapter 7, Memory Management Internals – Essentials, a key chapter, delves into essential internals of the Linux memory management subsystem, to the level of detail required for the typical module author or driver developer. This coverage is thus necessarily more theoretical in nature; nevertheless, the knowledge gained here is crucial to you, the kernel developer, both for deep understanding and usage of appropriate kernel memory APIs, as well as for performing meaningful debugging at the level of the kernel. We cover the VM split (and how it’s defined on various actual architectures), gaining deep insight into the user VAS as well as the kernel VAS (our procmap utility will prove to be an eye-opener here!). We also cover more on how address translation works. We then briefly delve into the security technique of memory layout randomization ([K]ASLR), and end this chapter with a discussion on physical memory organization within Linux.
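To make the address translation idea concrete, here is a small user-space sketch that splits a virtual address into its page-table indices. It assumes one specific layout, x86_64 with 4-level paging and 4 KiB pages (four 9-bit indices plus a 12-bit page offset); other architectures and configurations differ. The structure and function names (struct va_parts, split_va) are hypothetical, and the field names merely borrow the usual PGD/PUD/PMD/PTE labels.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: x86_64, 4-level paging, 4 KiB pages (9+9+9+9+12 bits). */
#define PAGE_SHIFT_4K   12
#define INDEX_BITS      9
#define INDEX_MASK      ((1u << INDEX_BITS) - 1)

/* Four index levels plus the offset cover 48 bits of virtual address. */
static_assert(PAGE_SHIFT_4K + 4 * INDEX_BITS == 48, "expected a 48-bit VA");

struct va_parts {
    unsigned pgd, pud, pmd, pte;  /* table index at each level */
    unsigned offset;              /* byte offset within the page */
};

/* Slice a virtual address into per-level table indices and a page offset. */
static struct va_parts split_va(uint64_t va)
{
    struct va_parts p;

    p.offset = (unsigned)(va & ((1u << PAGE_SHIFT_4K) - 1));
    p.pte = (unsigned)((va >> PAGE_SHIFT_4K) & INDEX_MASK);
    p.pmd = (unsigned)((va >> (PAGE_SHIFT_4K + INDEX_BITS)) & INDEX_MASK);
    p.pud = (unsigned)((va >> (PAGE_SHIFT_4K + 2 * INDEX_BITS)) & INDEX_MASK);
    p.pgd = (unsigned)((va >> (PAGE_SHIFT_4K + 3 * INDEX_BITS)) & INDEX_MASK);
    return p;
}
```

For example, split_va(0x00007f1234567abc) yields the indices 0xfe, 0x48, 0x1a2, and 0x167 (top level first) and a page offset of 0xabc. The hardware performs the equivalent slicing on every translation that misses the TLB.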

Chapter 8, Kernel Memory Allocation for Module Authors – Part 1, gets our hands dirty with the kernel memory allocation (and, obviously, deallocation) APIs. You will first learn about the two allocation “layers” within Linux – the slab allocator that’s layered above the kernel memory allocation “engine,” the page allocator (or BSA). We shall briefly learn about the underpinnings of the page allocator algorithm and its “freelist” data structure; this information is valuable when deciding which layer to use. Next, we dive straight into the hands-on work of learning about the usage of these key APIs. The ideas behind the slab allocator (or slab cache) and the primary kernel allocator APIs – the kzalloc()/kfree() pair (and friends) – are covered. Importantly, the size limitations, downsides, and caveats when using these common APIs are covered in a lot of detail as well.

Also, especially useful for driver authors, we cover the kernel’s modern resource-managed memory allocation APIs (the devm_*() routines). Finding where internal fragmentation (wastage) occurs is another interesting area we delve into.
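The wastage arithmetic is easy to explore in user space. The sketch below assumes a power-of-two page allocator that hands out blocks of 2^order contiguous pages, a 4 KiB page size, and a maximum order of 10; these values and the names SKETCH_PAGE_SIZE, SKETCH_MAX_ORDER, size_to_order(), and wasted_bytes() are assumptions made for illustration, not kernel definitions. Slab caches use additional object sizes, so this models only the page-allocator layer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u   /* assumed 4 KiB pages */
#define SKETCH_MAX_ORDER  10u     /* assumed cap for this sketch */

/* Smallest order such that a block of 2^order pages holds size bytes. */
static unsigned size_to_order(size_t size)
{
    unsigned order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    assert(size > 0);
    assert(size <= ((size_t)SKETCH_PAGE_SIZE << SKETCH_MAX_ORDER));
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Bytes handed out but unused when size bytes come from a 2^order page block. */
static size_t wasted_bytes(size_t size)
{
    return ((size_t)SKETCH_PAGE_SIZE << size_to_order(size)) - size;
}
```

A 20 KiB request needs five pages, so it is rounded up to an order-3 block of eight pages, and wasted_bytes(20 * 1024) reports 12,288 unused bytes. Requests that sit just above a power-of-two boundary waste the most.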

Chapter 9, Kernel Memory Allocation for Module Authors – Part 2, goes further, in a logical fashion, from the previous chapter. Here, you will learn how to create custom slab caches (useful for high-frequency (de)allocations for, say, your custom driver). Next, you’ll learn how to extract useful information regarding slab caches and come to understand slab shrinkers (new in this edition). We then move on to understanding and using the vmalloc() API (and friends). Very importantly, having covered many APIs for kernel memory (de)allocation, you will now learn how to pick and choose an appropriate API given the real-world situation you find yourself in. This chapter is rounded off with important coverage of the kernel’s memory reclamation technologies and the dreaded Out Of Memory (OOM) “killer” framework. Understanding OOM and related areas will also lead to a much deeper understanding of how user space memory allocation really works, via the demand paging technique. This edition has more and better coverage of kernel page reclaim, as well as the new MGLRU and DAMON technologies.

Chapter 10, The CPU Scheduler – Part 1, the first of two chapters on this topic, covers a useful mix of theory and practice regarding CPU (task) scheduling on the Linux OS. The minimal necessary theoretical background on what the KSE (kernel schedulable entity) is – it’s the thread! – and available kernel scheduling policies, are some of the initially covered topics. Next, we cover how to visualize the flow of a thread via tools like perf. Sufficient kernel internal details on CPU scheduling are then covered to have you understand how task scheduling on the modern Linux OS works. Along the way, you will learn about thread scheduling attributes (policy and real-time priority) as well. This edition includes new coverage on CFS scheduling periods/timeslices, enhanced coverage on exactly how and when the core scheduler code is invoked, and coverage of the new preempt dynamic feature.
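The scheduling attributes mentioned above (policy and real-time priority) can be queried from user space with the POSIX calls in <sched.h>. The following is a minimal sketch; the helper names policy_name() and priority_range() are hypothetical, while sched_get_priority_min() and sched_get_priority_max() are the standard library calls.

```c
#include <assert.h>
#include <sched.h>

/* Map a POSIX scheduling policy constant to a printable name. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Fetch the valid static priority range for a policy; returns 0 on success. */
static int priority_range(int policy, int *min, int *max)
{
    assert(min != NULL && max != NULL);
    *min = sched_get_priority_min(policy);
    *max = sched_get_priority_max(policy);
    return (*min == -1 || *max == -1) ? -1 : 0;
}
```

On Linux, priority_range(SCHED_FIFO, ...) reports a range of 1 to 99, whereas SCHED_OTHER reports 0 for both bounds because non-real-time threads have no static real-time priority. Passing the result of sched_getscheduler(0) to policy_name() names the calling thread's current policy.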

Chapter 11, The CPU Scheduler – Part 2, the second part on CPU (task) scheduling, continues to cover the topic in more depth. Here, you learn about the CPU affinity mask and how to query/set it, controlling scheduling policy and priority on a per-thread basis – such powerful features! We then come to a key and very powerful Linux OS feature – control groups (cgroups). We understand this feature along with learning how to practically explore it (a custom script is also built). We further learn the role the modern systemd framework plays with cgroups. An interesting example on controlling CPU bandwidth allocation via cgroups v2 is then seen, from different angles. Can you run Linux as an RTOS? Indeed you can! We cover an introduction to this other interesting area…
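Querying and setting the CPU affinity mask can be sketched from user space with the Linux-specific sched_getaffinity() and sched_setaffinity() calls. The helper names below (allowed_cpu_count, pin_to_first_allowed_cpu) are hypothetical; treat this as a minimal illustration rather than the book's own code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the number of CPUs the calling thread may run on, or -1 on error. */
static int allowed_cpu_count(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return -1;
    return CPU_COUNT(&mask);
}

/*
 * Pin the calling thread to the lowest-numbered CPU it is currently
 * allowed to use. Returns that CPU number, or -1 on error.
 */
static int pin_to_first_allowed_cpu(void)
{
    cpu_set_t mask, one;

    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &mask))
            continue;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        assert(CPU_COUNT(&one) == 1);   /* exactly one CPU selected */
        if (sched_setaffinity(0, sizeof(one), &one) == -1)
            return -1;
        return cpu;
    }
    return -1;
}
```

After a successful pin_to_first_allowed_cpu(), allowed_cpu_count() reports 1. Choosing a CPU from the existing mask, instead of hard-coding CPU 0, keeps the call valid inside containers or cpusets that restrict the available CPUs.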

Chapter 12, Kernel Synchronization – Part 1, first covers the really key concepts regarding critical sections, atomicity, data races (from the LKMM point of view), what a lock conceptually achieves, and, very importantly, the ‘why’ of all this. We then cover concurrency concerns when working within the Linux kernel; this moves us naturally on to important locking guidelines, what deadlock means, and key approaches to preventing deadlock. Two of the most popular kernel locking technologies – the mutex lock and the spinlock – are then discussed in depth along with several (simple device driver based) code examples. We also point out how to work with spinlocks in interrupt contexts and close the chapter with common locking mistakes (to avoid) and deadlock-avoidance guidelines.

Chapter 13, Kernel Synchronization – Part 2, continues the journey on kernel synchronization. Here, you’ll learn about key locking optimizations – using lightweight atomic and (the more recent) refcount operators to safely operate on integers, RMW bit operators to safely perform bit ops, and the usage of the reader-writer spinlock over the regular one. What exactly the CPU caches are and the inherent risks involved when using them, such as cache “false sharing,” are then discussed. We then get into another key topic – lock-free programming techniques with an emphasis on per-CPU data and Read Copy Update (RCU) lock-free technologies. Several module examples illustrate the concepts! A critical topic – lock debugging techniques, including the usage of the kernel’s powerful “lockdep” lock validator – is then covered. The chapter is rounded off with a brief look at locking statistics and memory barriers.
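The idea of operating on an integer safely without a lock can be illustrated in user space with C11 atomics. The sketch below is an analogy only: struct ref, ref_init(), ref_get(), and ref_put() are hypothetical names, and the kernel's own atomic and refcount interfaces differ in both naming and behavior.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter built on a C11 atomic integer. */
struct ref {
    atomic_uint count;
};

static void ref_init(struct ref *r)
{
    atomic_init(&r->count, 1);   /* the creator holds the first reference */
}

/* Take another reference; the increment is one indivisible RMW operation. */
static void ref_get(struct ref *r)
{
    unsigned old = atomic_fetch_add(&r->count, 1);

    assert(old > 0);             /* taking a reference on a dead object is a bug */
}

/* Drop a reference; returns true only for the caller that released the last one. */
static bool ref_put(struct ref *r)
{
    unsigned old = atomic_fetch_sub(&r->count, 1);

    assert(old > 0);             /* underflow means an unbalanced put */
    return old == 1;
}
```

Because each fetch-and-modify is indivisible, two threads calling ref_put() concurrently can never both see the count drop to zero, so exactly one of them becomes responsible for freeing the object. A plain int counter offers no such guarantee.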

Online Chapter – Kernel Workspace Setup

The online chapter, Kernel Workspace Setup, guides you on setting up a full-fledged Linux kernel development workspace (typically, as a fully virtualized guest system). You will learn how to install all required software packages on it. (In this edition, we even provide an Ubuntu-based helper script that auto-installs all required packages.) You will also learn about several other open-source projects that will be useful on your journey to becoming a professional kernel/driver developer. Once this chapter is done, you will be ready to build a Linux kernel as well as to start writing and testing kernel code (via the loadable kernel module framework). In our view, it’s very important for you to actually use this book in a hands-on fashion, trying out and experimenting with code. The best way to learn something is to do so empirically – not taking anyone’s word on anything at all, but by trying it out and experiencing it for yourself.

You can read more about the chapter online using the following link: http://www.packtpub.com/sites/default/files/downloads/9781803232225_Online_Chapter.pdf.

To get the most out of this book

To get the most out of this book, we expect the following:

  • You need to know your way around a Linux system, on the command line (the shell).
  • You need to know the C programming language.
  • It’s not mandatory, but experience with Linux system programming concepts and technologies will greatly help.

The details on hardware and software requirements, as well as their installation, are covered completely and in depth in Online Chapter, Kernel Workspace Setup. It’s critical that you read it in detail and follow the instructions therein.

Also, we have tested all the code in this book (it has its own GitHub repository) on these platforms:

  • x86_64 Ubuntu 22.04 LTS guest OS (running on Oracle VirtualBox 7.0)
  • x86_64 Ubuntu 23.04 guest OS (running on Oracle VirtualBox 7.0)
  • x86_64 Fedora 38 (and 39) on a native (laptop) system
  • ARM Raspberry Pi 4 Model B (64-bit, running both its “distro” kernel as well as our custom 6.1 kernel); lightly tested

We assume that, when running Linux as a guest (VM), the host system is either Windows 10 or later (of course, even Windows 7 will work), a recent Linux distribution (for example, Ubuntu or Fedora), or even macOS.

If you are using the digital version of this book, we advise you to type the code yourself or, much better, access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

I strongly recommend that you follow the empirical approach: not taking anyone’s word on anything at all, but trying it out and experiencing it for yourself. Hence, this book gives you many hands-on experiments and kernel code examples that you can and must try out yourself; this will greatly aid you in making real progress, deepening your understanding of the various aspects of Linux kernel development.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Linux-Kernel-Programming_2E. If there’s an update to the code, it will be updated on the existing GitHub repository. (So be sure to regularly do a “git pull” as well to stay up to date.)

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781803232225.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The ioremap() API returns a KVA of the void * type (since it’s an address location).”

A block of code is set as follows:

static int __init miscdrv_init(void)
{
    int ret;
    struct device *dev;

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

#if LINUX_VERSION_CODE < KERNEL_VERSION(5, 8, 0)
    vrx = __vmalloc(42 * PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL_RO);
    if (!vrx) {
        pr_warn("__vmalloc failed\n");
        goto err_out5;
    }
[ … ]

Any command-line input or output is written as follows:

pi@raspberrypi:~ $ sudo cat /proc/iomem

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in the text like this. Here is an example: “Select System info from the Administration panel.”

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email feedback@packtpub.com and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at questions@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form. We shall strive to update all reported errors/omissions in a Known Errata section on this book’s GitHub repo as well, allowing you to quickly spot them.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packtpub.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.

Share your thoughts

Once you’ve read Linux Kernel Programming, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application (using the book’s GitHub repo is even better!). 

The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781803232225

  2. Submit your proof of purchase
  3. That’s it! We’ll send your free PDF and other benefits to your email directly