
You're reading from Linux Kernel Programming - Second Edition

Product type: Book
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781803232225
Edition: 2nd Edition
Author (1)

Kaiwan N. Billimoria

Kaiwan N. Billimoria taught himself BASIC programming on his dad's IBM PC back in 1983. He was programming in C and Assembly on DOS until he discovered the joys of Unix, and by around 1997, Linux! Kaiwan has worked on many aspects of the Linux system programming stack, including Bash scripting, system programming in C, kernel internals, device drivers, and embedded Linux work. He has actively worked on several commercial/FOSS projects. His contributions include drivers to the mainline Linux OS and many smaller projects hosted on GitHub. His Linux passion feeds well into his passion for teaching these topics to engineers, which he has done for well over two decades now. He's also the author of Hands-On System Programming with Linux, Linux Kernel Programming (and its Part 2 book) and Linux Kernel Debugging. It doesn't hurt that he is a recreational ultrarunner too.

Kernel Synchronization – Part 2

This chapter continues the previous chapter's discussion of the fairly complex topics of kernel synchronization and dealing with concurrency within the kernel in general. If you haven't already read the previous chapter, I suggest you do so first and then continue with this one.

Here, we shall continue our exploration of the vast topic of kernel synchronization and handling concurrency in kernel space. As before, the material targets kernel and/or device driver/module developers. In this chapter, we shall cover the following:

  • Using the atomic_t and refcount_t interfaces
  • Using the RMW atomic operators
  • Using the reader-writer spinlock
  • Understanding CPU caching basics, cache effects, and false sharing
  • Lock-free programming with per-CPU and RCU
  • Lock debugging within the kernel
  • Introducing memory barriers

Technical requirements

The technical requirements remain identical to those for the previous chapter, Chapter 12, Kernel Synchronization – Part 1.

Using the atomic_t and refcount_t interfaces

You’ll recall that in the open method (and elsewhere) of our original simple misc character device driver (covered in the Linux Kernel Programming – Part 2 companion volume; the code is here: https://github.com/PacktPublishing/Linux-Kernel-Programming-Part-2/blob/main/ch1/miscdrv_rdwr/miscdrv_rdwr.c), we defined and manipulated two static global integers, ga and gb:

static int ga, gb = 1;
[...]
ga++; gb--;

By now, it should be obvious to you that the place where we operate on these integers is a potential bug if left as it is: it’s shared writable data (aka shared state) being accessed in a possibly concurrent code path and, therefore, qualifies as a critical section, thus requiring protection against concurrent access. (If this isn’t clear to you, please first read the Critical sections, exclusive execution, and atomicity section in the previous chapter carefully.)

In the previous chapter...

Using the RMW atomic operators

A more advanced set of atomic operators, the read-modify-write (RMW) APIs, is available as well (exactly why it’s called RMW is explained in the following section). Among their many uses (we show a list in the upcoming section) is performing atomic RMW bitwise operations safely and indivisibly. As a device driver author operating upon device or peripheral registers, this is indeed something you will very likely find yourself using.

The material in this section assumes you have at least a basic understanding of accessing peripheral device (chip) memory and registers; we have covered this topic in detail in the Linux Kernel Programming – Part 2 companion volume in Chapter 3, Working with Hardware I/O Memory. It’s recommended you first understand this before moving further.

When working with drivers, you’ll typically need to perform bit operations (with the bitwise AND & and bitwise OR | being the...

Using the reader-writer spinlock

Visualize a piece of kernel (or driver) code wherein a large, global data structure – say, a doubly linked circular list with a few thousand nodes or more – is being searched. Now, since this data structure is global (shared and writable), accessing it concurrently constitutes a critical section, and that requires protection.

Assuming a scenario where searching the list is a non-blocking operation, you’d typically use a spinlock to protect the critical section.

A naive approach might propose not using a lock at all, since we’re only reading data within the list, not updating it. But, of course, even a read on shared writable data must be protected against an inadvertent write occurring simultaneously (as you have learned – refer back to the previous chapter if required), which would otherwise result in a dirty or torn read.

So we conclude that we require the spinlock; we imagine the...

Understanding CPU caching basics, cache effects, and false sharing

Modern processors on multicore symmetric multi-processing (SMP) systems make use of several levels of parallel cache memory within them, in order to provide a very significant speedup when working on memory (we briefly touched upon this in Chapter 8, Kernel Memory Allocation for Module Authors – Part 1, in the Allocating slab memory section). FYI, in Flynn’s taxonomy, general-purpose SMP systems are classified as Multiple Instruction, Multiple Data (MIMD) stream machines; the specific situation we’re concerned with here – several cores concurrently running instructions upon a single shared data item – resembles a Multiple Instruction, Single Data (MISD) stream.

Here’s a purely conceptual diagram (Figure 13.4) showing two CPU cores, each core having two internal caches (Level 1 and Level 2, abbreviated as L1 and L2, respectively), plus a shared or unified L3 cache and the main memory (RAM):

Figure 13.4: Conceptual diagram – 2 CPU cores with internal L1, L2 caches, a shared (unified) L3 cache...

Lock-free programming with per-CPU and RCU

As you have learned, when operating upon shared writable data, the critical section must be protected in some manner. Locking is perhaps the most common technology used to effect this protection. It’s not all rosy, though, as performance can suffer.

To quite intuitively see why, consider a few physical-world analogies to a lock:

  • One is a funnel, with the stem of the funnel – the “critical section” – just wide enough to allow one thread at a time to flow through, and no more.
  • Another is a single toll booth on a wide and busy highway, or a set of traffic lights at a busy intersection.

These analogies help us visualize and understand why locking can cause bottlenecks, slowing performance down to a crawl in some drastic cases. Worse, these adverse effects can be multiplied on high-end (SMP/NUMA) multicore systems (with a few hundred cores); in effect, locking doesn’...

Lock debugging within the kernel

As we’ve learned, locking and synchronization design and implementation tend to become complex, increasing the chances of lurking bugs. The kernel provides several means to help debug these difficult kernel-level locking issues, deadlock being a primary one.

Just in case you haven’t already, do ensure you’ve first read the basics on synchronization, locking, and deadlock guidelines from the previous chapter (Chapter 12, Kernel Synchronization – Part 1, especially the Critical sections, exclusive execution, and atomicity and Concurrency concerns within the Linux kernel sections).

With any debug scenario, there are different points at which debugging occurs, and thus, perhaps differing tools and techniques that could/should be used. Very broadly speaking, a bug might be noticed, and thus debugged, at a few different points in time (within the Software Development Life Cycle (SDLC...
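A primary compile-time/runtime tool here is lockdep, the kernel's runtime locking-correctness validator. These are real kernel Kconfig options; the fragment below is a typical debug-kernel configuration sketch (enable such options in a debug build only, as they add measurable runtime overhead):

```
# Typical lock-debug configuration for a debug kernel (sketch)
CONFIG_PROVE_LOCKING=y       # lockdep: prove lock ordering correctness
CONFIG_DEBUG_SPINLOCK=y      # catch uninitialized/incorrect spinlock usage
CONFIG_DEBUG_MUTEXES=y       # catch mutex misuse (e.g., unlock by non-owner)
CONFIG_DEBUG_ATOMIC_SLEEP=y  # catch sleeping while in an atomic context
CONFIG_LOCK_STAT=y           # gather lock contention statistics
```

With CONFIG_PROVE_LOCKING enabled, lockdep validates every locking sequence as it first occurs and emits a detailed report the moment an ordering rule that could ever deadlock is violated, rather than waiting for the deadlock to actually happen.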

Introducing memory barriers

Finally, let’s address another concern – that of the memory barrier. What does it mean? Sometimes, a program’s actual execution order differs from what the programmer wrote, as the microprocessor, the memory controllers, and the compiler can reorder memory reads and writes. In most cases, these “tricks” remain benign and typically optimize performance. But there are cases where this kind of reordering of (memory I/O) instruction sequences must not occur; the original, programmer-intended memory load and store sequences must be honored. Which cases? Typically, these:

  • When working across hardware boundaries, such as across individual CPU cores on multicore systems
  • When performing atomic operations
  • When accessing peripheral devices (like performing I/O from a CPU to a peripheral device or vice versa, often via Direct Memory Access (DMA))
  • When working with hardware interrupts

The memory barrier (typically...

Summary

Well, what do you know!? Congratulations, you have done it! You have completed this book!

In this chapter, we continued from the previous chapter in our quest to learn more about kernel synchronization. Here, you learned how to perform locking more efficiently and safely on integers, via both the atomic_t and the newer refcount_t interfaces. Within this, you learned how the typical RMW sequence can be atomically and safely employed in a common activity for driver authors – updating a device’s registers. The reader-writer spinlock, interesting and conceptually useful, although with several caveats, was then covered. You then saw how easy it is to inadvertently create adverse performance issues caused by unfortunate caching side effects, including looking at the false sharing problem and how to avoid it.

A boon to performance – lock-free algorithms and programming techniques – was then covered in some detail, with a focus on understanding...

Questions

As we conclude, the following link provides a list of questions for you to test your knowledge regarding this chapter’s material: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E/blob/main/questions/ch13_qs_assignments.txt. You will find some of the questions answered in the book’s GitHub repo: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E/tree/main/solutions_to_assgn.

Further reading

To help you delve deeper into the subject with useful materials, we provide a rather detailed list of online references and links (and, at times, even books) in a Further reading document in this book’s GitHub repository. The Further reading document is available here: https://github.com/PacktPublishing/Linux-Kernel-Programming_2E/blob/main/Further_Reading.md.
