You're reading from Mastering Embedded Linux Programming - Third Edition

Product type: Book
Published in: May 2021
Publisher: Packt
ISBN-13: 9781789530384
Edition: 3rd Edition
Authors (2):
Frank Vasquez

Frank Vasquez is an independent software consultant specializing in consumer electronics. He has over a decade of experience designing and building embedded Linux systems. During that time, he has shipped numerous devices including a rackmount DSP audio server, a diver-held sonar camcorder, and a consumer IoT hotspot. Before his career as an embedded Linux engineer, Frank was a database kernel developer at IBM where he worked on DB2. He lives in Silicon Valley.

Chris Simmonds

Chris Simmonds is a software consultant and trainer living in southern England. He has almost two decades of experience in designing and building open-source embedded systems. He is the founder and chief consultant at 2net Ltd, which provides professional training and mentoring services in embedded Linux, Linux device drivers, and Android platform development. He has trained engineers at many of the biggest companies in the embedded world, including ARM, Qualcomm, Intel, Ericsson, and General Dynamics. He is a frequent presenter at open source and embedded conferences, including the Embedded Linux Conference and Embedded World.

Chapter 21: Real-Time Programming

Much of the interaction between a computer system and the real world happens in real time, and so this is an important topic for developers of embedded systems. I have touched on real-time programming in several places so far: in Chapter 17, Learning About Processes and Threads, we looked at scheduling policies and priority inversion, and in Chapter 18, Managing Memory, I described the problems with page faults and the need for memory locking. Now it is time to bring these topics together and look at real-time programming in some depth.

In this chapter, I will begin with a discussion about the characteristics of real-time systems, and then consider the implications for system design, at both the application and kernel levels. I will describe the real-time PREEMPT_RT kernel patch, and show how to get it and apply it to a mainline kernel. The final sections will describe how to characterize system latencies using two tools: cyclictest and Ftrace.

...

Technical requirements

To follow along with the examples, make sure you have the following:

  • A Linux-based host system with a minimum of 60 GB of available disk space
  • Buildroot 2020.02.9 LTS release
  • Yocto 3.1 (Dunfell) LTS release
  • Etcher for Linux
  • A microSD card reader and card
  • BeagleBone Black
  • A 5 V 1 A DC power supply
  • An Ethernet cable and port for network connectivity

You should have already installed the 2020.02.9 LTS release of Buildroot for Chapter 6, Selecting a Build System. If you have not, then refer to the System requirements section of The Buildroot user manual (https://buildroot.org/downloads/manual/manual.html) before installing Buildroot on your Linux host according to the instructions from Chapter 6.

You should have already installed the 3.1 (Dunfell) LTS release of Yocto for Chapter 6, Selecting a Build System. If you have not, then refer to the Compatible Linux Distribution and Build Host Packages sections of the Yocto...

What is real time?

The nature of real-time programming is one of the subjects that software engineers love to discuss at length, often giving a range of contradictory definitions. I will begin by setting out what I think is important about real time.

A task is a real-time task if it has to complete before a certain point in time, known as the deadline. The distinction between real-time and non-real-time tasks is shown by considering what happens when you play an audio stream on your computer while compiling the Linux kernel. The first is a real-time task because there is a constant stream of data arriving at the audio driver, and blocks of audio samples have to be written to the audio interface at the playback rate. Meanwhile, the compilation is not real-time because there is no deadline. You simply want it to complete as soon as possible; whether it takes 10 seconds or 10 minutes does not affect the quality of the kernel binaries.

The other important thing to consider is the...

Identifying sources of non-determinism

Fundamentally, real-time programming is about making sure that the threads controlling the output in real time are scheduled when needed and so can complete the job before the deadline. Anything that prevents this is a problem. Here are some problem areas:

  • Scheduling: Real-time threads must be scheduled before others, and so they must have a real-time policy, SCHED_FIFO or SCHED_RR. Additionally, they should have priorities assigned in descending order, starting with the one with the shortest deadline, according to the theory of rate monotonic analysis that I described in Chapter 17, Learning About Processes and Threads.
  • Scheduling latency: The kernel must be able to reschedule as soon as an event such as an interrupt or timer occurs, and not be subject to unbounded delays. Reducing scheduling latency is a key topic later on in this chapter.
  • Priority inversion: This is a consequence of priority-based scheduling, which leads to...
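
The scheduling point above can be sketched in C. This is a minimal sketch, not code from the book: make_fifo is a hypothetical helper name, and it assumes the process is allowed to use real-time policies (usually root or CAP_SYS_NICE). Without that permission the call is expected to fail with EPERM.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: move the calling thread to the SCHED_FIFO policy
 * at the requested priority.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int make_fifo(int priority)
{
    struct sched_param param;
    int min = sched_get_priority_min(SCHED_FIFO);
    int max = sched_get_priority_max(SCHED_FIFO);

    /* Reject priorities outside the range the kernel reports */
    if (priority < min || priority > max) {
        errno = EINVAL;
        return -1;
    }

    memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    /* A pid of 0 means "the calling thread" */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
        return -1; /* typically EPERM when the privilege is missing */

    return 0;
}
```

A thread with a shorter deadline would call make_fifo with a higher number than a thread with a longer deadline, in line with the descending order described above.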

Understanding scheduling latency

Real-time threads need to be scheduled as soon as they have something to do. However, even if there are no other threads of the same or higher priority, there is always a delay from the point at which the wake-up event occurs—an interrupt or system timer—to the time that the thread starts to run. This is called scheduling latency. It can be broken down into several components, as shown in the following diagram:

Figure 21.1 – Scheduling latency

Firstly, there is the hardware interrupt latency from the point at which an interrupt is asserted until the interrupt service routine (ISR) begins to run. A small part of this is the delay in the interrupt hardware itself, but the biggest problem is due to interrupts being disabled in software. Minimizing this IRQ off time is important.

The next is interrupt latency, which is the length of time until the ISR has serviced the interrupt and woken up any threads...

Kernel preemption

Preemption latency occurs because it is not always safe or desirable to preempt the current thread of execution and call the scheduler. Mainline Linux has three settings for preemption, selected via the Kernel Features | Preemption Model menu:

  • CONFIG_PREEMPT_NONE: No preemption.
  • CONFIG_PREEMPT_VOLUNTARY: This enables additional checks for requests for preemption.
  • CONFIG_PREEMPT: This allows the kernel to be preempted.

With preemption set to none, kernel code will continue without rescheduling until it either returns via a syscall back to user space, where preemption is always allowed, or it encounters a sleeping wait that stops the current thread. Since it reduces the number of transitions between the kernel and user space and may reduce the total number of context switches, this option results in the highest throughput at the expense of large preemption latencies. It is the default for servers and some desktop kernels where throughput is more...

Preemptible kernel locks

Making the majority of kernel locks preemptible is the most intrusive change that PREEMPT_RT makes, and this code remains outside of the mainline kernel.

The problem occurs with spin locks, which are used for much of the kernel locking. A spin lock is a busy-wait mutex that does not require a context switch in the contended case, and so it is very efficient as long as the lock is held for a short time. Ideally, a spin lock should be held for less than the time it would take to reschedule twice. The following diagram shows threads running on two different CPUs contending for the same spin lock. CPU 0 gets it first, forcing CPU 1 to spin, waiting until it is unlocked:

Figure 21.3 – Spin lock

The thread that holds the spin lock cannot be preempted since doing so may make the new thread enter the same code and deadlock when it tries to lock the same spin lock. Consequently, in mainline Linux, locking a spin lock disables kernel preemption...

High-resolution timers

Timer resolution is important if you have precise timing requirements, which is typical for real-time applications. The default timer in Linux is a clock that runs at a configurable rate, typically 100 Hz for embedded systems and 250 Hz for servers and desktops. The interval between two timer ticks is known as a jiffy and, in the examples given previously, is 10 milliseconds on an embedded SoC and 4 milliseconds on a server.
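
The jiffy figures quoted above follow directly from the tick rate. As a small worked example (jiffy_ms is a hypothetical name used only for illustration), the tick period in milliseconds is 1000 divided by the configured rate in Hz:

```c
/*
 * Hypothetical helper: length of one jiffy in milliseconds for a
 * given timer tick rate in Hz.
 */
double jiffy_ms(unsigned int hz)
{
    return 1000.0 / hz;
}
```

Calling jiffy_ms(100) gives 10 and jiffy_ms(250) gives 4, matching the embedded and server values in the text.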

Linux gained more accurate timers from the real-time kernel project in version 2.6.18, and now they are available on all platforms, provided that there is a high-resolution timer source and device driver for it—which is almost always the case. You need to configure the kernel with CONFIG_HIGH_RES_TIMERS=y.

With this enabled, all the kernel and user space clocks will be accurate down to the granularity of the underlying hardware. Finding the actual clock granularity is difficult. The obvious answer is the value provided by clock_getres...
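
For reference, querying the value mentioned above looks roughly like the following. This is a sketch only: monotonic_resolution_ns is a hypothetical helper, and the number it returns is simply what the kernel reports for CLOCK_MONOTONIC, which is not necessarily the true granularity of the hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: return the resolution that the kernel reports
 * for CLOCK_MONOTONIC, in nanoseconds, or -1 on error.
 */
long long monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;

    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```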

Avoiding page faults

A page fault occurs when an application reads from or writes to memory that is not committed to physical memory. It is impossible (or very hard) to predict when a page fault will happen, so page faults are another source of non-determinism in computers.

Fortunately, there is a function that allows you to commit all the memory used by the process and lock it down so that it cannot cause a page fault. It is mlockall(2). These are its two main flags:

  • MCL_CURRENT: This locks all pages currently mapped.
  • MCL_FUTURE: This locks pages that are mapped in later.

You usually call mlockall during the startup of the application with both flags set to lock all current and future memory mappings.

Tip

MCL_FUTURE is not magic, in that there will still be a non-deterministic delay when allocating or freeing heap memory using malloc()/free() or mmap(). Such operations are best done at startup and not in the main control loops.
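
A startup call along these lines might look as follows. This is a minimal sketch: lock_all_memory is a hypothetical wrapper name, and the call can fail (for example with EPERM or ENOMEM) when the process lacks the privilege or exceeds its RLIMIT_MEMLOCK limit, so the return value should be checked.

```c
#include <sys/mman.h>

/*
 * Hypothetical wrapper: lock all current and future mappings of the
 * calling process into RAM.
 * Returns 0 on success, or -1 with errno set by mlockall() on failure.
 */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;

    return 0;
}
```

Calling it once, early in main() and before any real-time threads are created, keeps the locking cost itself out of the control loops.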

Memory allocated on the stack is trickier...

Interrupt shielding

Using threaded interrupt handlers helps mitigate interrupt overhead by running some threads at a higher priority than interrupt handlers that do not impact real-time tasks. If you are using a multi-core processor, you can take a different approach and shield one or more cores from processing interrupts completely, allowing them to be dedicated to real-time tasks instead. This works either with a normal Linux kernel or a PREEMPT_RT kernel.

Achieving this is a question of pinning the real-time threads to one CPU and the interrupt handlers to a different one. You can set the CPU affinity of a thread or process using the taskset command-line tool, or you can use the sched_setaffinity(2) and pthread_setaffinity_np(3) functions.
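
Using sched_setaffinity(2) from within the program can be sketched like this. pin_current_thread_to_cpu is a hypothetical helper name, and the sketch assumes the chosen CPU is one the process is allowed to run on; a thread created with pthreads could use pthread_setaffinity_np(3) in the same way.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int pin_current_thread_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    CPU_ZERO(&set);     /* start with an empty CPU mask */
    CPU_SET(cpu, &set); /* allow only the requested CPU */

    /* A pid of 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```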

To set the affinity of an interrupt, first note that there is a subdirectory for each interrupt number in /proc/irq/<IRQ number>. The control files for the interrupt are in there, including a CPU mask in smp_affinity. Write a bitmask...
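
The same operation can be done from C. The sketch below rests on two assumptions that go beyond the text: that smp_affinity accepts a hexadecimal mask in which bit n selects CPU n, and that the program runs with enough privilege to write under /proc/irq. Both function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: mask with only the bit for the given CPU set */
unsigned long cpu_mask_for(unsigned int cpu)
{
    return 1UL << cpu;
}

/*
 * Hypothetical helper: write a CPU mask to the smp_affinity file of
 * one interrupt. Returns 0 on success, -1 on failure (for example,
 * no such IRQ or insufficient privilege).
 */
int set_irq_affinity(unsigned int irq, unsigned long cpu_mask)
{
    char path[64];
    FILE *f;
    int ok;

    snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    /* The mask is written as hexadecimal text */
    ok = fprintf(f, "%lx\n", cpu_mask) > 0;

    /* Buffered write errors may only show up when the file is closed */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```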

Measuring scheduling latencies

All the configuration and tuning you may do will be pointless if you cannot show that your device meets the deadlines. You will need your own benchmarks for the final testing, but I will describe here two important measurement tools: cyclictest and Ftrace.

cyclictest

cyclictest was originally written by Thomas Gleixner and is now available on most platforms in a package named rt-tests. If you are using the Yocto Project, you can create a target image that includes rt-tests by building the real-time image recipe like this:

$ bitbake core-image-rt

If you are using Buildroot, you need to enable the BR2_PACKAGE_RT_TESTS option, which is found in the Target packages | Debugging, profiling and benchmark | rt-tests menu.

cyclictest measures scheduling latencies by comparing the actual time taken for sleeping to the requested time. If there was no latency, they would be the same, and the reported latency would be 0. cyclictest assumes a timer resolution of less...
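
The measurement principle can be illustrated with a few lines of C. This is a sketch of the idea only, not cyclictest's actual code: measure_wakeup_latency_ns is a hypothetical function that sleeps until an absolute deadline on CLOCK_MONOTONIC and reports how late the wake-up was. It assumes interval_ns is less than one second.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: sleep for interval_ns (must be < 1 second) and
 * return the difference between the actual and the requested wake-up
 * time in nanoseconds, or -1 on error.
 */
long long measure_wakeup_latency_ns(long interval_ns)
{
    struct timespec next, now;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    /* Compute the absolute wake-up time, normalizing tv_nsec */
    next.tv_nsec += interval_ns;
    while (next.tv_nsec >= 1000000000L) {
        next.tv_nsec -= 1000000000L;
        next.tv_sec++;
    }

    /* Sleep until the deadline; a nonzero return is an error number */
    if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    /* Latency = actual wake-up time minus requested wake-up time */
    return (long long)(now.tv_sec - next.tv_sec) * 1000000000LL
           + (now.tv_nsec - next.tv_nsec);
}
```

Running this in a loop and recording the maximum value gives a rough picture of worst-case latency, although the real tool should be preferred for any serious measurement.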

Summary

The term real-time is meaningless unless you qualify it with a deadline and an acceptable miss rate. When you have these two pieces of information, you can determine whether or not Linux is a suitable candidate for the operating system and, if so, begin to tune your system to meet the requirements. Tuning Linux and your application to handle real-time events means making it more deterministic so that the real-time threads can meet their deadlines reliably. Determinism usually comes at the price of total throughput, so a real-time system is not going to be able to process as much data as a non-real-time system.

It is not possible to provide mathematical proof that a complex operating system such as Linux will always meet a given deadline, so the only approach is through extensive testing using tools such as cyclictest and Ftrace and, more importantly, using your own benchmarks for your own application.

To improve determinism, you need to consider both the application and...

Further reading

The following resources have further information about the topics introduced in this chapter:

  • Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications by Giorgio Buttazzo
  • Multicore Application Programming by Darryl Gove
