Hands-On System Programming with C++

By Dr. Rian Quinn
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    Getting Started with System Programming
About this book
C++ is a general-purpose programming language with a bias toward system programming as it provides ready access to hardware-level resources, efficient compilation, and a versatile approach to higher-level abstractions. This book will help you understand the benefits of system programming with C++17. You will gain a firm understanding of various C, C++, and POSIX standards, as well as their respective system types for both C++ and POSIX. After a brief refresher on C++, Resource Acquisition Is Initialization (RAII), and the new C++ Guideline Support Library (GSL), you will learn to program Linux and Unix systems along with process management. As you progress through the chapters, you will become acquainted with C++'s support for IO. You will then study various memory management methods, including a chapter on allocators and how they benefit system programming. You will also explore how to program file input and output and learn about POSIX sockets. This book will help you get to grips with safely setting up a UDP and TCP server/client. Finally, you will be guided through Unix time interfaces, multithreading, and error handling with C++ exceptions. By the end of this book, you will be comfortable with using C++ to program high-quality systems.
Publication date:
December 2018


Getting Started with System Programming

In this chapter, we will discuss what system programming is (that is, the act of making system calls to the operating system to perform an action on your behalf), and go into the pros and cons of both system programming, and system programming with C++.

In this chapter, we will review the following:

  • System calls, including what they are, how to execute them, and the potential security risks associated with them
  • The benefits of using C++ when system programming

Technical requirements

In order to follow the examples in this chapter, the reader must have:

  • A Linux-based system capable of compiling and executing C++17 (for example, Ubuntu 17.10+)
  • GCC 7+
  • CMake 3.6+
  • An internet connection


Understanding system calls

An operating system is a piece of software designed to execute one or more applications simultaneously, while also providing the resources needed for those applications to execute. To accomplish this, the operating system must be capable of dividing hardware resources between all the applications executing on the system at the same time.

For example, most personal computers (PCs) have a single hard disk that stores all the files being used by the owner of the PC. On modern PCs, it's likely the user will want to execute several applications at oncefor example, a web browser and an office suite.

Both of these applications will need exclusive access to the hard disk at various times while executing. In the case of the web browser, this might be to cache websites to disk, while in the case of the office suite, this might be to store documents.

It's the operating system's responsibility to manage the applications and their access to the hard disk, to ensure that both the web browser and the office suite are able to execute properly.

To accomplish this, operating systems provide an application programming interface (API) that applications can leverage to accomplish their tasks. Accessing the hard disk is an example of one of these tasks. The read() and write() functions are examples of APIs provided by POSIX-compliant operating systems for reading from and writing data to file descriptors.

Under the hood, these APIs make calls to the operating system using an application binary interface (ABI) called a system call. The act of making system calls to accomplish tasks provided by the operating system is called system programming, which is the main focus of this book.

The anatomy of a system call

For the purposes of this section, we will focus our examples on the Intel x86 architecture, although these examples apply to most other CPU architectures.

The original x86 architecture leveraged interrupts to provide system call ABIs. The APIs provided by the operating system would program specific registers on the CPU, and make a call to the operating system using an interrupt.

For example, using BIOS, an application could read data from a hard disk using int 0x13 with the following register layout:

  • AH = 2
  • AL: Sectors to read
  • CH: Cylinder
  • CL: Sector
  • DH: Head
  • DL: Drive
  • ES:BX: Buffer address

The application author would use the read() API command to read this data, while under the hood, read() would perform the system call using the preceding ABI. When int 0x13 executed, the application would be paused by the hardware, and the operating system (in this case, BIOS) would execute on behalf of the application to read data from the disk and return the result in the buffer provided by the application.

Once complete, BIOS would execute iret (interrupt return) to return to the application, which would then have the data read from disk waiting in its buffer to be used.

With this approach, the application doesn't need to know how to physically interface with the hard disk on that specific computer in order to read data; a task that is meant to be handled by the operating system and its device drivers.

The application doesn't have to worry about other applications that may be executing either. It can simply leverage the provided API (or ABI, depending on the operating system), and the rest of the gory details are handled by the operating system.

In other words, system calls provide a clean delineation between applications, to help the user accomplish specific tasks, and to help the operating system whose job it is to manage these applications and the hardware resources they require.

Interrupts are, however, slow. The hardware makes no assumptions about how the operating system is written, or how the applications the operating system is executing are written or organized. For this reason, interrupts must save the CPU state before the interrupt handler is executed, and restore this state when the iret command is executed, leading to poor performance.

As will be shown, applications make a lot of system calls when attempting to perform their job, and this poor performance became a bottleneck on x86 architectures (as well as other CPU architectures).

To solve this issue, modern versions of Intel x86 CPU provided fast system call instructions. These instructions were designed specifically to address the performance bottleneck of interrupt-driven system calls. However, they require coordination between the CPU, the operating system, and the applications executing on that operating system to reduce overhead.

Specifically, the operating system must structure the memory layout of itself and the applications it's running in a specific way, dictated by the CPU. By predefining the memory layout of the operating system and its associated applications, the CPU no longer needs to save and restore as much CPU state when performing a system call, reducing overhead. How this is accomplished is different depending on whether you're executing on an Intel or AMD x86 CPU.

The most important thing to understand with respect to how a system call is performed is that a system call is not cheap. Even with fast system call support, a system call has to perform a lot of work. In the case of reading data from a hard disk via the read() API, the CPU register state must be set up and a system call instruction must be executed. CPU control is handed off to the operating system to read data from the disk.

Since more than one application might be executing, and attempting to read data from the disk at the same time, the operating system might have to pause the application so that it can service another.

Once the operating system is ready to service the application, it must first figure out what data the application is attempting to read, which ultimately determines which physical device it needs to work with. In our example, this is a hard disk, but on a POSIX-compliant system it could be any type of block device.

Next, the operating system must leverage one of its device drivers to read data from this disk. This takes time, as the operating system has to physically program the hard disk to ask for data from a specific location, over a hardware bus that almost certainly is not executing at the same speed as the CPU itself.

Once the hard disk finally provides the operating system with the requested data, the operating system can provide this information back to the application and return control, restoring the CPU state to the application. All of this insanity is obscured by a single call to read().

For this reason, system calls should be executed sparingly, and only when absolutely needed, to prevent the poor performance of the resulting application.

It should be noted that this type of optimization requires a deep understanding of the APIs the application leverages, as higher-level APIs make their own system calls on the API's behalf. For example, allocating memory, as will be discussed later, is another type of system call.

For example, look at the difference between using an std::array{} or a std::vector{} command. std::vector{} supports resizing of the array being managed under the hood, which requires memory allocation. This can not only lead to memory fragmentation (a topic that will be discussed later on in this book), but also poor performance, as the memory allocation might have to ask the operating system for more system RAM.

Learning about different types of system calls

Almost every application that executes on a POSIX-compliant operating system must make a couple of system calls. Here, we outline some of the system call types that will be explored in this book.

Console input/output

If you have ever executed a command-line application, you willbe familiar with the concept of console-based input/output. This is especially true with respect to POSIX-compliant operating systems. When outputting to the console, you can either output to stdout (typically used for normal output) or stderr (typically used for outputting error messages).

Outputting to stdout and stderr is accomplished by an application performing a system that asks the operating system to deliver a character buffer to these output devices. (It should be noted that, in this book, we typically state that we are outputting to stdout, not printing to the console.)

The reason for this is that, on POSIX-compliant systems, your application doesn't actually know where it is sending the text to. The application leverages an API to output to stdout. This can be accomplished by:

  • Writing to a dedicated file handle (that is, stdout)
  • Using C APIs such as printf
  • Using C++ APIs such as std::cout
  • Forking an application that outputs to stdout for you (for example, by using echo)

Most of these examples, when all is said and done, make a system call to the operating system to transfer a character buffer to a device that manages stdout or stderr. In some cases, this causes the operating system to relay the resulting character buffer to the parent process (likely your shell), which will ultimately make another system call to display the character buffer on the screen.

However your operating system decides to handle this, a device driver exists in the operating system that manages the physical monitor used to display text, and the simple APIs the application calls to output text (for example, printf and std::cout) eventually provide this device driver with the requested character buffer.

Although, on most systems, the text being output to stdout is usually provided to your shell and eventually displayed on the screen, this doesn't have to be the case. Since the application is making a system call to output the character buffer, the operating system is free to forward this data to a serial device, log file, as input to another application, and so on.

This flexibility is one of the reasons POSIX-compliant operating systems are so powerful, and why learning how to properly make system calls is so important.

Memory allocation

Memory is another resource that an application must request using a system call. Most applications are given global and stack memory resources when the application is first executed, along with a small heap of memory that the application can use when calls to functions such as malloc() and free() are made.

If the application only uses the memory that it is initially given in this heap, no extra memory needs to be requested by the application. If, however, heap memory runs out, the application's malloc() or free() engine will have to ask the operating system (via a system call) for more memory.

To do this, the operating system will extend the end of the application by adding more physical memory to the application. The malloc() or free() engine is then able to make use of this additional memory, until more is needed.

On systems with limited RAM, when a request for additional memory is made, the operating system has to take memory from other applications that aren't currently executing. It does this by swapping these applications to disk, an operation that is expensive to perform.

For this reason, on resource-constrained systems, calls to malloc() or free() should not be made in time-critical code, as the time it takes to execute these functions can vary greatly.

We will go into further detail on memory management in Chapter 7, A Comprehensive Look at Memory Management.

File input/output

Reading and writing to a file is another common use case for most applications that requires making system calls.

It should be noted that on POSIX-compliant systems, reading and writing to a file descriptor doesn't always mean reading and writing to a file on a storage device. Instead, the system calls you make write to character or block devices. This could be a storage device, but could also be a console device, or even a virtual device such as /dev/random, which provides random data when read.

In Chapter 8, Learning to Program File Input/Output, we will provide more information about file input/output system programming.


Networking is another common use case that requires making system calls. On POSIX-compliant systems, we perform network-based system programming by working with POSIX sockets. Sockets provide an API for programming the Network Interface Controller (NIC), and support logic (for example, the TCP/IP stack) within the operating system.

Networking itself is an extremely complicated topic, deserving of its own book, but thankfully, the system calls needed to perform this type of programming are simple, with the majority of the gory details being handled by the operating system.

In Chapter 10, Programming POSIX Sockets Using C++, we will go into further detail on how to make these types of system calls using the socket API.


Some readers might find it surprising to know that even performing simple tasks such as getting the current date and time require system calls to ask the operating system for this information. Even to this day, a dedicated chip (with a battery, in case of loss of power) is provided on the system to maintain the current date and time.

If this information is needed, a system call must be made to request it. When this happens, the operating system will ask the device driver responsible for managing the chip what date and time it is currently storing, and then this information will be returned to the application.

It should be noted that not all time interfaces require system calls. For example, most high-resolution timers, which are designed to compare a high-resolution number before and after an operation has taken place, do not need the operating system to perform this action. This is because these high-resolution timers usually exist directly in the CPU, and their values can be extracted using a simple instruction.

The downside to these types of timers is that their values in and of themselves are usually meaningless (that is, the difference between the values returned is what provides meaning, not the values themselves). Essentially, these timers are usually nothing more than a counter that increments each time the CPU ticks (that is, executes an instruction).

Since modern CPUs can dynamically change their frequency, the values these counters store depends on how long the CPU has executed since the previous power cycle, and at what frequency the CPU was set while it was executing.

There isn't even a guarantee that the value in one counter will be the same as the value read in another counter on another physical core, as each physical core is capable of changing its own frequency independently of other cores on multi-core CPUs.

The benefit of high-resolution timers is that they can be executed extremely quickly (as you are just executing an instruction that reads a counter in the CPU). The difference between two measured values can be used to carry out tasks such as measuring how long it takes to execute small functionsa task that usually doesn't work with standard timers, as they don't have enough granularity.

In Chapter 11, Time Interfaces in Unix, we will go over these details and even provide an example of how to do this yourself.

Threading and process creation

Executing multiple tasks simultaneously can be accomplished by asking the operating system to create additional threads (or even new processes). This is a common task in system programming, and there are numerous system calls to get the job done.

A process is a unit of execution that has a set of resources assigned to it (for example, memory, file descriptors, and so on.) Each application is made up of at least one process, but they can contain more than one (for example, a shell is an application that is specifically designed to run several child processes).

Each process is scheduled by the operating system to execute for a limited amount of time before the next process is given access to the CPU, and this cycle continues as needed.

Threads are like processes, but they share the same resources as other threads of the same process. Threads provide an application with an opportunity to create tasks that are capable of executing in parallel, without the need for inter-process communication methods. In Chapter 12, Learning to Program POSIX and C++ Threads, we will learn how to program threads using both POSIX and C++ APIs.

System call security risks

System calls are not without their security risks. Even on modern hardware, and using CPU architectures other than Intel, executing more than one process within an operating system with full isolation between processes is nearly impossible.

Although modern hardware and modern operating systems work hard to provide the best possible isolation and security, it should always be assumed that other, malicious processes executing alongside yours may be able to spy on what you're doing, including sensitive tasks such as decrypting user data.

This is another topic that deserves its own book, but here, we will briefly discuss two different, recent security vulnerabilities that affect system programming.


The fast system call interface provided by Intel and AMD was not without its issues. As stated previously, for fast system calls to work, the hardware, operating system, and applications must coordinate. This is to ensure that ABI information is handled properly, to allow the operating system to execute a system call without the need for the hardware to save the entire CPU state before execution begins.

The same applies when the system call is complete, and control must be handed back to the application. To accomplish this, the operating system must load the application's stack, and then execute the SYSRET instruction, which returns control to the application.

The problem with this approach is that a non-maskable interrupt (NMI) could fire between the operating system loading the application's stack and the execution of SYSRET. The result of this race condition is that an NMI (which is code that executes with root privileges) would be executed using the application's stack and not the kernel's stack, resulting in a possible security vulnerability or corruption.

Thankfully, there are ways for modern operating systems to prevent this type of attack, which most operating systems, such as Linux, can and do leverage.

Meltdown and Spectre

The Meltdown and Spectre attacks are a modern examples of just how complicated system calls are to implement. To support the fast execution of system calls, the kernel's memory is mapped into each application using a memory layout technical called the 3:1 split, which refers to the three-to-one ratio of application memory to kernel memory.

To prevent an application from reading/writing kernel memory, which may or may not contain highly-sensitive information such as encryption keys and passwords, modern CPU architectures provide a mechanism to lock down the kernel portion of this memory, such that only the kernel is capable of seeing it all. The application is only able to see its deprivileged portion of that memory.

To improve the performance of these modern CPUs, most architectures, including Intel, AMD, and ARM, incorporate a technology called speculative execution. For example, look at the following code:

if (x) {


The CPU doesn't know whether x is true or false until it executes this instruction. If the CPU assumes that x is true, it can enhance performance by saving some CPU cycles. If x does, in fact, end up being true, the CPU saves cycles, whereas if x is actually false, the penalty is usually worth the risk, especially if the CPU can make an educated guess as to the likelihood of x being true instead of false (for example, if the CPU executed this statement in the past and x was true).

This type of optimization is called speculative execution. The CPU is executing code, even though it's possible the code may later turn out to be invalid and need to be undone.

Speculative execution attacks such as Meltdown and Spectre exploit this process to bypass the memory protections that protect the system call interface between an application and its kernel. This is done by convincing the CPU to speculatively execute an instruction that would typically cause a security violation (for example, attempting to read a password from kernel memory).

If the CPU speculatively executes this type of instruction, there will be a gap between the CPU loading the password into the CPU's cache, and the CPU figuring out that a security violation has occurred. If the CPU is interrupted during this gap (using what is called a transient instruction), the password will be left in the CPU's cache, even though the instruction never actually completed its execution.

To recover the password from the cache, attackers leverage additional attacks on the CPU called side-channel attacks, which are specifically designed to read the contents of a CPU's cache without performing a direct memory operation.

The end result is that an attacker is capable of setting up an elaborate set of conditions that will eventually allow them to recover sensitive information stored in the kernel, using nothing more than an unprivileged application (which could be a website you happened to click on while looking for cat videos).

If this seems complicated, that's because it is. These types of attacks are extremely sophisticated. The goal of these examples is to provide a brief overview of why system calls are not without their issues. Depending on the CPU and operating system you're executing on, you might have to take special care when handling sensitive information while system programming.


Benefits of using C++ when system programming

Although the focus of this book is on system programming and not C++, and we do provide a lot of examples in C, there are several benefits to system programming in C++ compared to standard C.

Note that this section assumes some general knowledge of C++. A more complete explanation of the C++ standard will be provided in Chapter 2, Learning the C, C++17, and POSIX Standards.

Type safety in C++

Standard C is not a type-safe language. Type safety refers to protections put in place to prevent one type from being confused with another type. Some languages, such as ADA, are extremely type-safe, providing so many protections that the language, at times, can be frustrating to work with.

Conversely, languages such as C are so type-unsafe that hard-to-find type errors occur frequently, and often lead to instability.

C++ provides a compromise between the two approaches, encouraging reasonable type safety by default, while providing mechanisms to circumvent this when needed.

For example, consider the following code:

/* Example: C */
int *p = malloc(sizeof(int));

// Example: C++
auto p = new int;

Allocating an integer on the heap in C requires the use of malloc(), which returns void *. There are several issues with this code that are addressed in C++:

  • C automatically converts the void * type to int *, meaning that an implicit type conversion has occurred even though there is no connection between the type the user stated and the type returned. The user could easily allocate short (which is not the same thing as int, a topic we will discuss in Chapter 3, System Types for C and C++). The type conversion would still be applied, meaning that the compiler would not have the proper context to detect that the allocation was not large enough for the type the user was attempting to allocate.
  • The size of the allocation must be stated by the programmer. Unlike C++, C has no understanding of the type that is being allocated. Thus, it is unaware of the size of the type, and so the programmer must explicitly state this. The problem with this approach is that hard-to-find allocation bugs can be introduced. Often, the type that is provided to sizeof() is incorrect (for example, the programmer might provide a pointer instead of the type itself, or the programmer might change the code later on, but forget to change the value being provided to sizeof()). As stated previously, there is no connection between what malloc() allocates and returns, and the type the user attempts to allocate, providing an opportunity to introduce a hard-to-find logic error.
  • The type must be explicitly stated twice. malloc() returns void *, but C implicitly converts to whatever pointer type the user stateswhich means a type has been declared twice (in this case, void * and int *). In C++, the use of auto means that the type is only declared once (in this case, int states the type is an int *), and auto will take on whatever type is returned. The use of auto and the removal of implicit type conversions means whatever type is declared in the allocation is what the p variable will take on. If the code after this allocation expects a different type to the one p takes on, the compiler will know about it at compile time in C++, while a bug like this would likely not be caught in C until runtime, when the program crashes (we hope this code is not controlling an airplane!).

In addition to the preceding example of the dangers of implicit type casting, C++ also provides run-time type information (RTTI). This information has many uses, but the most important use case involves the dynamic_cast<> operator, which performs runtime type checking.

Specifically, converting from one type to another can be checked during runtime, to ensure a type error doesn't occur. This is often seen when performing the following:

  • Polymorphic type conversions: In C, polymorphism is possible, but it must be done manually, a pattern that is seen often in kernel programming. C, however, doesn't have the ability to determine whether a pointer was allocated for a base type or not, resulting in the potential for a type error. Conversely, C++ is capable of determining at runtime whether a provided pointer is being cast to the proper type, including when using polymorphism.
  • Exception support: When catching an exception, C++ uses RTTI (essentially dynamic_cast<>), to ensure that the exception being thrown is caught by the proper handler.

Objects of C++

Although C++ supports object-oriented programming with built-in constructs, object-oriented programming is a design pattern that is often used in C as well, and in POSIX in general. Take the following example:

/* Example: C */

struct point
int x;
int y;

void translate(point *p; int val)
if (p == NULL) {

p->x += val;
p->y += val;

In the preceding example, we have a struct that stores a point{}, which contains x and y positions. We then offer a function that is capable of translating this point{} in both the x and y positions, using a given value (that is, a diagonal translation).

There are a couple of notes with respect to this example:

  • Often, people will claim to dislike object-oriented programming, but then you see this sort of thing in their code, which is, in fact, an object-oriented design. The use of class isn't the only way to create an object-oriented design. The difference with C++ is that the language provides additional constructs for cleanly and safely working with objects, while with C this same functionality must be done by handa process that is prone to error.
  • The translate() function is only related to the point{} object because it takes a point{} as a parameter. As a result, the compiler has no contextual information to understand how to manipulate a point{} struct, without translate() being given a pointer to it as a parameter. This means that every single public-facing function that wishes to manipulate a point{} struct must take a pointer to it as its first parameter, and verify that the pointer is valid. Not only is this a clunky interface, it's slow.

In C++, the preceding example can be written as the following:

// Example: C++

struct point
int x;
int y;

void translate(int val)
p->x += val;
p->y += val;

In this example, a struct is still used. The only difference between a class and a struct in C++ is that all variables and functions are public by default with a struct, while they are private by default with a class.

The difference is that the translate() function is a member of point{}, which means it has access to the contents of its structure, and so no pointers are needed to perform the translation. As a result, this code is safer, more deterministic, and easier to reason about, as there is never the fear of a null dereference.

Finally, objects in C++ provide construction and destruction routines that help prevent objects from not being properly initialized or properly deconstructed. Take the following example:

// Example: C++

struct myfile
int fd{0};

~myfile() {

In the preceding example, we create a custom file object that holds a file descriptor, often seen and used when system programming with POSIX APIs.

In C, the programmer would have to remember to manually set the file descriptor to 0 on initialization, and close the file descriptor when it is no longer in scope. In C++, using the preceding example, both of these operations would be done for you any time you use myfile.

This is an example of the use of Resource Acquisition Is Initialization (RAII), a topic that will be discussed in more detail in Chapter 4, C++, RAII, and the GSL Refresher, as this pattern is used a lot by C++. We will leverage this technique when system programming to avoid a lot of common POSIX-style pitfalls.

Templates used in C++

Template programming is often an undervalued, misunderstood addition to C++ that is not given enough credit. Most programmers need to look no further than attempting to create a generic linked list to understand why.

C++ templates provides you with the ability to define your code without having to define type information ahead of time.

One way to create a linked list in C is to use pointers and dynamic memory allocation, as seen in this simple example:

struct node 
void *data;
node next;

void add_data(node *n, void *val);

In the preceding example, we store data in the linked list using void *. An example of how to use this is as follows:

node head;
add_data(&head, malloc(sizeof(int)));
*(int*)head.data = 42;

There are a few issues with this approach:

  • This type of linked list is clearly not type-safe. The use of the data and the data's allocation are completely unrelated, requiring the programmer using this linked list to manage all of this without error.
  • A dynamic memory allocation is needed for both the nodes and the data. As was discussed earlier, memory allocations are slow as they require system calls.
  • In general, this code is hard to read and clunky.

Another way to create a generic linked list is to use macros. There are several implementations of these types of linked lists (and other data structures) floating around on the internet, which provide a generic implementation of a linked list without the need for dynamically allocating data. These macros provide the user with a way to define the data type the linked list will manage at compile time.

The problem with these approaches, other than reliability, is these implementations use macros to implement template programming in a way that is far less elegant. In other words, the solution to adding generic data structures to C is to use C's macro language to manually implement template programming. The programmer would be better off just using C++ templates.

In C++, a data structure like a linked list can be created without having to declare the type the linked list is managing until it is declared, as follows:

template<typename T>
class mylinked_list
struct node
T data;
node *next;




node m_head;

In the preceding example, not only are we able to create a linked list without macros or dynamic allocations (and all the problems that come with the use of void * pointers), but we are also able to encapsulate the functionality, providing a cleaner implementation and user API.

One complaint that is often made about template programming is the amount of code it generates. Most code bloat from templates typically originates as a programming error. For example, a programmer might not realize that integers and unsigned integers are not the same types, resulting in code bloat when templates are used (as a definition for each type is created).

Even aside from that issue, the use of macros would produce the same code bloat. There is no free lunch. If you want to avoid the use of dynamic allocation and type casting while still providing generic algorithms, you have to create an instance of your algorithm for each type you plan to use. If reliability is your goal, allowing the compiler to generate the code needed to ensure your program executes properly outweighs the disadvantages.

Functional programming associated with C++

Functional programming is another addition to C++ that provides the user with compiler assistance, in the form of lambda functions. Currently, this must be carried out by hand in C.

In C, a functional programming construct can be achieved using a callback. For example, consider the following code:

guard(void (*ptr)(int *val), int *val)

inc(int *val)

dec(int *val)

int count = 0;
guard(inc, &count);
guard(dec, &count);

In the preceding code example, we create a guard function that locks a mutex, calls a function that operates on a value, and then unlocks the mutex on exit. We then create two functions, one that increments a value given to it, and one that decrements a value given to it. Finally, we create a function that instantiates a count, and then increments the count and decrements the count using the guard function.

There are a couple of issues with this code:

  • The first issue is the need for pointer logic to ensure we can manipulate the variable we wish to operate on. We are also required to manually pass this pointer around to keep track of it. This makes the APIs clunky, as we have a lot of extra code that we have to write manually for such a simple example.
  • The function signature of the helper functions is static. The guard function is a simple one. It locks a mutex, calls a function, and then unlocks it. The problem is that, since the parameters of the function must be known while writing the code instead of at compile time, we cannot reuse this function for other tasks. We will need to hand-write the same function for each function signature type we plan to support.

The same example can be written using C++ as follows:

template<typename FUNC>
guard(FUNC f)

int count = 0;
guard(inc, [&]{ count++ });
guard(inc, [&]{ count-- });

In the preceding example, the same functionality is provided, but without the need for pointers. In addition, the guard function is generic and can be used for more than one case. This is accomplished by leveraging both template programming and functional programming.

The lambda provides the callback, but the parameters of the callback are encoded into the lambda's function signature, which is absorbed by the use of a template function. The compiler is capable of generating a version of the guard function for use that takes the parameters (in this case, a reference to the count variable) and storing it in the code itself, removing the need for users to do this by hand.

The preceding example will be used a lot in this book, especially when creating benchmarking examples, as this pattern gives you the ability to wrap functionality in code designed to time the execution of your callback.

Error handling mechanism in C++

Error handling is another issue with C. The problem, at least until set jump exceptions were added, was that the only ways to get an error code from a function were as follows:

  • Constrain the output of a function, so that certain output values from the function could be considered an error
  • Get the function to return a structure, and then manually parse that structure

For example, consider the following code:

struct myoutput 
int val;
int error_code;

struct myoutput myfunc(int val)
struct myoutput = {0};

if (val == 42) {
myoutput.error_code = -1;

myoutput.val = val;
return myoutput;

struct myoutput = myfunc(42);

if (myoutput.error_code == -1) {

The preceding example provides a simple mechanism for outputting an error from a function without having to constrain the output of the function (for example, by assuming that -1 is always an error).

In C++, this can be implemented using the following C++17 logic:

std::pair<int, int>
myfunc(int val)
if (val == 42) {
return {0, -1};

return {val, 0};

if (auto [val, error_code] = myfunc(42); error_code == -1) {

In the preceding example, we were able to remove the need for a dedicated structure by leveraging std::pair{}, and we were able to remove the need to work with std::pair{} by leveraging an initializer_list{} and C++17-structured bindings.

There is, however, an even easier method for handling errors without the need for checking the output of every function you execute, and that is to use exceptions. C provides exceptions through the set jump API, while C++ provides C++ exception support. Both of these will be discussed at length in Chapter 13, Error - Handling with Exceptions.

APIs and C++ containers in C++

As well as the language primitives that C++ provides, it also comes with a Standard Template Library (STL) and associated APIs that greatly aid system programming. A good portion of this book will focus on these APIs, and how they support system programming.

It should be noted that the focus of this book is system programming and not C++, and for this reason, we do not cover C++ containers in any detail, but instead assume the reader has some general knowledge of what they are and how they work. With that said, C++ containers support system programming by preventing the user from having to re-write them manually.

We teach students how to write their own data structures, not so that when they need a data structure they know how to write one, but instead so that, when they need one, they know which data structure to use and why. C++ already provides most, if not all, of the data structures you might need when system programming.



In this chapter, we learned what system programming is. We covered the general anatomy of a system call, different types of system calls, and some recent security issues with system calls.

In addition, we covered the advantages of system programming with C++ instead of strictly using standard C. In the next chapter, we will cover the C, C++, and POSIX standards in detail and how they relate to system programming.



  1. What is system programming?
  2. Prior to fast system calls, how were system calls executed?
  3. What key change was made to support fast system calls?
  4. Does allocating memory always result in a system call?
  5. What type of execution do the Meltdown and Spectre attacks exploit?
  6. What is type safety?
  7. Provide at least one benefit to template programming in C++?
About the Author
  • Dr. Rian Quinn

    Dr. Rian Quinn is the Chief Technology Officer (CTO) in the Advanced Technologies Business Unit at Assured Information Security, Inc., having focused on trusted computing, hypervisor-related technologies, machine learning/artificial intelligence, and cybersecurity for more than 10 years, and has 9 years of technical management and business development experience. He holds a Ph.D. in computer engineering, with specializations in information assurance and computer architectures, from Binghamton University. He is the cofounder and lead developer of the Bareflank hypervisor, and is an active member of several open source projects, including Microsoft's Guideline Support Library (GSL) and OpenXT. Rian previously wrote Hands-On System Programming with C++.

    Browse publications by this author
Hands-On System Programming with C++
Unlock this book and the full library FREE for 7 days
Start now