Modern CMake for C++

Product type: Book
Published in: Feb 2022
Publisher: Packt
ISBN-13: 9781801070058
Edition: 1st Edition
Author: Rafał Świdziński

Rafał Świdziński works as a staff engineer at Google. With over 10 years of professional experience as a full stack developer, he has been able to experiment with a vast multitude of programming languages and technologies. During this time, he has been building software under his own company and for corporations including Cisco Meraki, Amazon, and Ericsson. Originally from Łódź, Poland, he now lives in London, UK, from where he runs a YouTube channel, "Smok," discussing topics related to software development. He tackles technical problems, including real-life and work-related challenges encountered by many people in the field. Throughout his work, he explains the technical concepts in detail and demystifies the art and science behind the role of software engineer. His primary focus is on high-quality code and the craftsmanship of programming.

Chapter 6: Linking with CMake

You might think that after we have successfully compiled the source code into a binary file, our job as build engineers is done. That's almost the case – binary files contain all the code for a CPU to execute, but the code is scattered across multiple files in a very complex way. Linking is a process that simplifies things and makes machine code neat and quick to consume.

A quick glance at the list of commands will tell you that CMake doesn't provide that many related to linking. Admittedly, target_link_libraries() is the only one that actually configures this step. Why dedicate a whole chapter to a single command then? Unfortunately, almost nothing is ever easy in computer science, and linking is no exception.

To achieve the correct results, we need to follow the whole story – understand how exactly a linker works and get the basics right. We'll talk about the internal structure of object files, how the relocation and...

Technical requirements

You can find the code files that are present in this chapter on GitHub at https://github.com/PacktPublishing/Modern-CMake-for-Cpp/tree/main/examples/chapter06.

To build the examples provided in this book, always use the recommended commands:

cmake -B <build tree> -S <source tree>
cmake --build <build tree>

Be sure to replace the placeholders <build tree> and <source tree> with appropriate paths. As a reminder: the build tree is the path to the target/output directory, and the source tree is the path at which your source code is located.

Getting the basics of linking right

We discussed the life cycle of a C++ program in Chapter 5, Compiling C++ Sources with CMake. It consists of five main stages – writing, compiling, linking, loading, and execution. After correctly compiling all the sources, we need to put them together into an executable. Object files produced in a compilation can't be executed by a processor directly. But why?

To answer this, let's take a look at how a compiler structures an object file in the popular ELF format (used by Unix-like systems and many others):

Figure 6.1 – The structure of an object file

The compiler will prepare an object file for every translation unit (for every .cpp file). These files will be used to build an in-memory image of our program. Object files contain the following elements:

  • An ELF header identifying the target operating system, ELF file type, target instruction set architecture, and information on the position...

Building different library types

After source code is compiled, we might want to avoid compiling it again for the same platform, or even share it with external projects wherever possible. Of course, you could simply provide all of your object files as they were originally created, but that has a few downsides: multiple files are harder to distribute and have to be added to a buildsystem individually, which becomes a hassle when they are numerous. Instead, we could bundle all object files into a single file (a library) and share that. CMake helps greatly with this process. We can create such libraries with the add_library() command and consume them with the target_link_libraries() command. By convention, all libraries have a common prefix, lib, and use system-specific extensions that denote what kind of library they are:

  • A static library has a .a extension on Unix-like systems and .lib on Windows.
  • Shared libraries have a .so extension on Unix-like systems and .dll...
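
As a rough sketch of how these library types can be declared, the following CMakeLists.txt builds the same sources once as a static library and once as a shared library, then links an executable against one of them. The project name, the target names (calc_static, calc_shared, app), and the source file paths are hypothetical and are not taken from the chapter's repository.

```cmake
cmake_minimum_required(VERSION 3.20)
project(LibraryTypes CXX)

# Hypothetical sources; replace them with your own files.
set(CALC_SOURCES calc/add.cpp calc/subtract.cpp)

# Static library: an archive of object files
# (libcalc_static.a on Unix-like systems).
add_library(calc_static STATIC ${CALC_SOURCES})

# Shared library: loaded when the program starts
# (libcalc_shared.so on Linux).
add_library(calc_shared SHARED ${CALC_SOURCES})

# The executable consumes a library through target_link_libraries().
add_executable(app main.cpp)
target_link_libraries(app PRIVATE calc_static)
```

If the STATIC or SHARED keyword is omitted, CMake chooses the type based on the BUILD_SHARED_LIBS variable, which lets the person building the project make that decision.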

Solving problems with the One Definition Rule

Phil Karlton was right on point when he said the following:

"There are two hard things in computer science: cache invalidation and naming things."

Names are difficult for a few reasons – they have to be precise, simple, short, and expressive at the same time. That makes them meaningful and allows programmers to understand the concepts behind the raw implementation. C++ and many other languages impose one more requirement – many names have to be unique.

This is manifested in a few different ways. A programmer is required to follow the One Definition Rule (ODR). This says that in the scope of a single translation unit (a single .cpp file), a name (of a variable, function, class type, enumeration, concept, or template) may be defined no more than once, even if it is declared multiple times.

This rule is extended to the scope of an entire program for all variables you effectively use in your code and non-inlined...

The order of linking and unresolved symbols

A linker can often seem whimsical and start complaining about things for no apparent reason. This is an especially difficult ordeal for programmers starting out who don't know their way around this tool. It's no wonder, since they usually try to avoid touching build configuration for as long as they possibly can. Eventually, they're forced to change something (perhaps add a library they worked on) in the executable, and all hell breaks loose.

Let's consider a fairly simple dependency chain – the main executable depends on the outer library, which depends on the nested library (containing the necessary int b variable). Suddenly, an inconspicuous message appears on the programmer's screen:

outer.cpp:(.text+0x1f): undefined reference to 'b'

This isn't such a rare diagnostic – usually, it means that we forgot to add a necessary library to the linker. But in this case, the library...
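
One way to describe the chain above in CMake is to attach each dependency to the target that actually needs it and let CMake order the libraries on the linker command line. This is a minimal sketch, assuming static libraries and hypothetical file names (main.cpp and nested.cpp; outer.cpp is the file named in the diagnostic). It is not necessarily the resolution the chapter goes on to describe.

```cmake
cmake_minimum_required(VERSION 3.20)
project(LinkOrder CXX)

# nested.cpp (hypothetical name) defines the variable: int b
add_library(nested STATIC nested.cpp)

# outer.cpp refers to b, so outer declares its own dependency on nested.
add_library(outer STATIC outer.cpp)
target_link_libraries(outer PRIVATE nested)

# main only names its direct dependency. CMake follows the chain and
# lists outer before nested when it links the executable.
add_executable(main main.cpp)
target_link_libraries(main PRIVATE outer)
```

The ordering matters because linkers such as GNU ld process static libraries from left to right: a library has to appear after the objects that reference its symbols, otherwise those references stay unresolved.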

Separating main() for testing

As we have established so far, a linker enforces the ODR and makes sure that every external symbol is matched with its definition during linking. One interesting problem that we might encounter is how to test the build correctly.

Ideally, we should test exactly the same source code that is being run in production. An exhaustive testing pipeline should build the source code, run its tests on the produced binary, and only then package and distribute the executable (without the tests themselves).

But how do we actually make this happen? Executables have a very specific flow of execution, which often requires reading command-line arguments. C++'s compiled nature doesn't really support pluggable units that can be temporarily injected into the binary for test purposes only. It seems like we'll need a very complex approach to solve this.

Luckily, we can use a linker to help us deal with this in an elegant manner. Consider extracting all logic...
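
A possible layout for this idea is sketched below, under the assumption that all program logic is moved out of main() into a library that both the production executable and the tests link against. The target names (sut, app, unit_tests) and the file names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.20)
project(SeparateMain CXX)

# All production logic lives in a library (hypothetical file names).
add_library(sut STATIC src/calc.cpp src/run.cpp)
target_include_directories(sut PUBLIC src)

# Production executable: bootstrap.cpp is assumed to contain only main(),
# which forwards to a function defined in the library.
add_executable(app src/bootstrap.cpp)
target_link_libraries(app PRIVATE sut)

# Test executable: brings its own main() and links the very same code.
enable_testing()
add_executable(unit_tests test/run_test.cpp)
target_link_libraries(unit_tests PRIVATE sut)
add_test(NAME unit_tests COMMAND unit_tests)
```

Because each executable supplies exactly one main() and the library supplies none, the linker sees no duplicate definitions, and the tests exercise the same object code that ships in production.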

Summary

Linking in CMake may seem simple and insignificant, but in reality, there's much more to it than meets the eye. After all, linking executables isn't as simple as putting puzzle pieces together. As we learned about the structure of object files and libraries, we discovered that things need to move around a bit before a program is runnable. These things are called sections, and they have distinct roles in the life cycle of the program – storing different kinds of data, instructions, symbol names, and so on. A linker needs to combine them in the final binary accordingly. This process is called relocation.

We also need to take care of symbols – resolve references across all the translation units and make sure that nothing's missing. Then, a linker can create the program header and add it to the final executable. It will contain instructions for the system loader, describing how to turn consolidated sections into segments that make up...

Further reading

For more information on the topics covered in this chapter, you can refer to the following:

  • The structure of ELF files:

https://en.wikipedia.org/wiki/Executable_and_Linkable_Format

  • The CMake manual for add_library():

https://cmake.org/cmake/help/latest/command/add_library.html

  • Dependency hell:

https://en.wikipedia.org/wiki/Dependency_hell

  • The differences between modules and shared libraries:

https://stackoverflow.com/questions/4845984/difference-between-modules-and-shared-libraries
