Learn LLVM 12

By Kai Nacke

About this book

LLVM was built to bridge the gap between compiler textbooks and actual compiler development. It provides a modular codebase and advanced tools that help developers build compilers easily. This book provides a practical introduction to LLVM and gradually guides you through increasingly complex scenarios in building and working with compilers.

You’ll start by configuring, building, and installing LLVM libraries, tools, and external projects. Next, the book will introduce you to LLVM design and how it works in practice during each LLVM compiler stage: frontend, optimizer, and backend. Using a subset of a real programming language as an example, you will then learn how to develop a frontend and generate LLVM IR, hand it over to the optimization pipeline, and generate machine code from it. Later chapters will show you how to extend LLVM with a new pass and how instruction selection in LLVM works. You’ll also focus on Just-in-Time compilation issues and the current state of JIT-compilation support that LLVM provides, before finally going on to understand how to develop a new backend for LLVM.

By the end of this LLVM book, you will have gained real-world experience in working with the LLVM compiler development framework with the help of hands-on examples and source code snippets.

Publication date:
May 2021


Chapter 2: Touring the LLVM Source

The LLVM mono repository contains all the projects under the llvm-project root directory. All projects follow a common source layout. To use LLVM effectively, it is good to know what is available and where to find it. In this chapter, you will learn about the following:

  • The contents of the LLVM mono repository, covering the most important top-level projects
  • The layout of an LLVM project, showing the common source layout used by all projects
  • How to create your own projects using LLVM libraries, covering all the ways you can use LLVM in your own projects
  • How to target a different CPU architecture, showing the steps required to cross-compile to another system

Technical requirements

The code files for the chapter are available at https://github.com/PacktPublishing/Learn-LLVM-12/tree/master/Chapter02/tinylang

You can find the code in action videos at https://bit.ly/3nllhED


Contents of the LLVM mono repository

In Chapter 1, Installing LLVM, you cloned the LLVM mono repository. This repository contains all LLVM top-level projects. They can be grouped as follows:

  • LLVM core libraries and additions
  • Compilers and tools
  • Runtime libraries

In the next sections, we will take a closer look at these groups.

LLVM core libraries and additions

The LLVM core libraries are in the llvm directory. This project provides a set of libraries with optimizers and code generation for well-known CPUs. It also provides tools based on these libraries. The LLVM static compiler llc takes a file written in LLVM intermediate representation (IR), in either textual or bitcode form, as input and compiles it into either assembler output or a binary object file. Tools such as llvm-objdump and llvm-dwarfdump let you inspect object files, and those such as llvm-ar let you create an archive file from a set of object files. The project also includes tools that help with the development of LLVM itself...


Layout of an LLVM project

All LLVM projects follow the same idea of directory layout. To understand the idea, let's compare LLVM with GCC, the GNU Compiler Collection. GCC has provided mature compilers for decades for almost every system you can imagine. However, apart from the compilers themselves, there are no tools that take advantage of the code, because GCC was not designed for reuse. This is different with LLVM.

Each piece of functionality has a clearly defined API and is put in a library of its own. The clang project has (among others) a library to lex a C/C++ source file into a token stream. The parser library turns this token stream into an abstract syntax tree (also backed by a library). Semantic analysis, code generation, and even the compiler driver are provided as libraries. The well-known clang tool is only a small application linked against these libraries.

The advantage is obvious: when you want to build a tool that requires the abstract syntax tree (AST) of a C++ file...
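
As a rough illustration of this library-based design, the following Python sketch splits a toy expression language into a lexer function, a parser function, and two small tools that reuse them. It does not use clang or any LLVM API. The grammar and all names (lex, parse, evaluate, to_prefix) are hypothetical and were chosen only for this example.

```python
import re
from typing import List, Tuple, Union

Token = Tuple[str, str]      # (kind, text), for example ("number", "42")
Ast = Union[int, tuple]      # an int leaf or an (op, left, right) node

_TOKEN_RE = re.compile(r"\s*(?:(\d+)|([+*]))")


def lex(source: str) -> List[Token]:
    """'Lexer library': turn source text into a token stream."""
    source = source.rstrip()
    tokens: List[Token] = []
    pos = 0
    while pos < len(source):
        match = _TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        if match.group(1) is not None:
            tokens.append(("number", match.group(1)))
        else:
            tokens.append(("op", match.group(2)))
        pos = match.end()
    return tokens


def parse(tokens: List[Token]) -> Ast:
    """'Parser library': build an AST from a token stream.

    Toy grammar: expr = term {"+" term}; term = number {"*" number}.
    """
    pos = 0

    def number() -> Ast:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != "number":
            raise SyntaxError("expected a number")
        value = int(tokens[pos][1])
        pos += 1
        return value

    def chain(op: str, operand) -> Ast:
        # Parse operand {op operand} into a left-leaning tree.
        nonlocal pos
        node = operand()
        while pos < len(tokens) and tokens[pos] == ("op", op):
            pos += 1
            node = (op, node, operand())
        return node

    tree = chain("+", lambda: chain("*", number))
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


def evaluate(tree: Ast) -> int:
    """First 'tool' built on the AST: an interpreter."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs * rhs


def to_prefix(tree: Ast) -> str:
    """Second 'tool' reusing the same AST: a prefix-notation printer."""
    if isinstance(tree, int):
        return str(tree)
    op, left, right = tree
    return f"({op} {to_prefix(left)} {to_prefix(right)})"


if __name__ == "__main__":
    ast = parse(lex("1 + 2 * 3"))
    print(evaluate(ast))
    print(to_prefix(ast))
```

Both tools call the same lex and parse functions and add only their own logic on top. The text describes the same relationship, on a much larger scale, between the clang tool and the libraries it links against.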


Creating your own project using LLVM libraries

Based on the information in the previous section, you can now create your own project using LLVM libraries. The following sections introduce a small language called Tiny. The project will be called tinylang. This section defines the structure of such a project. Even though the tool in this section is only a Hello, world application, its structure has all the parts required for a real-world compiler.

Creating the directory structure

The first question is whether the tinylang project should be built together with LLVM (like clang), or whether it should be a standalone project that just uses the LLVM libraries. In the former case, it is also necessary to decide where to create the project.

Let's first assume that tinylang should be built together with LLVM. There are different options for where to place the project. The first solution is to create a subdirectory for the project inside the llvm-project directory. All projects in this directory...


Targeting a different CPU architecture

Today, many small computers with limited resources, such as the Raspberry Pi, are in use. Running a compiler on such a computer is often not possible or takes too long. Hence, a common requirement for a compiler is to generate code for a different CPU architecture. Creating an executable for a system other than the one the compiler runs on is called cross-compiling. In the previous section, you created a small example application based on the LLVM libraries. Now we will take this application and compile it for a different target.

With cross-compiling, there are two systems involved: the compiler runs on the host system and produces code for the target system. To denote the systems, the so-called triple is used. This is a configuration string that usually consists of the CPU architecture, the vendor, and the operating system. More information about the environment is often added. For example, the triple x86_64-pc-win32 is used for a Windows system running on...
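
To make the structure of a triple concrete, here is a minimal Python sketch that splits a fully spelled-out triple into its components. It is not LLVM's own triple handling, and it assumes that the architecture, vendor, and operating system are all present; real toolchains also accept shortened forms, which this sketch rejects. The names Triple and parse_triple are hypothetical, and aarch64-unknown-linux-gnu is an additional example that is not taken from the text (it is a common spelling for 64-bit ARM Linux systems).

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """Components of a target triple (illustrative type, not an LLVM class)."""
    arch: str
    vendor: str
    os: str
    environment: Optional[str] = None


def parse_triple(text: str) -> Triple:
    """Split a triple of the form arch-vendor-os[-environment].

    Assumption: the first three components are always present. Anything
    after the third dash is kept together as the environment part.
    """
    parts = text.split("-", 3)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected arch-vendor-os[-environment], got {text!r}")
    return Triple(*parts)


if __name__ == "__main__":
    host = parse_triple("x86_64-pc-win32")              # triple from the text
    target = parse_triple("aarch64-unknown-linux-gnu")  # assumed example target
    print("host:  ", host)
    print("target:", target)
```

In a cross-compilation setup, the host and the target are each described by one such string, and the two strings differ in at least one component.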



Summary

In this chapter, you learned about the projects that are part of the LLVM repository and the common layout used. You replicated this structure for your own small application, laying the foundation for more complex applications. You also learned how to cross-compile your application for another target architecture, which is one of the most demanding tasks in compiler construction.

In the next chapter, the sample language tinylang will be outlined. You will learn about the tasks a compiler has to do and where LLVM library support is available.

About the Author

  • Kai Nacke

    Kai Nacke is a professional IT architect currently living in Toronto, Canada. He holds a diploma in computer science from the Technical University of Dortmund, Germany. His diploma thesis about universal hash functions was recognized as the best of the semester.

    He has been working in the IT industry for more than 20 years and has extensive experience in the development and architecture of business and enterprise applications. In his current role, he evolves an LLVM/Clang-based compiler.

    For some years, he was the maintainer of LDC, the LLVM-based D compiler. He is the author of D Web Development, published by Packt. In the past, he was also a speaker in the LLVM developer room at FOSDEM.
