You're reading from Building Low Latency Applications with C++

Product type Book

Published in Jul 2023

Publisher Packt

ISBN-13 9781837639359

Pages 506 pages

Edition 1st Edition

Languages

Concepts

Programming Language

Author (1):

Sourav Ghosh

Understanding why C++ is the preferred programming language

There are several high-level language choices when it comes to low latency applications, including Java, Scala, Go, and C++. In this section, we will discuss why C++ is one of the most popular of these. We will look at the characteristics of the language that provide the high-level constructs needed to support large code bases. The power of C++ is that it also provides very low-level access, similar to the C programming language, which allows a very high level of control and optimization.

Compiled language

C++ is a compiled language and not an interpreted language. A compiled language is a programming language where the source code is translated ahead of time into a machine code binary that is ready to run on a specific architecture. Examples of compiled languages are C, C++, Haskell, Rust, and Go. The alternative to compiled languages is interpreted languages. Interpreted languages are different in the sense that the program is run by an interpreter, which steps through the source statement by statement and executes each one. Some examples of interpreted languages are Ruby, Python, and JavaScript.

Interpreted languages are inherently slower than compiled languages because the translation into machine instructions happens at runtime instead of at compile time. However, with the development of just-in-time (JIT) compilation, the gap is not as large as it once was. For compiled languages, the code is already built for the target hardware, so there is no extra interpretation step at runtime. C++ also gives developers a lot of control over the hardware, which means competent developers can optimize things such as memory management, CPU usage, cache performance, and so on. Additionally, since compiled languages are converted into machine code for specific hardware at compile time, the code can be optimized to a large degree for that hardware. Hence, compiled languages in general, and especially C++, are typically faster and more efficient to execute.

Closer to hardware – low-level language

Compared to other popular programming languages such as Python and Java, C++ is a lower-level language that sits much closer to the hardware. This is especially useful when the software is tightly coupled with the target hardware it runs on, or when direct low-level support is required. Being close to the hardware also means that there is a significant speed advantage when building systems in C++. Especially in low latency applications such as high-frequency trading (HFT), where a few microseconds can make a huge difference, C++ is generally the established gold standard in the industry.

We will discuss an example of how being closer to the hardware helps boost C++ performance over another language such as Java. A C/C++ pointer is the actual address of an object in memory. So, the software can access memory and objects in memory directly without needing extra abstractions that would slow it down. This, however, does mean that the application developer will often have to explicitly manage the creation, ownership, destruction, and lifetime of objects instead of relying on the programming language to manage these things, as in Python or Java. An extreme case of C++ being close to the hardware is that it is possible to embed assembly instructions directly in C++ code (inline assembly); we will see an example of this in later chapters.
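
To make the pointer discussion concrete, here is a minimal sketch of reading and writing memory directly through raw pointers. The helper names `sum_prices` and `apply_tick` are hypothetical and exist only for illustration; they are not taken from the book's code base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sums `count` prices by walking raw memory with a pointer.
// A pointer holds the address of an object, so each access is a plain memory load
// with no bounds check or extra runtime indirection in between.
inline std::int64_t sum_prices(const std::int64_t* prices, std::size_t count) {
    assert(prices != nullptr || count == 0);
    std::int64_t total = 0;
    for (const std::int64_t* p = prices; p != prices + count; ++p) {
        total += *p;  // dereference: read the value stored at address p
    }
    return total;
}

// Hypothetical helper: writes through a pointer, modifying the caller's object in place.
inline void apply_tick(std::int64_t* price, std::int64_t delta) {
    assert(price != nullptr);
    *price += delta;
}
```

For an array holding 100, 101, and 99, `sum_prices` returns 300, and `apply_tick(&prices[0], 5)` changes the first element to 105 without copying anything. The flip side is that nothing stops the caller from passing a wrong `count`, which is exactly the kind of responsibility the paragraph above describes.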

Deterministic usage of resources

It is critical for low latency applications to use resources very efficiently. Embedded systems (which often have real-time requirements) are especially limited in time and memory resources. In languages such as Java and Python that rely on automatic garbage collection, there is an element of non-determinism; that is, the garbage collector can introduce large pauses at unpredictable times. Additionally, for systems that are very limited in memory, low-level languages such as C and C++ can do special things such as placing data in custom sections or at specific addresses in memory through pointers. In languages such as C and C++, the programmer is in charge of explicit creation, management, and deallocation of memory resources, allowing for deterministic and efficient use of resources.
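
One common way to get this determinism is to reserve all memory up front and construct objects inside it with placement new, so the hot path never calls the general-purpose heap allocator. The following is a minimal sketch under that assumption; `FixedPool` and `Order` are hypothetical names introduced here, and the sketch is single-threaded and omits many details a production pool would need.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical payload type used only for illustration.
struct Order {
    long id;
    long price;
};

// Hypothetical fixed-capacity pool: storage for N objects is part of the pool itself,
// so allocate() and deallocate() do a bounded amount of work and never touch the heap.
// Note: objects still alive when the pool is destroyed are not destructed in this sketch.
template <typename T, std::size_t N>
class FixedPool {
    static_assert(N > 0, "pool must hold at least one object");

public:
    FixedPool() {
        // Initially every slot is free; the free list is a stack of slot indices.
        for (std::size_t i = 0; i < N; ++i) free_slots_[i] = i;
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Constructs a T in a free slot using placement new; returns nullptr when full.
    template <typename... Args>
    T* allocate(Args&&... args) {
        if (free_count_ == 0) return nullptr;
        const std::size_t slot = free_slots_[--free_count_];
        return new (&storage_[slot * sizeof(T)]) T(std::forward<Args>(args)...);
    }

    // Runs the destructor explicitly and returns the slot to the free list.
    void deallocate(T* obj) {
        assert(obj != nullptr);
        const auto offset = reinterpret_cast<unsigned char*>(obj) - storage_;
        obj->~T();
        free_slots_[free_count_++] = static_cast<std::size_t>(offset) / sizeof(T);
    }

    std::size_t available() const { return free_count_; }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];  // raw, correctly aligned bytes
    std::size_t free_slots_[N];
    std::size_t free_count_ = N;
};
```

With a `FixedPool<Order, 2>`, two calls to `allocate` succeed and a third returns `nullptr` until a slot is handed back with `deallocate`. The design choice here is to trade flexibility (a hard capacity limit) for predictable allocation cost.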

Speed and high performance

C++ is typically faster than most other programming languages for the reasons we have already discussed. It also provides excellent concurrency and multithreading support. This is another valuable feature when it comes to developing applications that are latency-sensitive or even latency-critical. Such requirements are also often found in servers that are under heavy load, such as web servers, application servers, database servers, trading servers, and so on.

Another advantage of C++ is its compile-time optimization ability. C++ supports features such as macros and preprocessor directives (inherited from C), the constexpr specifier, and template metaprogramming. These allow us to move a large part of the processing from runtime to compile time. Basically, this means we minimize the work done during runtime on the critical code path by moving a lot of the processing to the compilation step when building the machine code binary. We will discuss these features heavily in later chapters when we build a complete electronic trading system, and their benefits will become very clear at that point.
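
As a small illustration of moving work to compile time, the sketch below computes a fixed-point scaling factor with `constexpr` and a factorial with template metaprogramming. The names `power_of_ten`, `kPriceScale`, `to_fixed_point`, and `Factorial` are hypothetical and chosen only for this example.

```cpp
#include <cstdint>

// Hypothetical helper: 10^exp, usable in constant expressions.
constexpr std::int64_t power_of_ten(unsigned exp) {
    return exp == 0 ? 1 : 10 * power_of_ten(exp - 1);
}

// Evaluated by the compiler; the binary simply contains the constant 10000.
constexpr std::int64_t kPriceScale = power_of_ten(4);
static_assert(kPriceScale == 10000, "computed during compilation");

// Converts whole currency units to fixed-point ticks using the precomputed scale.
constexpr std::int64_t to_fixed_point(std::int64_t whole_units) {
    return whole_units * kPriceScale;
}

// Template metaprogramming variant: the result is computed while instantiating types.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

static_assert(Factorial<5>::value == 120, "computed during compilation");
```

The `static_assert` lines are checked while compiling, so if either computation were wrong the program would fail to build rather than misbehave at runtime.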

Language constructs and features

The C++ language itself is a perfect combination of flexibility and feature richness. It allows a lot of freedom for the developers, who can leverage it to tune applications down to a very low level. However, it also provides a lot of higher-level abstractions, which can be used to build very large, feature-rich, versatile, and scalable applications, while still being extremely low latency when required. In this section, we will explore some of those C++-specific language features that put it in a unique position of low-level control and high-level abstraction features.

Portability

First off, C++ is highly portable, and the same source code can be compiled for a lot of different operating systems, platforms, CPU architectures, and so on. Since it does not require a runtime interpreter that differs between platforms, all that is required is to build the correct binary for each target at compile time, which is relatively straightforward, and the resulting binary then runs natively on the platform it was built for. Additionally, some of the other features we have already discussed (such as the ability to run in low-memory environments and on weaker CPU architectures, combined with the lack of garbage collection requirements) make it even more portable than some of the other high-level languages.

Compiler optimizations

We have discussed that C++ is a compiled language, which makes it inherently faster than interpreted languages since it does not incur additional interpretation costs at runtime. Since the developer's complete source code is compiled into the final executable binary, compilers have an opportunity to holistically analyze all the objects and code paths. This leads to the possibility of very high levels of optimization at compile time. Modern compilers work closely with modern hardware to produce some surprisingly optimized machine code. The point here is that developers can focus on solving business problems and, assuming the C++ developers are competent, the compiled program is still extremely optimized without requiring a lot of the developer's time and effort. Since C++ also allows you to write inline assembly code, it gives developers an even greater chance to work with the compiler and produce highly optimized executables.

Statically typed

When it comes to type systems in programming languages, there are two options – statically typed language and dynamically typed language. A statically typed language performs checks around data types (integers, floats, doubles, structures, and classes) and interactions between these types during the compilation process. A dynamically typed language performs these type checks at runtime. Examples of statically typed languages are C++ and Java, and examples of dynamically typed languages are Python, Perl, and JavaScript.

One big benefit of statically typed languages is that since all the type-checking is done at compile time, it gives us the opportunity to find and eliminate many bugs before the program is even run. Obviously, type checking alone cannot find all possible bugs, but the point we’re trying to make here is that statically typed languages do a significantly better job at finding errors and bugs related to types at compile time. This is especially true for low latency applications that are highly numerical in nature.

Another huge benefit of statically typed languages, especially when it comes to low latency applications, is that since the type-checking is done at compile time, there is an additional opportunity for the compiler to optimize the types and type interactions at compile time. In fact, a large part of the speed advantage of compiled languages comes from static typing itself: because the types are known ahead of time, the compiler can choose the exact machine instructions and memory layout for each operation instead of checking types at runtime. This is also a big reason why, for a dynamically typed language such as Python, high-performance libraries such as NumPy store arrays and matrices with a fixed element type.
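
The compile-time checking described above can be sketched with two small wrapper types. `Price`, `Qty`, and `notional` are hypothetical names used only for illustration; the point is that mixing them up is rejected by the compiler rather than discovered in production.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical strong types: both wrap the same integer, but they are distinct types.
struct Price { std::int64_t value; };
struct Qty   { std::int64_t value; };

// Accepts only (Price, Qty) in that order. A call such as notional(Qty{3}, Price{100})
// does not compile, because there is no implicit conversion between the two types.
constexpr std::int64_t notional(Price p, Qty q) {
    return p.value * q.value;
}

// These checks run during compilation and cost nothing at runtime.
static_assert(!std::is_same<Price, Qty>::value, "Price and Qty are distinct types");
static_assert(!std::is_convertible<Qty, Price>::value, "no silent Qty -> Price conversion");
static_assert(std::is_same<decltype(notional(Price{1}, Qty{1})), std::int64_t>::value,
              "the result type is known at compile time");
```

Because the wrappers contain a single integer, a typical optimizing compiler is expected to generate the same machine code as for plain integers, so the extra safety should not add runtime cost.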

Multiple paradigms

Unlike some other languages, C++ does not force the developer to follow a specific programming paradigm. It supports a lot of different programming paradigms such as procedural, object-oriented programming (OOP), generic programming, functional programming, and so on. This makes it a good fit for a wide range of applications because it gives the developer the flexibility to design their program in a way that facilitates maximum optimization and lowest latencies instead of forcing a programming paradigm onto that application.
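
To illustrate, the same small task (adding up quantities) can be written in several of these styles side by side. This is only a sketch; the names `total_procedural`, `QtyAccumulator`, and `total_generic` are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Procedural style: a free function with an explicit loop.
inline long total_procedural(const std::vector<long>& qtys) {
    long total = 0;
    for (long q : qtys) total += q;
    return total;
}

// Object-oriented style: state and behavior bundled together in a class.
class QtyAccumulator {
public:
    void add(long qty) { total_ += qty; }
    long total() const { return total_; }

private:
    long total_ = 0;
};

// Generic style: works for any standard container whose elements support operator+.
template <typename Container>
typename Container::value_type total_generic(const Container& c) {
    return std::accumulate(c.begin(), c.end(), typename Container::value_type{});
}
```

All three produce 60 for the quantities 10, 20, and 30, and the generic version also works unchanged for a `std::vector<double>`. Which style to use is a design decision the language leaves to the developer.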

Libraries

Out of the box, C++ already comes with a large standard library (which also includes the C standard library) that provides a lot of data structures, algorithms, and abstractions for tasks such as the following:

Dynamic memory management
Numeric operations
Error and exception handling
String operations
Commonly needed algorithms
Input/output (I/O) operations including file operations
Multithreading support

Network programming is not part of the standard library itself; it is usually done through the operating system's socket APIs or a third-party library. The huge community of C++ developers has built and open-sourced many such libraries; we will discuss some of the most popular ones in the following subsections.

Standard Template Library

Standard Template Library (STL) is a very popular and widely used templatized library, implemented largely in headers, containing data structures and containers, iterators and allocators for these containers, and algorithms that operate on the containers for tasks such as sorting, searching, and so on.
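
As a brief illustration of how containers, iterators, and algorithms fit together, the sketch below sorts a `std::vector`, removes duplicates, and then searches it. The helper names `sort_unique` and `contains_sorted` are hypothetical; the standard library calls used are `std::sort`, `std::unique`, and `std::binary_search`.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sorts the prices in place and removes duplicate values.
inline void sort_unique(std::vector<int>& prices) {
    std::sort(prices.begin(), prices.end());
    // std::unique moves the unique elements to the front; erase drops the leftovers.
    prices.erase(std::unique(prices.begin(), prices.end()), prices.end());
}

// Hypothetical helper: returns true if `price` is present.
// Precondition: `sorted_prices` is sorted in ascending order.
inline bool contains_sorted(const std::vector<int>& sorted_prices, int price) {
    return std::binary_search(sorted_prices.begin(), sorted_prices.end(), price);
}
```

Starting from the values 103, 101, 102, 101, `sort_unique` leaves 101, 102, 103 in the vector, after which `contains_sorted` finds 102 and does not find 104. The algorithms only see iterators, which is why the same calls would work on other random-access containers.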

Boost

Boost is a large C++ library that provides support for multithreading, network operations, image processing, regular expressions (regex), linear algebra, unit testing, and so on.

Asio

Asio (asynchronous input/output) is another well-known and widely used library that comes in two versions: a standalone (non-Boost) version and one that is part of the Boost library. It supports implementing and using the asynchronous I/O model for networking and other low-level I/O, works well with multithreaded programs, and is portable to all major platforms.

GNU Scientific Library

GNU Scientific Library (GSL) is a numerical library written in C that can be used from C++. It provides support for a wide range of mathematical concepts and operations such as complex numbers, matrices, and calculus, along with many other functions.

Active Template Library

Active Template Library (ATL) is a template-heavy C++ library to help program the Component Object Model (COM). It is developed by Microsoft as a lighter-weight alternative to the Microsoft Foundation Classes (MFC) library for COM development and is distributed with Visual Studio. It heavily uses an important low latency C++ feature, the Curiously Recurring Template Pattern (CRTP), which we will also explore and use heavily in this book. It supports COM features such as dual interfaces, ActiveX controls, connection points, tear-off interfaces, COM enumerator interfaces, and a lot more.
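
Since CRTP comes up here, a minimal sketch of the pattern may help. It is not ATL code; `OrderHandler`, `DoublingHandler`, `on_order`, and `dispatch` are hypothetical names used only to show the mechanism, which is a base class template that is parameterized on its own derived class.

```cpp
// Hypothetical CRTP base: it calls into the derived class without virtual functions.
template <typename Derived>
class OrderHandler {
public:
    int handle(int qty) {
        // The cast is resolved at compile time, so the call below is a direct
        // (and inlinable) call instead of a lookup through a virtual table.
        return static_cast<Derived*>(this)->on_order(qty);
    }
};

// The derived class passes itself as the template argument (the "curious" recursion).
class DoublingHandler : public OrderHandler<DoublingHandler> {
public:
    int on_order(int qty) { return qty * 2; }
};

// Code written against the base template works with any conforming handler.
template <typename H>
int dispatch(OrderHandler<H>& handler, int qty) {
    return handler.handle(qty);
}
```

Calling `handle(21)` on a `DoublingHandler` returns 42. Compared with a virtual function, the concrete type has to be known at compile time, which is the trade-off made in exchange for avoiding runtime dispatch.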

Eigen

Eigen is a powerful C++ library for mathematical and scientific applications. It has functions for linear algebra, numerical methods and solvers, numeric types such as complex numbers, features and operations for geometry, and much more.

LAPACK

Linear Algebra Package (LAPACK) is another large and extremely powerful library, written in Fortran and commonly used from C++ through C interfaces such as LAPACKE, specifically for linear algebra and linear equations and to support routines for large matrices. It implements a lot of functionality such as solving simultaneous linear equations, least squares methods, eigenvalues, singular value decomposition (SVD), and many more applications.

OpenCV

Open Source Computer Vision (OpenCV) is one of the most well-known C++ libraries when it comes to computer vision and image processing applications. It is also available for Java and Python and provides many algorithms for face and object recognition, 3D models, machine learning, deep learning, and more.

mlpack

mlpack is a super-fast, header-only C++ library for a wide variety of machine learning models and the mathematical operations related to them. It also has support for other languages such as Go, Julia, R, and Python.

Qt

Qt is one of the most popular frameworks when it comes to building cross-platform graphical programs in C++. It works on Windows, Linux, macOS, and even platforms such as Android and embedded systems. It is available under both open source and commercial licenses and provides widgets and other components for building GUIs.

Crypto++

Crypto++ is a free open source C++ library to support algorithms, operations, and utilities for cryptography. It has many cryptographic algorithms, random number generators, block ciphers, hash functions, public-key operations, secret sharing, and more across many platforms such as Linux, Windows, macOS, iOS, and Android.

Suitable for big projects

In the previous section, we discussed the design and a lot of features of C++ that make it a great fit for low latency applications. Another aspect of C++ is that because of the flexibility it provides to the developer and all the high-level abstractions it allows you to build, it is actually very well suited to very large real-world projects. Huge projects such as compilers, cloud processing and storage systems, and operating systems are built in C++ for these reasons. We will dive into these and many other applications that try to strike a balance between low latency performance, feature richness, and different business cases. Quite often, C++ is the perfect fit for developing such systems.

Mature and large community support

The C programming language was originally created in 1972. C++ grew out of C with Classes, which Bjarne Stroustrup began developing in 1979, and received its current name in 1983. C++ is a very mature language and is embedded extensively into many applications in many different business areas. Some examples from the C and C++ family are the Unix operating system and the Linux kernel (both written primarily in C), as well as Oracle MySQL, Microsoft Office, and Microsoft Visual Studio (written largely in C++). The fact that C++ has been around for 40 years means that most software problems have been encountered and solutions have been designed and implemented. C++ is also very popular and taught as part of most computer science degrees and, additionally, has a huge library of developer tools, third-party components, open source projects, libraries, manuals, tutorials, books, and so on dedicated to it. The bottom line is that there is a large amount of documentation, examples, and community support backing up new C++ developers and new C++ projects.

Language under active development

Even though C++ is 40 years old, it is still very much under active development. Ever since the first C++ version was commercially released in 1985, there have been multiple improvements and enhancements to the C++ standard and the language. In chronological order, C++98, C++03, C++11 (known as C++0x during its development), C++14, C++17, and C++20 have been released, and C++23 is being developed. Each version comes with improvements and new features. So, C++ is a powerful language and is constantly evolving with time and adding modern features. Here is a diagram showing the evolution of C++ over the years:

Figure 1.2 – Evolution of C++

Given the already mature state of the C++ programming language, super-fast speed, perfect combination of high-level abstractions and low-level hardware access and control, huge knowledge base, and developer community along with best practices, libraries, and tools, C++ is a clear pick for low latency application development.

In this section, we looked at the choice of the C++ programming language for low latency application development. We discussed the various characteristics, features, libraries, and community support that make it a great fit for these applications. It is no surprise that C++ is deeply embedded into most applications that have strict performance requirements. In the next section, we will look at a lot of different low latency applications in different business areas with the goal of understanding the similarities that such applications share.