Modern C++ Programming Cookbook - Second Edition

By Marius Bancila

About this book

C++ has come a long way to be one of the most widely used general-purpose languages that is fast, efficient, and high-performance at its core.

The updated second edition of Modern C++ Programming Cookbook addresses the latest features of C++20, such as modules, concepts, coroutines, and the many additions to the standard library, including ranges and text formatting. The book is organized in the form of practical recipes covering a wide range of problems faced by modern developers.

The book also delves into the details of all the core concepts in modern C++ programming, such as functions and classes, iterators and algorithms, streams and the file system, threading and concurrency, smart pointers and move semantics, and many others. It goes into the performance aspects of programming in depth, teaching developers how to write fast and lean code with the help of best practices.

Furthermore, the book explores useful patterns and delves into the implementation of many idioms, including pimpl, named parameter, and attorney-client, teaching techniques such as avoiding repetition with the factory pattern. There is also a chapter dedicated to unit testing, where you are introduced to three of the most widely used libraries for C++: Boost.Test, Google Test, and Catch2.

By the end of the book, you will be able to effectively leverage the features and techniques of C++11/14/17/20 programming to enhance the performance, scalability, and efficiency of your applications.

Publication date: September 2020
Publisher: Packt
Pages: 750
ISBN: 9781800208988

 

Learning Modern Core Language Features

The C++ language has gone through a major transformation in the past decade with the development and release of C++11 and then, later, with its newer versions: C++14, C++17, and C++20. These new standards have introduced new concepts, simplified and extended existing syntax and semantics, and overall transformed the way we write code. C++11 looks like a new language, and code written using the new standards is called modern C++ code.

The recipes included in this chapter are as follows:

  • Using auto whenever possible
  • Creating type aliases and alias templates
  • Understanding uniform initialization
  • Understanding the various forms of non-static member initialization
  • Controlling and querying object alignment
  • Using scoped enumerations
  • Using override and final for virtual methods
  • Using range-based for loops to iterate on a range
  • Enabling range-based for loops for custom types
  • Using explicit constructors and conversion operators to avoid implicit conversion
  • Using unnamed namespaces instead of static globals
  • Using inline namespaces for symbol versioning
  • Using structured bindings to handle multi-return values
  • Simplifying code with class template argument deduction

Let's start by learning about automatic type deduction.

 

Using auto whenever possible

Automatic type deduction is one of the most important and widely used features in modern C++. The new C++ standards have made it possible to use auto as a placeholder for types in various contexts and let the compiler deduce the actual type. In C++11, auto can be used for declaring local variables and for the return type of a function with a trailing return type. In C++14, auto can be used for the return type of a function without specifying a trailing type and for parameter declarations in lambda expressions. Future standard versions are likely to expand the use of auto to even more cases. The use of auto in these contexts has several important benefits, all of which will be discussed in the How it works... section. Developers should be aware of them, and prefer auto whenever possible. An actual term was coined for this by Andrei Alexandrescu and promoted by Herb Sutter—almost always auto (AAA).

How to do it...

Consider using auto as a placeholder for the actual type in the following situations:

  • To declare local variables with the form auto name = expression when you do not want to commit to a specific type:
    auto i = 42;          // int
    auto d = 42.5;        // double
    auto s = "text";      // char const *
    auto v = { 1, 2, 3 }; // std::initializer_list<int>
    
  • To declare local variables with the auto name = type-id { expression } form when you need to commit to a specific type:
    auto b  = new char[10]{ 0 };            // char*
    auto s1 = std::string {"text"};         // std::string
    auto v1 = std::vector<int> { 1, 2, 3 }; // std::vector<int>
    auto p  = std::make_shared<int>(42);    // std::shared_ptr<int>
    
  • To declare named lambda functions, with the form auto name = lambda-expression, unless the lambda needs to be passed or returned to a function:
    auto upper = [](char const c) {return toupper(c); };
    
  • To declare lambda parameters and return values:
    auto add = [](auto const a, auto const b) {return a + b;};
    
  • To declare a function return type when you don't want to commit to a specific type:
    template <typename F, typename T>
    auto apply(F&& f, T value)
    {
      return f(value);
    }
    

How it works...

The auto specifier is basically a placeholder for an actual type. When using auto, the compiler deduces the actual type from one of the following:

  • From the type of expression used to initialize a variable, when auto is used to declare variables.
  • From the trailing return type or the type of the return expression of a function, when auto is used as a placeholder for the return type of a function.

In some cases, it is necessary to commit to a specific type. For instance, in the first example in the previous section, the compiler deduces the type of s to be char const *. If the intention was to have an std::string, then the type must be specified explicitly. Similarly, the type of v was deduced as std::initializer_list<int>. However, the intention could be to have an std::vector<int>. In such cases, the type must be specified explicitly on the right side of the assignment.
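
The difference between letting the compiler choose the type and committing to one can be checked at compile time. The following is a minimal sketch, assuming a C++11 or later compiler; the function name deduced_types_match is hypothetical and only serves to group the checks:

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: returns true when every deduced type matches
// the type named in the trailing comment.
inline bool deduced_types_match()
{
  auto s  = "text";                       // char const *
  auto s1 = std::string{ "text" };        // std::string
  auto v  = { 1, 2, 3 };                  // std::initializer_list<int>
  auto v1 = std::vector<int>{ 1, 2, 3 };  // std::vector<int>

  // Same contents, different types.
  assert(s1 == s);
  assert(v1.size() == v.size());

  return std::is_same<decltype(s),  char const *>::value &&
         std::is_same<decltype(s1), std::string>::value &&
         std::is_same<decltype(v),  std::initializer_list<int>>::value &&
         std::is_same<decltype(v1), std::vector<int>>::value;
}
```

Calling deduced_types_match() is expected to return true, since each variable's type is fixed by its initializer.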

There are some important benefits of using the auto specifier instead of actual types; the following is a list of, perhaps, the most important ones:

  • It is not possible to leave a variable uninitialized. This is a common mistake that developers make when declaring variables specifying the actual type. However, this is not possible with auto, which requires an initialization of the variable in order to deduce the type.
  • Using auto ensures that you always use the correct type and that implicit conversion will not occur. Consider the following example where we retrieve the size of a vector to a local variable. In the first case, the type of the variable is int, though the size() method returns size_t. This means an implicit conversion from size_t to int will occur. However, using auto for the type will deduce the correct type; that is, size_t:
    auto v = std::vector<int>{ 1, 2, 3 };
    // implicit conversion, possible loss of data
    int size1 = v.size();
    // OK
    auto size2 = v.size();
    // ill-formed (warning in gcc/clang, error in VC++)
    auto size3 = int{ v.size() };
    
  • Using auto promotes good object-oriented practices, such as preferring interfaces over implementations. The fewer the number of types specified, the more generic the code is and more open to future changes, which is a fundamental principle of object-oriented programming.
  • It means less typing and less concern for actual types that we don't care about anyway. It is very often the case that even though we explicitly specify the type, we don't actually care about it. A very common case is with iterators, but there are many more. When you want to iterate over a range, you don't care about the actual type of the iterator. You are only interested in the iterator itself; so, using auto saves time used for typing possibly long names and helps you focus on actual code and not type names. In the following example, in the first for loop, we explicitly use the type of the iterator. It is a lot of text to type; the long statements can actually make the code less readable, and you also need to know the type name that you actually don't care about. The second loop with the auto specifier looks simpler and saves you from typing and caring about actual types:
    std::map<int, std::string> m;
    for (std::map<int, std::string>::const_iterator
      it = m.cbegin();
      it != m.cend(); ++it)
    { /*...*/ }
    for (auto it = m.cbegin(); it != m.cend(); ++it)
    { /*...*/ }
    
  • Declaring variables with auto provides a consistent coding style with the type always in the right-hand side. If you allocate objects dynamically, you need to write the type both on the left and right side of the assignment, for example, int* p = new int(42). With auto, the type is specified only once on the right side.

However, there are some gotchas when using auto:

  • The auto specifier is only a placeholder for the type, not for the const/volatile and reference specifiers. If you need a const/volatile and/or reference type, then you need to specify them explicitly. In the following example, foo.get() returns a reference to int; when the variable x is initialized from the return value, the type deduced by the compiler is int, not int&. Therefore, any change made to x will not propagate to foo.x_. For changes to propagate, we should use auto&:
    class foo {
      int x_;
    public:
      foo(int const x = 0) :x_{ x } {}
      int& get() { return x_; }
    };
    foo f(42);
    auto x = f.get();
    x = 100;
    std::cout << f.get() << '\n'; // prints 42
    
  • Before C++17, it is not possible to use auto for types that are not moveable; since C++17, guaranteed copy elision makes this form valid:
    auto ai = std::atomic<int>(42); // error before C++17
    
  • It is not possible to use auto for multi-word types, such as long long, long double, or struct foo. However, in the first case, the possible workarounds are to use literals or type aliases; as for the second, using struct/class in that form is only supported in C++ for C compatibility and should be avoided anyway:
    auto l1 = long long{ 42 }; // error
    using llong = long long;
    auto l2 = llong{ 42 };     // OK
    auto l3 = 42LL;            // OK
    
  • If you use the auto specifier but still need to know the type, you can do so in most IDEs by putting the cursor over a variable, for instance. If you leave the IDE, however, that is not possible anymore, and the only way to know the actual type is to deduce it yourself from the initialization expression, which could mean searching through the code for function return types.
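
The first gotcha in this list, about references, can be illustrated with a short sketch. It repeats the foo class from the example so that it compiles on its own; the function name modify_through_reference is hypothetical:

```cpp
#include <cassert>

class foo {
  int x_;
public:
  foo(int const x = 0) : x_{ x } {}
  int& get() { return x_; }
};

// Hypothetical helper: writes through a copy and then through a reference,
// and returns the value stored in the object afterward.
inline int modify_through_reference()
{
  foo f(42);

  auto copy = f.get();   // deduced as int: an independent copy
  copy = 100;
  assert(f.get() == 42); // the object is unchanged

  auto& ref = f.get();   // deduced as int&: refers to f's member
  ref = 200;

  return f.get();        // the change made through ref is visible
}
```

Here the write through copy leaves the object untouched, while the write through ref changes the stored value to 200.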

The auto specifier can be used to specify the return type of a function. In C++11, this requires a trailing return type in the function declaration. In C++14, this has been relaxed, and the type of the return value is deduced by the compiler from the return expression. If there are multiple return statements, they must all deduce the same type:

// C++11
auto func1(int const i) -> int
{ return 2*i; }
// C++14
auto func2(int const i)
{ return 2*i; }
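
The rule about multiple return statements can be sketched as follows, assuming a C++14 or later compiler. The function names clamp_to_zero and check_clamp are hypothetical, and the rejected variant is left commented out so that the example still compiles:

```cpp
#include <cassert>
#include <type_traits>

// Both return statements yield int, so the return type is deduced as int.
auto clamp_to_zero(int const i)
{
  if (i < 0) return 0;
  return i;
}

// This variant would be rejected: one return statement deduces int,
// the other deduces double.
// auto scaled(int const i)
// {
//   if (i < 0) return 0;
//   return i * 1.5;
// }

static_assert(std::is_same<decltype(clamp_to_zero(1)), int>::value,
              "return type is deduced as int");

// Hypothetical helper that exercises the function.
inline void check_clamp()
{
  assert(clamp_to_zero(-5) == 0);
  assert(clamp_to_zero(7) == 7);
}
```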

As mentioned earlier, auto does not retain const/volatile and reference qualifiers. This leads to problems with auto as a placeholder for the return type from a function. To explain this, let's consider the preceding example with foo.get(). This time, we have a wrapper function called proxy_get() that takes a reference to a foo, calls get(), and returns the value returned by get(), which is an int&. However, the compiler will deduce the return type of proxy_get() as being int, not int&. Trying to assign that value to an int& fails with an error:

class foo
{
  int x_;
public:
  foo(int const x = 0) :x_{ x } {}
  int& get() { return x_; }
};
auto proxy_get(foo& f) { return f.get(); }
auto f = foo{ 42 };
auto& x = proxy_get(f); // cannot convert from 'int' to 'int &'

To fix this, we need to actually return auto&. However, this is a problem with templates and perfect forwarding the return type without knowing whether it is a value or a reference. The solution to this problem in C++14 is decltype(auto), which will correctly deduce the type:

decltype(auto) proxy_get(foo& f) { return f.get(); }
auto f = foo{ 42 };
decltype(auto) x = proxy_get(f);

The decltype specifier is used to inspect the declared type of an entity or an expression. It's mostly useful when declaring types are cumbersome or not possible at all to declare with the standard notation. Examples of this include declaring lambda types and types that depend on template parameters.
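
The difference between auto and decltype(auto) as return type placeholders can be verified at compile time. The following is a minimal sketch, assuming C++14 or later; it repeats the foo class so that it stands alone, and the names get_copy, get_ref, and write_through_proxy are hypothetical:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class foo {
  int x_;
public:
  foo(int const x = 0) : x_{ x } {}
  int& get() { return x_; }
};

auto           get_copy(foo& f) { return f.get(); } // deduced as int
decltype(auto) get_ref(foo& f)  { return f.get(); } // deduced as int&

static_assert(
  std::is_same<decltype(get_copy(std::declval<foo&>())), int>::value,
  "auto drops the reference");
static_assert(
  std::is_same<decltype(get_ref(std::declval<foo&>())), int&>::value,
  "decltype(auto) keeps the reference");

// Hypothetical helper: assigns through the reference returned by get_ref.
inline int write_through_proxy()
{
  auto f = foo{ 42 };
  assert(get_copy(f) == 42);
  get_ref(f) = 100;  // modifies f's member
  return f.get();
}
```

Because get_ref returns int&, the assignment reaches the object, and write_through_proxy() returns 100.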

The last important case where auto can be used is with lambdas. As of C++14, both lambda return types and lambda parameter types can be auto. Such a lambda is called a generic lambda because the closure type defined by the lambda has a templated call operator. The following shows a generic lambda that takes two auto parameters and returns the result of applying operator+ to the actual types, followed by a function object that is roughly equivalent to the closure type the compiler generates for it:

auto ladd = [] (auto const a, auto const b) { return a + b; };
struct
{
  template<typename T, typename U>
  auto operator () (T const a, U const b) const { return a+b; }
} L;

This lambda can be used to add anything for which the operator+ is defined, as shown in the following snippet:

auto i = ladd(40, 2);            // 42
auto s = ladd("forty"s, "two"s); // "fortytwo"s

In this example, we used the ladd lambda to add two integers and to concatenate two std::string objects (using the C++14 user-defined literal operator ""s).

See also

  • Creating type aliases and alias templates to learn about aliases for types
  • Understanding uniform initialization to see how brace-initialization works
 

Creating type aliases and alias templates

In C++, it is possible to create synonyms that can be used instead of a type name. This is achieved by creating a typedef declaration. This is useful in several cases, such as creating shorter or more meaningful names for a type or names for function pointers. However, typedef declarations cannot be used with templates to create template type aliases. An std::vector<T>, for instance, is not a type (std::vector<int> is a type), but a sort of family of all types that can be created when the type placeholder T is replaced with an actual type.

In C++11, a type alias is a name for another already declared type, and an alias template is a name for another already declared template. Both of these types of aliases are introduced with a new using syntax.

How to do it...

  • Create type aliases with the form using identifier = type-id, as in the following examples:
    using byte     = unsigned char;
    using byte_ptr = unsigned char *;
    using array_t  = int[10];
    using fn       = void(byte, double);
    void func(byte b, double d) { /*...*/ }
    byte b{42};
    byte_ptr pb = new byte[10] {0};
    array_t a{0,1,2,3,4,5,6,7,8,9};
    fn* f = func;
    
  • Create alias templates with the form template<template-params-list> using identifier = type-id, as in the following examples:
    template <class T>
    class custom_allocator { /* ... */ };
    template <typename T>
    using vec_t = std::vector<T, custom_allocator<T>>;
    vec_t<int>           vi;
    vec_t<std::string>   vs;
    

For consistency and readability, you should do the following:

  • Not mix typedef and using declarations when creating aliases
  • Prefer the using syntax to create names of function pointer types

How it works...

A typedef declaration introduces a synonym (an alias, in other words) for a type. It does not introduce another type (like a class, struct, union, or enum declaration). Type names introduced with a typedef declaration follow the same hiding rules as identifier names. They can also be redeclared, but only to refer to the same type (therefore, you can have valid multiple typedef declarations that introduce the same type name synonym in a translation unit, as long as it is a synonym for the same type). The following are typical examples of typedef declarations:

typedef unsigned char   byte;
typedef unsigned char * byte_ptr;
typedef int             array_t[10];
typedef void(*fn)(byte, double);
template<typename T>
class foo {
  typedef T value_type;
};
typedef std::vector<int> vint_t;

A type alias declaration is equivalent to a typedef declaration. It can appear in a block scope, class scope, or namespace scope. According to C++11 paragraph 7.1.3.2:

A typedef-name can also be introduced by an alias declaration. The identifier following the using keyword becomes a typedef-name and the optional attribute-specifier-seq following the identifier appertains to that typedef-name. It has the same semantics as if it were introduced by the typedef specifier. In particular, it does not define a new type and it shall not appear in the type-id.

An alias declaration is, however, more readable and clearer about the actual type that is aliased when it comes to creating aliases for array types and function pointer types. In the examples from the How to do it... section, it is easily understandable that array_t is a name for the type array of 10 integers, while fn is a name for a function type that takes two parameters of the type byte and double and returns void. This is also consistent with the syntax for declaring std::function objects (for example, std::function<void(byte, double)> f).

It is important to take note of the following things:

  • Alias templates cannot be partially or explicitly specialized.
  • Alias templates are never deduced by template argument deduction when deducing a template parameter.
  • The type produced when specializing an alias template is not allowed to directly or indirectly make use of its own type.

The driving purpose of the new syntax is to define alias templates. These are templates that, when specialized, are equivalent to the result of substituting the template arguments of the alias template for the template parameters in the type-id.
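
The equivalence between a specialized alias template and the substituted type-id can be checked with std::is_same. The following is a small sketch, assuming C++11 or later; the names string_map, binary_op, add, and sum_via_alias are hypothetical and not part of the recipe:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical alias template: a map keyed by std::string.
template <typename T>
using string_map = std::map<std::string, T>;

// Specializing the alias template names exactly the substituted type.
static_assert(
  std::is_same<string_map<int>, std::map<std::string, int>>::value,
  "the alias does not introduce a new type");

// A type alias for a function type, used to declare a function pointer.
using binary_op = int(int, int);

inline int add(int a, int b) { return a + b; }

inline int sum_via_alias()
{
  string_map<int> counts;
  counts["one"] = 1;
  counts["two"] = 2;
  assert(counts.size() == 2);

  binary_op* op = add;  // pointer to a function of type int(int, int)
  return op(counts["one"], counts["two"]);
}
```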

See also

  • Simplifying code with class template argument deduction to learn how to use class templates without explicitly specifying template arguments
 

Understanding uniform initialization

Brace-initialization is a uniform method for initializing data in C++11. For this reason, it is also called uniform initialization. It is arguably one of the most important features from C++11 that developers should understand and use. It removes previous distinctions between initializing fundamental types, aggregate and non-aggregate types, and arrays and standard containers.

Getting ready

To continue with this recipe, you need to be familiar with direct initialization, which initializes an object from an explicit set of constructor arguments, and copy initialization, which initializes an object from another object. The following is a simple example of both types of initialization:

std::string s1("test");   // direct initialization
std::string s2 = "test";  // copy initialization

With these in mind, let's explore how to perform uniform initialization.

How to do it...

To uniformly initialize objects regardless of their type, use the brace-initialization form {}, which can be used for both direct initialization and copy initialization. When used with brace-initialization, these are called direct-list and copy-list-initialization:

T object {other};   // direct-list-initialization
T object = {other}; // copy-list-initialization

Examples of uniform initialization are as follows:

  • Standard containers:
    std::vector<int> v { 1, 2, 3 };
    std::map<int, std::string> m { {1, "one"}, { 2, "two" }};
    
  • Dynamically allocated arrays:
    int* arr2 = new int[3]{ 1, 2, 3 };
    
  • Arrays:
    int arr1[3] { 1, 2, 3 };
    
  • Built-in types:
    int i { 42 };
    double d { 1.2 };
    
  • User-defined types:
    class foo
    {
      int a_;
      double b_;
    public:
      foo():a_(0), b_(0) {}
      foo(int a, double b = 0.0):a_(a), b_(b) {}
    };
    foo f1{};
    foo f2{ 42, 1.2 };
    foo f3{ 42 };
    
  • User-defined POD types:
    struct bar { int a_; double b_;};
    bar b{ 42, 1.2 };
    

How it works...

Before C++11, objects required different types of initialization based on their type:

  • Fundamental types could be initialized using assignment:
    int a = 42;
    double b = 1.2;
    
  • Class objects could also be initialized using assignment from a single value if they had a conversion constructor (prior to C++11, a constructor with a single parameter was called a conversion constructor):
    class foo
    {
      int a_;
    public:
      foo(int a):a_(a) {}
    };
    foo f1 = 42;
    
  • Non-aggregate classes could be initialized with parentheses (the functional form) when arguments were provided and only without any parentheses when default initialization was performed (call to the default constructor). In the next example, foo is the structure defined in the How to do it... section:
    foo f1;           // default initialization
    foo f2(42, 1.2);
    foo f3(42);
    foo f4();         // function declaration
    
  • Aggregate and POD types could be initialized with brace-initialization. In the following example, bar is the structure defined in the How to do it... section:
    bar b = {42, 1.2};
    int a[] = {1, 2, 3, 4, 5};
    

A Plain Old Data (POD) type is a type that is both trivial (has special members that are compiler-provided or explicitly defaulted and occupy a contiguous memory area) and has a standard layout (a class that does not contain language features, such as virtual functions, which are incompatible with the C language, and all members have the same access control). The concept of POD types has been deprecated in C++20 in favor of trivial and standard layout types.

Apart from the different methods of initializing the data, there were also some limitations. For instance, the only way to initialize a standard container (apart from copy constructing) was to first declare an object and then insert elements into it; std::vector was a partial exception because it could be constructed from a range of values taken from an array that had previously been initialized using aggregate initialization. On the other hand, dynamically allocated aggregates could not be initialized directly.

All the examples in the How to do it... section use direct initialization, but copy initialization is also possible with brace-initialization. These two forms, direct and copy initialization, may be equivalent in most cases, but copy initialization is less permissive because it does not consider explicit constructors in its implicit conversion sequence, which must produce an object directly from the initializer, whereas direct initialization expects an implicit conversion from the initializer to an argument of the constructor. Dynamically allocated arrays can only be initialized using direct initialization.
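
The practical effect of an explicit constructor on the two forms can be sketched as follows. The meters type and the explicit_demo function are hypothetical, and the line that would not compile is commented out:

```cpp
#include <cassert>

// Hypothetical type with an explicit converting constructor.
struct meters
{
  double value;
  explicit meters(double v) : value(v) {}
};

inline double explicit_demo()
{
  meters a{ 1.5 };          // direct-list-initialization: OK
  // meters b = { 1.5 };    // copy-list-initialization: error, because
  //                        // the selected constructor is explicit
  meters c = meters{ 2.5 }; // OK: the type is named explicitly
  assert(a.value == 1.5);
  return a.value + c.value;
}
```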

Of the classes shown in the preceding examples, foo is the one class that has both a default constructor and a constructor with parameters. To use the default constructor, we need to use empty braces; that is, {}. To use the constructor with parameters, we need to provide the values for the arguments in braces {} (arguments that have default values, such as b, can be omitted, as in foo f3{ 42 }). Unlike non-aggregate types, where initialization with empty braces invokes the default constructor, for aggregate types such as bar, initialization with empty braces means initializing the members with zeros.
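
For the aggregate case, a minimal sketch using the bar structure from the How to do it... section (repeated here so that the example stands alone) could look as follows; the function name is hypothetical:

```cpp
#include <cassert>

struct bar { int a_; double b_; };

// Hypothetical helper: empty braces zero the members of an aggregate
// that has no default member initializers.
inline bool empty_braces_zero_aggregate()
{
  bar b{};
  assert(b.a_ == 0);
  return b.a_ == 0 && b.b_ == 0.0;
}
```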

Initialization of standard containers, such as the vector and the map, also shown previously, is possible because all standard containers have an additional constructor in C++11 that takes an argument of the type std::initializer_list<T>. This is basically a lightweight proxy over an array of elements of the type T const. These constructors then initialize the internal data from the values in the initializer list.

The way initialization using std::initializer_list works is as follows:

  • The compiler resolves the types of the elements in the initialization list (all the elements must have the same type).
  • The compiler creates an array with the elements in the initializer list.
  • The compiler creates an std::initializer_list<T> object to wrap the previously created array.
  • The std::initializer_list<T> object is passed as an argument to the constructor.
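
The steps above can be sketched with a user-defined class that accepts an std::initializer_list<int>. The int_bag class and the sum_of_list function are hypothetical and only illustrate how such a constructor receives the values:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical container whose constructor copies the values
// from the initializer list into internal storage.
class int_bag
{
  std::vector<int> data_;
public:
  int_bag(std::initializer_list<int> l) : data_(l.begin(), l.end()) {}

  std::size_t size() const { return data_.size(); }

  int sum() const
  {
    int total = 0;
    for (auto const e : data_) total += e;
    return total;
  }
};

inline int sum_of_list()
{
  int_bag bag{ 1, 2, 3 };  // the braces produce an std::initializer_list<int>
  assert(bag.size() == 3);
  return bag.sum();
}
```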

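A constructor that takes an std::initializer_list takes precedence over other constructors when brace-initialization with a non-empty list is used and the elements can be converted to the list's element type. If such a constructor exists for a class, it will be called when brace-initialization is performed in this way (empty braces still select the default constructor, if there is one):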

class foo
{
  int a_;
  int b_;
public:
  foo() :a_(0), b_(0) {}
  foo(int a, int b = 0) :a_(a), b_(b) {}
  foo(std::initializer_list<int> l) {}
};
foo f{ 1, 2 }; // calls constructor with initializer_list<int>

The precedence rule applies to any function, not just constructors. In the following example, two overloads of the same function exist. Calling the function with an initializer list resolves to a call to the overload with an std::initializer_list:

void func(int const a, int const b, int const c)
{
  std::cout << a << b << c << '\n';
}
void func(std::initializer_list<int> const list)
{
  for (auto const & e : list)
    std::cout << e << '\n';
}
func({ 1,2,3 }); // calls second overload

This, however, has the potential of leading to bugs. Let's take, for example, the std::vector type. Among the constructors of the vector, there is one that has a single argument, representing the initial number of elements to be allocated, and another one that has an std::initializer_list as an argument. If the intention is to create a vector with a preallocated size, using brace-initialization will not work as the constructor with the std::initializer_list will be the best overload to be called:

std::vector<int> v {5};

The preceding code does not create a vector with five elements, but a vector with one element with a value of 5. To be able to actually create a vector with five elements, initialization with the parentheses form must be used:

std::vector<int> v (5);
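
The difference between the two forms is easy to confirm by checking the size of the resulting vectors. The following sketch wraps both declarations in a hypothetical function:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the difference in element count between
// the parenthesized and the braced form.
inline std::size_t braces_versus_parentheses()
{
  std::vector<int> braced{ 5 };  // initializer_list constructor
  std::vector<int> sized(5);     // size constructor

  assert(braced.size() == 1 && braced[0] == 5);  // one element, value 5
  assert(sized.size() == 5 && sized[0] == 0);    // five value-initialized elements

  return sized.size() - braced.size();
}
```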

Another thing to note is that brace-initialization does not allow narrowing conversion. According to the C++ standard (refer to paragraph 8.5.4 of the standard), a narrowing conversion is an implicit conversion:

- From a floating-point type to an integer type.

- From long double to double or float, or from double to float, except where the source is a constant expression and the actual value after conversion is within the range of values that can be represented (even if it cannot be represented exactly).

- From an integer type or unscoped enumeration type to a floating-point type, except where the source is a constant expression and the actual value after conversion will fit into the target type and will produce the original value when converted to its original type.

- From an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type, except where the source is a constant expression and the actual value after conversion will fit into the target type and will produce the original value when converted to its original type.

The following declarations trigger compiler errors because they require a narrowing conversion:

int i{ 1.2 };           // error
double d = 47 / 13;
float f1{ d };          // error

To fix this error, an explicit conversion must be done:

int i{ static_cast<int>(1.2) };
double d = 47 / 13;
float f1{ static_cast<float>(d) };

A brace-initialization list is not an expression and does not have a type. Therefore, decltype cannot be used on a brace-init-list, and template type deduction cannot deduce the type that matches a brace-init-list.

Let's consider one more example:

float f2{47/13};        // OK, f2=3

The preceding declaration is, however, correct. The expression 47/13 is a constant expression that is first evaluated to the integer value 3; because this value can be represented exactly as a float, the conversion from int to float is not a narrowing conversion, and 3 is used to initialize the variable f2 of the type float.

There's more...

The following example shows several examples of direct-list-initialization and copy-list-initialization. In C++11, the deduced type of all these expressions is std::initializer_list<int>:

auto a = {42};   // std::initializer_list<int>
auto b {42};     // std::initializer_list<int>
auto c = {4, 2}; // std::initializer_list<int>
auto d {4, 2};   // std::initializer_list<int>

C++17 has changed the rules for list initialization, differentiating between the direct- and copy-list-initialization. The new rules for type deduction are as follows:

  • For copy-list-initialization, auto deduction will deduce an std::initializer_list<T> if all the elements in the list have the same type, or be ill-formed.
  • For direct-list-initialization, auto deduction will deduce a T if the list has a single element, or be ill-formed if there is more than one element.

Based on these new rules, the previous examples would change as follows (the deduced type is mentioned in comments):

auto a = {42};   // std::initializer_list<int>
auto b {42};     // int
auto c = {4, 2}; // std::initializer_list<int>
auto d {4, 2};   // error, too many

In this case, a and c are deduced as std::initializer_list<int>, b is deduced as an int, and d, which uses direct initialization and has more than one value in the brace-init-list, triggers a compiler error.
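
These deductions can be checked with std::is_same. The sketch below assumes a compiler that implements the C++17 rules (many compilers also apply them in earlier language modes); the function name is hypothetical, and the ill-formed declaration of d is omitted:

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: returns true when the C++17 deduction rules apply.
inline bool cpp17_auto_rules_hold()
{
  auto a = { 42 };    // std::initializer_list<int>
  auto b{ 42 };       // int
  auto c = { 4, 2 };  // std::initializer_list<int>

  assert(a.size() == 1 && c.size() == 2);

  return std::is_same<decltype(a), std::initializer_list<int>>::value &&
         std::is_same<decltype(b), int>::value &&
         std::is_same<decltype(c), std::initializer_list<int>>::value;
}
```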

See also

  • Using auto whenever possible to understand how automatic type deduction works in C++
  • Understanding the various forms of non-static member initialization to learn how to best perform initialization of class members
 

Understanding the various forms of non-static member initialization

Constructors are places where non-static class member initialization is done. Many developers prefer assignments in the constructor body. Aside from the several exceptional cases when that is actually necessary, initialization of non-static members should be done in the constructor's initializer list or, as of C++11, using default member initialization when they are declared in the class. Prior to C++11, constants and non-constant non-static data members of a class had to be initialized in the constructor. Initialization on declaration in a class was only possible for static constants. As we will see here, this limitation was removed in C++11, which allows the initialization of non-statics in the class declaration. This initialization is called default member initialization and is explained in the following sections.

This recipe will explore the ways non-static member initialization should be done. Using the appropriate initialization method for each member leads not only to more efficient code, but also to better organized and more readable code.

How to do it...

To initialize non-static members of a class, you should:

  • Use default member initialization for constants, both static and non-static (see [1] and [2] in the following code).
  • Use default member initialization to provide default values for members of classes with multiple constructors that would use a common initializer for those members (see [3] and [4] in the following code).
  • Use the constructor initializer list to initialize members that don't have default values, but depend on constructor parameters (see [5] and [6] in the following code).
  • Use assignment in constructors when the other options are not possible (examples include initializing data members with the pointer this, checking constructor parameter values, and throwing exceptions prior to initializing members with those values or self-references of two non-static data members).

The following example shows these forms of initialization:

struct Control
{
  const int DefaultHeight = 14;                                  // [1]
  const int DefaultWidth  = 80;                                  // [2]
  TextVerticalAligment   valign = TextVerticalAligment::Middle;  // [3]
  TextHorizontalAligment halign = TextHorizontalAligment::Left;  // [4]
  std::string text;
  Control(std::string const & t) : text(t)      // [5]
  {}
  Control(std::string const & t,
    TextVerticalAligment const va,
    TextHorizontalAligment const ha):
    valign(va), halign(ha), text(t)             // [6]
  {}
};

How it works...

Non-static data members are supposed to be initialized in the constructor's initializer list, as shown in the following example:

struct Point
{
  double X, Y;
  Point(double const x = 0.0, double const y = 0.0) : X(x), Y(y)  {}
};

Many developers, however, do not use the initializer list, but prefer assignments in the constructor's body, or even mix assignments and the initializer list. That could be for several reasons: for larger classes with many members, the constructor assignments may look easier to read than long initializer lists, perhaps split over many lines, or it could be because they are familiar with other programming languages that don't have an initializer list. Unfortunately, it could also be that they simply don't know about it.

It is important to note that the order in which non-static data members are initialized is the order in which they were declared in the class definition, and not the order of their initialization in a constructor initializer list. On the other hand, the order in which non-static data members are destroyed is the reverse of the order of construction.

Using assignments in the constructor is not efficient, as this can create temporary objects that are later discarded. If not initialized in the initializer list, non-static members are initialized via their default constructor and then, when assigned a value in the constructor's body, the assignment operator is invoked. This can lead to inefficient work if the default constructor allocates a resource (such as memory or a file) and that has to be deallocated and reallocated in the assignment operator. This is exemplified in the following snippet:

struct foo
{
  foo()
  { std::cout << "default constructor\n"; }
  foo(std::string const & text)
  { std::cout << "constructor '" << text << "'\n"; }
  foo(foo const & other)
  { std::cout << "copy constructor\n"; }
  foo(foo&& other)
  { std::cout << "move constructor\n"; }
  foo& operator=(foo const & other)
  { std::cout << "assignment\n"; return *this; }
  foo& operator=(foo&& other)
  { std::cout << "move assignment\n"; return *this;}
  ~foo()
  { std::cout << "destructor\n"; }
};
struct bar
{
  foo f;
  bar(foo const & value)
  {
    f = value;
  }
};
foo f;
bar b(f);

The preceding code produces the following output, showing how the data member f is first default initialized and then assigned a new value:

default constructor
default constructor
assignment
destructor
destructor

Changing the initialization from the assignment in the constructor body to the initializer list replaces the calls to the default constructor, plus the assignment operator, with a call to the copy constructor:

bar(foo const & value) : f(value) { }

With this change, the program produces the following output:

default constructor
copy constructor
destructor
destructor

For those reasons, at least for types other than the built-in types (such as bool, char, int, float, double, or pointers), you should prefer the constructor initializer list. However, to be consistent with your initialization style, you should always prefer the constructor initializer list when possible. There are several situations when using the initializer list is not possible; these include the following cases (but the list could be expanded for other cases):

  • If a member has to be initialized with a pointer or reference to the object that contains it, using the this pointer in the initialization list may trigger a warning with some compilers that it is used before the object is constructed.
  • If you have two data members that must contain references to each other.
  • If you want to test an input parameter and throw an exception before initializing a non-static data member with the value of the parameter.

Starting with C++11, non-static data members can be initialized when declared in the class. This is called default member initialization because it is supposed to represent initialization with default values. Default member initialization is intended for constants and for members that are not initialized based on constructor parameters (in other words, members whose value does not depend on the way the object is constructed):

enum class TextFlow { LeftToRight, RightToLeft };
struct Control
{
  const int DefaultHeight = 20;
  const int DefaultWidth = 100;
  TextFlow textFlow = TextFlow::LeftToRight;
  std::string text;
  Control(std::string t) : text(t)
  {}
};

In the preceding example, DefaultHeight and DefaultWidth are both constants; therefore, the values do not depend on the way the object is constructed, so they are initialized when declared. The textFlow object is a non-constant non-static data member whose value also does not depend on the way the object is initialized (it could be changed via another member function); therefore, it is also initialized using default member initialization when it is declared. text, on the other hand, is also a non-constant non-static data member, but its initial value depends on the way the object is constructed.

Therefore, it is initialized in the constructor's initializer list using a value passed as an argument to the constructor.

If a data member is initialized both with the default member initialization and constructor initializer list, the latter takes precedence and the default value is discarded. To exemplify this, let's again consider the foo class mentioned earlier and the following bar class, which uses it:

struct bar
{
  foo f{"default value"};
  bar() : f{"constructor initializer"}
  {
  }
};
bar b;

The output differs, in this case, as follows:

constructor 'constructor initializer'
destructor

The reason for the different behavior is that the default member initializer is ignored when the constructor's initializer list initializes the same member, so the object is not initialized twice.

See also

  • Understanding uniform initialization to see how brace-initialization works
 

Controlling and querying object alignment

C++11 provides standardized methods for specifying and querying the alignment requirements of a type (something that was previously possible only through compiler-specific methods). Controlling the alignment is important in order to boost performance on different processors and enable the use of some instructions that only work with data on particular alignments.

For example, Intel Streaming SIMD Extensions (SSE) and Intel SSE2, which are sets of processor instructions that can greatly increase performance when the same operations are to be applied on multiple data objects, require 16-byte alignment for their aligned load and store instructions. On the other hand, for Intel Advanced Vector Extensions (or Intel AVX), which widens the SIMD registers to 256 bits, it is highly recommended to use 32-byte alignment. This recipe explores the alignas specifier for controlling the alignment requirements and the alignof operator, which retrieves the alignment requirements of a type.

Getting ready

You should be familiar with what data alignment is and the way the compiler performs default data alignment. However, basic information about the latter is provided in the How it works... section.

How to do it...

  • To control the alignment of a type (at either the class level or the data member level) or an object, use the alignas specifier:
    struct alignas(4) foo
    {
      char a;
      char b;
    };
    struct bar
    {
      alignas(2) char a;
      alignas(8) int  b;
    };
    alignas(8)   int a;
    alignas(256) long b[4];
    
  • To query the alignment of a type, use the alignof operator:
    auto align = alignof(foo);
    

How it works...

Processors do not access memory one byte at a time, but in larger chunks whose sizes are powers of two (2, 4, 8, 16, 32, and so on). Owing to this, it is important that compilers align data in memory so that it can be easily accessed by the processor. If data is misaligned, accessing it takes extra work: multiple chunks have to be read, the unnecessary bytes shifted out and discarded, and the rest combined.

C++ compilers align variables based on the size of their data type. The standard only specifies the sizes of char, signed char, unsigned char, char8_t, and std::byte, which must be 1. It also requires that short is at least 16 bits wide, long at least 32 bits, and long long at least 64 bits, and that 1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long). Therefore, the size of most types is compiler-specific and may depend on the platform. Typically, these are 1 byte for bool and char, 2 bytes for short, 4 bytes for int, long, and float (although long is 8 bytes on most 64-bit Unix-like platforms), 8 bytes for double and long long, and so on. When it comes to structures or unions, the alignment must match that of the most strictly aligned member (usually the largest one) in order to avoid performance issues. To exemplify this, let's consider the following data structures:

struct foo1    // size = 1, alignment = 1
{              // foo1:    +-+
  char a;      // members: |a|
};
struct foo2    // size = 2, alignment = 1
{              // foo2:    +-+-+
  char a;      // members  |a|b|
  char b;
};
struct foo3    // size = 8, alignment = 4
{              // foo3:    +----+----+
  char a;      // members: |a...|bbbb|
  int  b;      // . represents a byte of padding
};

foo1 and foo2 are different sizes, but the alignment is the same—that is, 1—because all data members are of the type char, which has a size of 1 byte. In the structure foo3, the second member is an integer, whose size is 4. As a result, the alignment of members of this structure is done at addresses that are multiples of 4. To achieve this, the compiler introduces padding bytes.

The structure foo3 is actually transformed into the following:

struct foo3_
{
  char a;        // 1 byte
  char _pad0[3]; // 3 bytes padding to put b on a 4-byte boundary
  int  b;        // 4 bytes
};

Similarly, the following structure has a size of 32 bytes and an alignment of 8; that is because the largest member is a double whose size is 8. This structure, however, requires padding in several places to make sure that every member starts at an address that is a multiple of its own alignment and that the size of the structure is a multiple of 8:

struct foo4     // size = 32, alignment = 8
{               // foo4:    +--------+--------+--------+--------+
  int a;        // members: |aaaab...|cccc....|dddddddd|e.......|
  char b;       // . represents a byte of padding
  float c;
  double d;
  bool e;
};

The equivalent structure that's created by the compiler is as follows:

struct foo4_
{
  int a;         // 4 bytes
  char b;        // 1 byte
  char _pad0[3]; // 3 bytes padding to put c on a 4-byte boundary
  float c;       // 4 bytes
  char _pad1[4]; // 4 bytes padding to put d on a 8-byte boundary
  double d;      // 8 bytes
  bool e;        // 1 byte
  char _pad2[7]; // 7 bytes padding to make sizeof struct multiple of 8
};

In C++11, specifying the alignment of an object or type is done using the alignas specifier. This can take either an expression (an integral constant expression that evaluates to 0 or a valid value for an alignment), a type-id, or a parameter pack. The alignas specifier can be applied to the declaration of a variable or a class data member that does not represent a bit field, or to the declaration of a class, union, or enumeration.

The type or object on which an alignas specification is applied will have the alignment requirement equal to the largest, greater than zero, expression of all alignas specifications used in the declaration.

There are several restrictions when using the alignas specifier:

  • The only valid alignments are the powers of two (1, 2, 4, 8, 16, 32, and so on). Any other values are illegal, and the program is considered ill-formed; that doesn't necessarily have to produce an error, as the compiler may choose to ignore the specification.
  • An alignment of 0 is always ignored.
  • If the largest alignas on a declaration is smaller than the natural alignment without any alignas specifier, then the program is also considered ill-formed.

In the following example, the alignas specifier has been applied to a class declaration. The natural alignment without the alignas specifier would have been 1, but with alignas(4), it becomes 4:

struct alignas(4) foo
{
  char a;
  char b;
};

In other words, the compiler transforms the preceding class into the following:

struct foo
{
  char a;
  char b;
  char _pad0[2];
};

The alignas specifier can be applied both to the class declaration and the member data declarations. In this case, the strictest (that is, largest) value wins. In the following example, member a has a natural size of 1 and requires an alignment of 2; member b has a natural size of 4 and requires an alignment of 8, so the strictest alignment would be 8. The alignment requested for the entire class is 4, which is weaker (that is, smaller) than the strictest required alignment. According to the restrictions listed earlier, this makes the program ill-formed; in practice, some compilers ignore the class-level specifier and issue a warning, while others reject the declaration:

struct alignas(4) foo
{
  alignas(2) char a;
  alignas(8) int  b;
};

The result is a structure that looks like this:

struct foo
{
  char a;
  char _pad0[7];
  int b;
  char _pad1[4];
};

The alignas specifier can also be applied to variables. In the following example, variable a, which is an integer, is required to be placed in memory at a multiple of 8. The next variable, the array of 4 longs, is required to be placed in memory at a multiple of 256. As a result, the compiler may have to leave unused bytes between the two variables; how many depends on where a is located. In the sample output shown in the comments, a occupies 4 bytes starting at an address ending in F908 and b starts at an address ending in FA00, which leaves a gap of 244 bytes:

alignas(8)   int a;
alignas(256) long b[4];
printf("%p\n", static_cast<void*>(&a)); // e.g. 0000006C0D9EF908
printf("%p\n", static_cast<void*>(&b)); // e.g. 0000006C0D9EFA00

Looking at the addresses, we can see that the address of a is indeed a multiple of 8, and that the address of b is a multiple of 256 (hexadecimal 100).

To query the alignment of a type, we use the alignof operator. Unlike sizeof, this operator can only be applied to type-ids, and not to variables or class data members. The types it can be applied to are complete types, array types, or reference types. For arrays, the value that's returned is the alignment of the element type; for references, the value that's returned is the alignment of the referenced type. Here are several examples:

Expression       Evaluation
alignof(char)    1, because the natural alignment of char is 1.
alignof(int)     4, because the natural alignment of int is 4.
alignof(int*)    4 on 32-bit, 8 on 64-bit, the alignment for pointers.
alignof(int[4])  4, because the natural alignment of the element type is 4.
alignof(foo&)    8, because the specified alignment for the class foo, which is the referred type (as shown in the previous example), was 8.

The alignas specifier is useful if you wish to force an alignment for a data type (taking into consideration the restriction mentioned previously) so that variables of that type can be accessed and copied efficiently. This means optimizing CPU reads and writes and avoiding unnecessary invalidation of cache lines. This can be highly important in some categories of applications where performance is key, such as games or trading applications. On the other hand, the alignof operator retrieves the minimum alignment requirement of a specified type.

See also

  • Creating type aliases and alias templates to learn about aliases for types
 

Using scoped enumerations

Enumeration is a basic type in C++ that defines a collection of values, always of an integral underlying type. Their named values, which are constant, are called enumerators. Enumerations declared with the keyword enum are called unscoped enumerations, while enumerations declared with enum class or enum struct are called scoped enumerations. The latter ones were introduced in C++11 and are intended to solve several problems with unscoped enumerations, which are explained in this recipe.

How to do it...

When working with enumerations, you should:

  • Prefer to use scoped enumerations instead of unscoped ones
  • Declare scoped enumerations using enum class or enum struct:
    enum class Status { Unknown, Created, Connected };
    Status s = Status::Created;
    

    The enum class and enum struct declarations are equivalent, and throughout this recipe and the rest of this book, we will use enum class.

Because scoped enumerations are restricted namespaces, the C++20 standard allows us to use them with using declarations. You can do the following:

  • Introduce a scoped enumeration identifier in the local scope with a using declaration, as follows:
    int main()
    {
      using Status::Unknown;
      Status s = Unknown;
    }
    
  • Introduce all the identifiers of a scoped enumeration in the local scope with a using enum declaration, as follows:
    struct foo
    {
      enum class Status { Unknown, Created, Connected };
      using enum Status;
    };
    foo::Status s = foo::Created; // instead of
                                  // foo::Status::Created
    
  • Use a using enum declaration to introduce the enum identifiers in a switch statement to simplify your code:
    void process(Status const s)
    {
      switch (s)
      {
        using enum Status;
        case Unknown:   /*...*/ break;
        case Created:   /*...*/ break;
        case Connected: /*...*/ break;
      }
    }
    

How it works...

Unscoped enumerations have several issues that create problems for developers:

  • They export their enumerators to the surrounding scope (for which reason, they are called unscoped enumerations), and that has the following two drawbacks:
    1. It can lead to name clashes if two enumerations in the same namespace have enumerators with the same name, and
    2. It's not possible to use an enumerator using its fully qualified name:
      enum Status {Unknown, Created, Connected};
      enum Codes {OK, Failure, Unknown};   // error
      auto status = Status::Created;       // error
      
  • Prior to C++11, they could not specify the underlying type, which is required to be an integral type. This type must not be larger than int, unless an enumerator value cannot fit in a signed or unsigned int. Owing to this, forward declaration of enumerations was not possible: the size of the enumeration was not known, because the underlying type was not known until the values of the enumerators were defined and the compiler could pick the appropriate integer type. This has been fixed in C++11.
  • Values of enumerators implicitly convert to int. This means you can intentionally or accidentally mix enumerations that have a certain meaning and integers (which may not even be related to the meaning of the enumeration) and the compiler will not be able to warn you:
    enum Codes { OK, Failure };
    void include_offset(int pixels) {/*...*/}
    include_offset(Failure);
    

The scoped enumerations are basically strongly typed enumerations that behave differently than the unscoped enumerations:

  • They do not export their enumerators to the surrounding scope. The two enumerations shown earlier would change to the following, which no longer generates a name collision and makes it possible to fully qualify the names of the enumerators:
    enum class Status { Unknown, Created, Connected };
    enum class Codes { OK, Failure, Unknown }; // OK
    Codes code = Codes::Unknown;               // OK
    
  • You can specify the underlying type. The same rules for underlying types of unscoped enumerations apply to scoped enumerations too, except that the user can explicitly specify the underlying type. This also solves the problem with forward declarations since the underlying type can be known before the definition is available:
    enum class Codes : unsigned int;
    void print_code(Codes const code) {}
    enum class Codes : unsigned int
    {
      OK = 0,
      Failure = 1,
      Unknown = 0xFFFF0000U
    };
    
  • Values of scoped enumerations no longer convert implicitly to int. Assigning the value of an enum class to an integer variable would trigger a compiler error unless an explicit cast is specified:
    Codes c1 = Codes::OK;                       // OK
    int c2 = Codes::Failure;                    // error
    int c3 = static_cast<int>(Codes::Failure);  // OK
    

However, scoped enumerations have a drawback: they are restricted namespaces. They do not export the identifiers to the outer scope, which can be inconvenient at times. For instance, when writing a switch, you need to repeat the enumeration name for each case label, as in the following example:

std::string_view to_string(Status const s)
{
  switch (s)
  {
    case Status::Unknown:   return "Unknown";
    case Status::Created:   return "Created";
    case Status::Connected: return "Connected";
  }
}

In C++20, this can be simplified with the help of a using enum declaration that names the scoped enumeration. The preceding code can be rewritten as follows:

std::string_view to_string(Status const s)
{
  switch (s)
  {
    using enum Status;
    case Unknown:   return "Unknown";
    case Created:   return "Created";
    case Connected: return "Connected";
  }
}

The effect of this using enum declaration is that all the enumerator identifiers are introduced in the local scope, making it possible to refer to them in unqualified form. It is also possible to bring only a particular enumerator into the local scope with a using declaration that names it with its qualified name, such as using Status::Connected.

See also

  • Creating compile-time constant expressions in Chapter 9, Robustness and Performance to learn how to work with compile-time constants
 

Using override and final for virtual methods

Unlike other similar programming languages, C++ does not have a specific syntax for declaring interfaces (which are basically classes with pure virtual methods only) and also has some deficiencies related to how virtual methods are declared. In C++, the virtual methods are introduced with the virtual keyword. However, the keyword virtual is optional for declaring overrides in derived classes, which can lead to confusion when dealing with large classes or hierarchies. You may need to navigate throughout the hierarchy up to the base to figure out whether a function is virtual or not. On the other hand, sometimes, it is useful to make sure that a virtual function or even a derived class can no longer be overridden or derived further. In this recipe, we will see how to use the C++11 special identifiers override and final to declare virtual functions or classes.

Getting ready

You should be familiar with inheritance and polymorphism in C++ and concepts such as abstract classes, pure specifiers, virtual, and overridden methods.

How to do it...

To ensure the correct declaration of virtual methods in both base and derived classes, and to increase readability, do the following:

  • Prefer to use the virtual keyword when declaring virtual functions in derived classes that are supposed to override virtual functions from a base class.
  • Always use the override special identifier after the declarator part of a virtual function's declaration or definition:
    class Base
    {
      virtual void foo() = 0;
      virtual void bar() {}
      virtual void foobar() = 0;
    };
    void Base::foobar() {}
    class Derived1 : public Base
    {
      virtual void foo() override = 0;
      virtual void bar() override {}
      virtual void foobar() override {}
    };
    class Derived2 : public Derived1
    {
      virtual void foo() override {}
    };
    

The declarator is the part of the type of a function that excludes the return type.

To ensure that functions cannot be overridden further or that classes cannot be derived any more, use the final special identifier:

  • After the declarator part of a virtual function declaration or definition to prevent further overrides in a derived class:
    class Derived2 : public Derived1
    {
      virtual void foo() final {}
    };
    
  • After the name of a class in the declaration of the class to prevent further derivations of the class:
    class Derived4 final : public Derived1
    {
      virtual void foo() override {}
    };
    

How it works...

The way override works is very simple; in a virtual function declaration or definition, it ensures that the function is actually overriding a base class function; otherwise, the compiler will trigger an error.

It should be noted that both override and final are special identifiers that have a meaning only in a member function declaration or definition (and, for final, in a class declaration). They are not reserved keywords and can still be used elsewhere in a program as user-defined identifiers.

Using the override special identifier helps the compiler detect situations where a virtual method does not override another one, as shown in the following example:

class Base
{
public:
  virtual void foo() {}
  virtual void bar() {}
};
class Derived1 : public Base
{
public:
  void foo() override {}                      // OK, overrides Base::foo()
  // the virtual keyword is optional here; it is used for readability
  virtual void bar(char const c) override {}  // error, no Base::bar(char const)
};

Without the override specifier, the virtual bar(char const) method of the Derived1 class would compile without error, but it would not override anything; it would be a new virtual function that hides bar() from Base.

The other special identifier, final, is used in a member function declaration or definition to indicate that the function is virtual and cannot be overridden in a derived class. If a derived class attempts to override the virtual function, the compiler triggers an error:

class Derived2 : public Derived1
{
  virtual void foo() final {}
};
class Derived3 : public Derived2
{
  virtual void foo() override {} // error
};

The final specifier can also be used in a class declaration to indicate that it cannot be derived:

class Derived4 final : public Derived1
{
  virtual void foo() override {}
};
class Derived5 : public Derived4 // error
{
};

Since both override and final have this special meaning when used in the defined context and are not, in fact, reserved keywords, you can still use them anywhere else in the C++ code. This ensured that existing code written before C++11 did not break because of the use of these names for identifiers:

class foo
{
  int final = 0;
  void override() {}
};

Although the recommendation given earlier suggests using both virtual and override in the declaration of an overriding virtual method, the virtual keyword is optional and can be omitted to shorten the declaration. The presence of the override specifier should be enough to indicate to the reader that the method is virtual. This is rather a matter of personal preference and does not affect the semantics.

See also

  • Static polymorphism with the curiously recurring template pattern in Chapter 10, Implementing Patterns and Idioms to learn how the CRTP pattern helps with implementing polymorphism at compile time
 

Using range-based for loops to iterate on a range

Many programming languages support a variant of a for loop called for each; that is, repeating a group of statements over the elements of a collection. C++ did not have core language support for this until C++11. The closest feature was the general-purpose algorithm from the standard library called std::for_each, which applies a function to all the elements in a range. C++11 brought language support for for each in the form of range-based for loops. The C++17 standard provides several improvements to the original language feature.

Getting ready

In C++11, a range-based for loop has the following general syntax:

for ( range_declaration : range_expression ) loop_statement

To exemplify the various ways of using range-based for loops, we will use the following functions, which return sequences of elements:

std::vector<int> getRates()
{
  return std::vector<int> {1, 1, 2, 3, 5, 8, 13};
}
std::multimap<int, bool> getRates2()
{
  return std::multimap<int, bool> {
    { 1, true },
    { 1, true },
    { 2, false },
    { 3, true },
    { 5, true },
    { 8, false },
    { 13, true }
  };
}

In the next section, we'll look at the various ways we can use range-based for loops.

How to do it...

Range-based for loops can be used in various ways:

  • By committing to a specific type for the elements of the sequence:
    auto rates = getRates();
    for (int rate : rates)
      std::cout << rate << '\n';
    for (int& rate : rates)
      rate *= 2;
    
  • By not specifying a type and letting the compiler deduce it:
    for (auto&& rate : getRates())
      std::cout << rate << '\n';
    for (auto & rate : rates)
      rate *= 2;
    for (auto const & rate : rates)
      std::cout << rate << '\n';
    
  • By using structured bindings and decomposition declaration in C++17:
    for (auto&& [rate, flag] : getRates2())
      std::cout << rate << '\n';
    

How it works...

The expression for the range-based for loops shown earlier in the How to do it... section is basically syntactic sugar as the compiler transforms it into something else. Before C++17, the code generated by the compiler used to be the following:

{
  auto && __range = range_expression;
  for (auto __begin = begin_expr, __end = end_expr;
  __begin != __end; ++__begin) {
    range_declaration = *__begin;
    loop_statement
  }
}

What begin_expr and end_expr are in this code depends on the type of the range:

  • For C-like arrays: __range and __range + __bound (where __bound is the number of elements in the array).
  • For a class type with begin and end members (regardless of their type and accessibility): __range.begin() and __range.end().
  • For others, it is begin(__range) and end(__range), which are determined via argument-dependent lookup.

It is important to note that if a class contains any members (function, data member, or enumerators) called begin or end, regardless of their type and accessibility, they will be picked for begin_expr and end_expr. Therefore, if those members are not accessible functions that return iterators, such a class type cannot be used in range-based for loops.

In C++17, the code generated by the compiler is slightly different:

{
  auto && __range = range_expression;
  auto __begin = begin_expr;
  auto __end = end_expr;
  for (; __begin != __end; ++__begin) {
    range_declaration = *__begin;
    loop_statement
  }
}

The new standard has removed the constraint that the begin expression and the end expression must have the same type. The end expression does not need to be an actual iterator, but it has to be able to be compared for inequality with an iterator. A benefit of this is that the range can be delimited by a predicate. As in the earlier form, the end expression is evaluated only once, and not on every iteration, which can help performance when computing it is expensive.

See also

  • Enabling range-based for loops for custom types to see how to make it possible for user-defined types to be used with range-based for loops
  • Iterating over collections with the ranges library in Chapter 12, C++20 Core Features, to learn about the fundamentals of the C++20 ranges library
  • Creating your own range view in Chapter 12, C++20 Core Features, to see how to extend the C++20 range library's capabilities with user-defined range adaptors
 

Enabling range-based for loops for custom types

As we saw in the preceding recipe, range-based for loops, known as for each in other programming languages, allow you to iterate over the elements of a range, providing a simplified syntax over the standard for loops and making the code more readable in many situations. However, range-based for loops do not work out of the box with any type representing a range, but require the presence of a begin() and end() function (for non-array types) either as a member or free function. In this recipe, we will learn how to enable a custom type to be used in range-based for loops.

Getting ready

It is recommended that you read the Using range-based for loops to iterate on a range recipe before continuing with this one if you need to understand how range-based for loops work, as well as what code the compiler generates for such a loop.

To show how we can enable range-based for loops for custom types representing sequences, we will use the following implementation of a simple array:

template <typename T, size_t const Size>
class dummy_array
{
  T data[Size] = {};
public:
  T const & GetAt(size_t const index) const
  {
    if (index < Size) return data[index];
    throw std::out_of_range("index out of range");
  }
  void SetAt(size_t const index, T const & value)
  {
    if (index < Size) data[index] = value;
    else throw std::out_of_range("index out of range");
  }
  size_t GetSize() const { return Size; }
};

The purpose of this recipe is to enable writing code like the following:

dummy_array<int, 3> arr;
arr.SetAt(0, 1);
arr.SetAt(1, 2);
arr.SetAt(2, 3);
for(auto&& e : arr)
{
  std::cout << e << '\n';
}

The steps necessary to make all this possible are described in detail in the following section.

How to do it...

To enable a custom type to be used in range-based for loops, you need to do the following:

  • Create mutable and constant iterators for the type, which must implement the following operators:
    • operator++ (both the prefix and the postfix version) for incrementing the iterator
    • operator* for dereferencing the iterator and accessing the actual element being pointed to by the iterator
    • operator!= for comparing it with another iterator for inequality
  • Provide free begin() and end() functions for the type.

Given the earlier example of a simple range, we need to provide the following:

  1. The following minimal implementation of an iterator class:
    template <typename T, typename C, size_t const Size>
    class dummy_array_iterator_type
    {
    public:
      dummy_array_iterator_type(C& collection,
                                size_t const index) :
      index(index), collection(collection)
      { }
      bool operator!= (dummy_array_iterator_type const & other) const
      {
        return index != other.index;
      }
      T const & operator* () const
      {
        return collection.GetAt(index);
      }
      dummy_array_iterator_type& operator++()
      {
        ++index;
        return *this;
      }
      dummy_array_iterator_type operator++(int)
      {
        auto temp = *this;
        ++*this; // advance this iterator; temp keeps the previous position
        return temp;
      }
    private:
      size_t   index;
      C&       collection;
    };
    
  2. Alias templates for mutable and constant iterators:
    template <typename T, size_t const Size>
    using dummy_array_iterator =
      dummy_array_iterator_type<
        T, dummy_array<T, Size>, Size>;
    template <typename T, size_t const Size>
    using dummy_array_const_iterator =
      dummy_array_iterator_type<
        T, dummy_array<T, Size> const, Size>;
    
  3. Free begin() and end() functions that return the corresponding begin and end iterators, with overloads for both alias templates:
    template <typename T, size_t const Size>
    inline dummy_array_iterator<T, Size> begin(
      dummy_array<T, Size>& collection)
    {
      return dummy_array_iterator<T, Size>(collection, 0);
    }
    template <typename T, size_t const Size>
    inline dummy_array_iterator<T, Size> end(
      dummy_array<T, Size>& collection)
    {
      return dummy_array_iterator<T, Size>(
        collection, collection.GetSize());
    }
    template <typename T, size_t const Size>
    inline dummy_array_const_iterator<T, Size> begin(
      dummy_array<T, Size> const & collection)
    {
      return dummy_array_const_iterator<T, Size>(
        collection, 0);
    }
    template <typename T, size_t const Size>
    inline dummy_array_const_iterator<T, Size> end(
      dummy_array<T, Size> const & collection)
    {
      return dummy_array_const_iterator<T, Size>(
        collection, collection.GetSize());
    }
    

How it works...

Having this implementation available, the range-based for loop shown earlier compiles and executes as expected. When performing argument-dependent lookup, the compiler will identify the two begin() and end() functions that we wrote (which take a reference to a dummy_array) and therefore the code it generates becomes valid.
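To see argument-dependent lookup at work in isolation, consider the following minimal sketch. The demo namespace, the countdown type, and its iterator are hypothetical and unrelated to dummy_array; the free begin() and end() functions are found only because they live in the same namespace as the range type:

```cpp
namespace demo
{
  // Hypothetical range that counts down from a starting value to 1.
  struct countdown { int from; };

  // Minimal iterator: only the operators the range-based for loop needs.
  struct countdown_iterator
  {
    int value;
    int operator*() const { return value; }
    countdown_iterator& operator++() { --value; return *this; }
    bool operator!=(countdown_iterator const& other) const
    { return value != other.value; }
  };

  // Free functions in the same namespace as countdown, so ADL can find them.
  inline countdown_iterator begin(countdown const& c) { return { c.from }; }
  inline countdown_iterator end(countdown const&) { return { 0 }; }
}

// Note: assumes from >= 0; a negative start would never reach the end value.
inline int sum_countdown(int from)
{
  int total = 0;
  for (int v : demo::countdown{ from })   // begin/end are found through ADL
    total += v;
  return total;
}
```

Calling sum_countdown(3) visits the values 3, 2, and 1.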

In the preceding example, we have defined one iterator class template and two alias templates, called dummy_array_iterator and dummy_array_const_iterator. The begin() and end() functions both have two overloads for these two types of iterators.

This is necessary so that the container we have considered can be used in range-based for loops with both constant and non-constant instances:

template <typename T, const size_t Size>
void print_dummy_array(dummy_array<T, Size> const & arr)
{
  for (auto && e : arr)
  {
    std::cout << e << '\n';
  }
}

A possible alternative to enable range-based for loops for the simple range class we considered for this recipe is to provide the member begin() and end() functions. In general, that will make sense only if you own and can modify the source code. On the other hand, the solution shown in this recipe works in all cases and should be preferred to other alternatives.
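For comparison, a member-based variant could look roughly like the following sketch. The simple_array class and the sum_elements function are hypothetical, and using raw pointers as iterators is a simplification chosen here only to keep the example short:

```cpp
#include <cstddef>

// Hypothetical array wrapper that exposes begin() and end() as members.
template <typename T, std::size_t Size>
class simple_array
{
  T data[Size] = {};
public:
  // Mutable access: a pointer into the array acts as the iterator.
  T* begin() { return data; }
  T* end() { return data + Size; }
  // Constant access, used when the array is accessed through a const reference.
  T const* begin() const { return data; }
  T const* end() const { return data + Size; }
};

// Adds up all elements; the const overloads of begin() and end() are selected.
template <typename T, std::size_t Size>
T sum_elements(simple_array<T, Size> const& arr)
{
  T total{};
  for (auto const& e : arr)
    total += e;
  return total;
}
```

Because both constant and non-constant overloads are provided, the same type works in loops that modify the elements and in loops that only read them.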

See also

  • Creating type aliases and alias templates to learn about aliases for types
  • Iterating over collections with the ranges library in Chapter 12, C++20 Core Features, to learn about the fundamentals of the C++20 ranges library
 

Using explicit constructors and conversion operators to avoid implicit conversion

Before C++11, a constructor with a single parameter was considered a converting constructor (because it takes a value of another type and creates a new instance of the type out of it). With C++11, every constructor without the explicit specifier is considered a converting constructor. Such a constructor defines an implicit conversion from the type or types of its arguments to the type of the class. Classes can also define converting operators that convert the type of the class to another specified type. All of these are useful in some cases but can create problems in other cases. In this recipe, we will learn how to use explicit constructors and conversion operators.

Getting ready

For this recipe, you need to be familiar with converting constructors and converting operators. In this recipe, you will learn how to write explicit constructors and conversion operators to avoid implicit conversions to and from a type. The use of explicit constructors and conversion operators (called user-defined conversion functions) enables the compiler to report implicit conversions as errors (which, in some cases, are coding errors) and allows developers to spot those errors quickly and fix them.

How to do it...

To declare explicit constructors and explicit conversion operators (regardless of whether they are functions or function templates), use the explicit specifier in the declaration.

The following example shows both an explicit constructor and an explicit converting operator:

struct handle_t
{
  explicit handle_t(int const h) : handle(h) {}
  explicit operator bool() const { return handle != 0; };
private:
  int handle;
};

How it works...

To understand why explicit constructors are necessary and how they work, we will first look at converting constructors. The following class, foo, has three constructors: a default constructor (without parameters), a constructor that takes an int, and a constructor that takes two parameters, an int and a double. They don't do anything except print a message. As of C++11, these are all considered converting constructors. The class also has a conversion operator that converts a value of the foo type to a bool:

struct foo
{
  foo()
  { std::cout << "foo()" << '\n'; }
  foo(int const a)
  { std::cout << "foo(a)" << '\n'; }
  foo(int const a, double const b)
  { std::cout << "foo(a, b)" << '\n'; }
  operator bool() const { return true; }
};

Based on this, the following definitions of objects are possible (note that the comments represent the console's output):

foo f1;              // foo()
foo f2 {};           // foo()
foo f3(1);           // foo(a)
foo f4 = 1;          // foo(a)
foo f5 { 1 };        // foo(a)
foo f6 = { 1 };      // foo(a)
foo f7(1, 2.0);      // foo(a, b)
foo f8 { 1, 2.0 };   // foo(a, b)
foo f9 = { 1, 2.0 }; // foo(a, b)

The variables f1 and f2 invoke the default constructor. f3, f4, f5, and f6 invoke the constructor that takes an int. Note that all the definitions of these objects are equivalent, even if they look different (f3 is initialized using the functional form, f4 and f6 are copy initialized, and f5 is directly initialized using brace-init-list). Similarly, f7, f8, and f9 invoke the constructor with two parameters.

It may be important to note that if foo defines a constructor that takes an std::initializer_list, then all the initializations using a non-empty {} would resolve to that constructor:

foo(std::initializer_list<int> l)
{ std::cout << "foo(l)" << '\n'; }

In this case, f5 and f6 will print foo(l), while f8 and f9 will generate compiler errors because all the elements of the initializer list must be integers, and converting 2.0 to int is a narrowing conversion that list initialization does not allow. The definition of f2, which uses empty braces, still invokes the default constructor.

These may all look right, but the implicit conversion constructors enable scenarios where the implicit conversion may not be what we wanted. First, let's look at some correct examples:

void bar(foo const f)
{
}
bar({});             // foo()
bar(1);              // foo(a)
bar({ 1, 2.0 });     // foo(a, b)

The conversion operator to bool from the foo class also enables us to use foo objects where Boolean values are expected. Here is an example:

bool flag = f1;                // OK, expect bool conversion
if(f2) { /* do something */ }  // OK, expect bool conversion
std::cout << f3 + f4 << '\n';  // wrong, expect foo addition
if(f5 == f6) { /* do more */ } // wrong, expect comparing foos

The first two are examples where foo is expected to be used as a Boolean. However, the last two with addition and test for equality are probably incorrect, as we most likely expect to add foo objects and test foo objects for equality, not the Booleans they implicitly convert to.

Perhaps a more realistic example to understand where problems could arise would be to consider a string buffer implementation. This would be a class that contains an internal buffer of characters.

This class provides several conversion constructors: a default constructor, a constructor that takes a size_t parameter representing the size of the buffer to preallocate, and a constructor that takes a pointer to char that should be used to allocate and initialize the internal buffer. Succinctly, the implementation of the string buffer that we use for this exemplification looks like this:

class string_buffer
{
public:
  string_buffer() {}
  string_buffer(size_t const size) {}
  string_buffer(char const * const ptr) {}
  size_t size() const { return ...; }
  operator bool() const { return ...; }
  operator char * const () const { return ...; }
};

Based on this definition, we could construct the following objects:

std::shared_ptr<char> str;
string_buffer b1;            // calls string_buffer()
string_buffer b2(20);        // calls string_buffer(size_t const)
string_buffer b3(str.get()); // calls string_buffer(char const*)

The object b1 is created using the default constructor and thus has an empty buffer; b2 is initialized using the constructor with a single parameter where the value of the parameter represents the size in terms of the characters of the internal buffer; and b3 is initialized with an existing buffer, which is used to define the size of the internal buffer and copy its value into the internal buffer. However, the same definition also enables the following object definitions:

enum ItemSizes {DefaultHeight, Large, MaxSize};
string_buffer b4 = 'a';
string_buffer b5 = MaxSize;

In this case, b4 is initialized with a char. Since an implicit conversion from char to size_t exists, the constructor that takes a size_t will be called. The intention here is not necessarily clear; perhaps it should have been "a" instead of 'a', in which case the third constructor would have been called.

However, b5 is most likely an error, because MaxSize is an enumerator representing an ItemSizes and should have nothing to do with a string buffer size. These erroneous situations are not flagged by the compiler in any way. The implicit conversion of unscoped enums to int is a good argument for preferring to use scoped enums (declared with enum class), which do not have this implicit conversion. If ItemSizes was a scoped enum, the situation described here would not appear.
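The difference between the two kinds of enumerations can be checked at compile time. The following is a small sketch that assumes a C++17 compiler; the names plain_sizes and scoped_sizes are hypothetical stand-ins for ItemSizes:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical unscoped and scoped variants of the ItemSizes enumeration.
enum plain_sizes { DefaultHeight, Large, MaxSize };
enum class scoped_sizes { DefaultHeight, Large, MaxSize };

// An unscoped enumerator converts implicitly to an integer type such as size_t...
constexpr bool unscoped_converts =
  std::is_convertible_v<plain_sizes, std::size_t>;
// ...while a scoped enumerator does not.
constexpr bool scoped_converts =
  std::is_convertible_v<scoped_sizes, std::size_t>;

static_assert(unscoped_converts, "unscoped enums convert implicitly");
static_assert(!scoped_converts, "scoped enums need an explicit cast");
```

With the scoped variant, an initialization such as string_buffer b5 = scoped_sizes::MaxSize; would be rejected by the compiler.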

When using the explicit specifier in the declaration of a constructor, that constructor becomes an explicit constructor and no longer allows implicit constructions of objects of a class type. To exemplify this, we will slightly change the string_buffer class to declare all constructors as explicit:

class string_buffer
{
public:
  explicit string_buffer() {}
  explicit string_buffer(size_t const size) {}
  explicit string_buffer(char const * const ptr) {}
  explicit operator bool() const { return ...; }
  explicit operator char * const () const { return ...; }
};

The change here is minimal, but the definitions of b4 and b5 in the earlier example no longer compile. This is because copy initialization considers only non-explicit constructors, so the implicit conversions from char or int to string_buffer are no longer available. The result is compiler errors for both b4 and b5. Note that b1, b2, and b3 are still valid definitions, even if the constructors are explicit.

The only way to fix the problem, in this case, is to provide an explicit cast from char or int to string_buffer:

string_buffer b4 = string_buffer('a');
string_buffer b5 = static_cast<string_buffer>(MaxSize);
string_buffer b6 = string_buffer{ "a" };

With explicit constructors, the compiler is able to immediately flag erroneous situations and developers can react accordingly, either fixing the initialization with a correct value or providing an explicit cast.

This is only the case when initialization is done with copy initialization and not when using functional or universal initialization.

The following definitions are still possible (and wrong) with explicit constructors:

string_buffer b7{ 'a' };
string_buffer b8('a');
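The distinction between the two forms of initialization can also be expressed with type traits. The sketch below uses a hypothetical, reduced class called buffer_sketch (not the string_buffer from this recipe) and assumes a C++17 compiler:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical reduced version of string_buffer with one explicit constructor.
class buffer_sketch
{
public:
  explicit buffer_sketch(std::size_t const size) : size_(size) {}
  std::size_t size() const { return size_; }
private:
  std::size_t size_;
};

// Copy initialization (buffer_sketch b = 'a';) needs an implicit conversion.
constexpr bool copy_init_from_char =
  std::is_convertible_v<char, buffer_sketch>;
// Direct initialization (buffer_sketch b('a');) only needs a usable constructor.
constexpr bool direct_init_from_char =
  std::is_constructible_v<buffer_sketch, char>;

static_assert(!copy_init_from_char, "explicit blocks copy initialization");
static_assert(direct_init_from_char, "direct initialization still compiles");
```

In other words, explicit removes the implicit conversion path but does not stop a char from being promoted to size_t once the constructor is called directly.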

Similar to constructors, conversion operators can be declared explicit (as shown earlier). In this case, the implicit conversions from the object type to the type specified by the conversion operator are no longer possible and require an explicit cast. Considering b4 and b5, which are the string_buffer objects we defined earlier, the following are no longer possible with an explicit conversion operator bool:

std::cout << b4 + b5 << '\n'; // error
if(b4 == b5) {}               // error

Instead, they require explicit conversion to bool:

std::cout << static_cast<bool>(b4) + static_cast<bool>(b5);
if(static_cast<bool>(b4) == static_cast<bool>(b5)) {}

The addition of two bool values does not make much sense. The preceding example is intended only to show how an explicit cast is required in order to make the statement compile. The error issued by the compiler when there is no explicit static cast should help you figure out that the expression itself is wrong and something else was probably intended.
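It may help to know that an explicit operator bool is still applied without a cast wherever the language performs a contextual conversion to bool, such as the condition of an if statement or the operands of !, &&, and ||. The following sketch uses a hypothetical checked_handle type modeled on handle_t; the helper functions are assumptions made for the demonstration:

```cpp
// Hypothetical type modeled on handle_t from the How to do it... section.
struct checked_handle
{
  explicit checked_handle(int const h) : handle(h) {}
  explicit operator bool() const { return handle != 0; }
private:
  int handle;
};

// The condition of an if statement is contextually converted to bool.
inline bool is_valid(checked_handle const& h)
{
  if (h) return true;
  return false;
}

// The operands of && are contextually converted to bool as well.
inline bool both_valid(checked_handle const& a, checked_handle const& b)
{
  return a && b;
}

// The operand of ! is also contextually converted to bool.
inline bool is_invalid(checked_handle const& h)
{
  return !h;
}
```

Expressions such as a + b or a == b are not contexts of this kind, which is why they stop compiling once the operator is explicit.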

See also

  • Understanding uniform initialization to see how brace-initialization works
 

Using unnamed namespaces instead of static globals

The larger a program, the greater the chances are you could run into name collisions when your program is built from multiple translation units. Functions or variables that are declared in a source file, and are supposed to be local to the translation unit, may collide with other similar functions or variables declared in another translation unit.

That is because all the symbols that are not declared static have external linkage and their names must be unique throughout the program. The typical C solution for this problem is to declare those symbols as static, changing their linkage from external to internal and therefore making them local to a translation unit. An alternative is to prefix the names with the name of the module or library they belong to. In this recipe, we will look at the C++ solution for this problem.

Getting ready

In this recipe, we will discuss concepts such as global functions and static functions, as well as variables, namespaces, and translation units. We expect that you have a basic understanding of these concepts. Apart from these, it is required that you understand the difference between internal and external linkage; this is key for this recipe.

How to do it...

When you are in a situation where you need to declare global symbols as static to avoid linkage problems, you should prefer to use unnamed namespaces:

  1. Declare a namespace without a name in your source file.
  2. Put the definition of the global function or variable in the unnamed namespace without making them static.

The following example shows two functions called print() in two different translation units; each of them is defined in an unnamed namespace:

// file1.cpp
namespace
{
  void print(std::string message)
  {
    std::cout << "[file1] " << message << '\n';
  }
}
void file1_run()
{
  print("run");
}
// file2.cpp
namespace
{
  void print(std::string message)
  {
    std::cout << "[file2] " << message << '\n';
  }
}
void file2_run()
{
  print("run");
}

How it works...

When a function is declared in a translation unit, it has external linkage. This means two functions with the same name from two different translation units would generate a linkage error because it is not possible to have two symbols with the same name. The way this problem is solved in C, and by some in C++ also, is to declare the function or variable as static and change its linkage from external to internal. In this case, its name is no longer exported outside the translation unit, and the linkage problem is avoided.

The proper solution in C++ is to use unnamed namespaces. When you define a namespace like the ones shown previously, the compiler transforms it into the following:

// file1.cpp
namespace _unique_name_ {}
using namespace _unique_name_;
namespace _unique_name_
{
  void print(std::string message)
  {
    std::cout << "[file1] " << message << '\n';
  }
}
void file1_run()
{
  print("run");
}

First of all, it declares a namespace with a unique name (what the name is and how it generates that name is a compiler implementation detail and should not be a concern). At this point, the namespace is empty, and the purpose of this line is to basically establish the namespace. Second, a using directive brings everything from the _unique_name_ namespace into the current namespace. Third, the namespace, with the compiler-generated name, is defined as it was in the original source code (when it had no name).

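By defining the translation unit local print() functions in an unnamed namespace, they are visible only within their own translation unit, and the two definitions no longer produce linkage errors. Before C++11, this worked because the names kept their external linkage but were made unique by the compiler-generated namespace name; since C++11, members of an unnamed namespace have internal linkage, just like names declared static.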
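One practical difference from static is that an unnamed namespace can also hold types, whereas static cannot be applied to a class declaration. The following single-file sketch illustrates this; message_formatter, make_message, and file1_message are hypothetical names introduced only for this example:

```cpp
#include <string>

namespace
{
  // A type local to this translation unit; `static` could not be used here.
  struct message_formatter
  {
    std::string prefix;
    std::string format(std::string const& text) const
    {
      return prefix + text;
    }
  };

  // A helper function that is also local to this translation unit.
  std::string make_message(std::string const& text)
  {
    return message_formatter{ "[file1] " }.format(text);
  }
}

// Only this function is meant to be used from other translation units.
std::string file1_message()
{
  return make_message("run");
}
```

Another source file could define its own message_formatter and make_message in an unnamed namespace without conflicting with these.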

Unnamed namespaces also work in a perhaps more obscure situation involving templates. Prior to C++11, template non-type arguments could not be names with internal linkage, so using static variables was not possible. On the other hand, symbols in an unnamed namespace had external linkage at that time and could be used as template arguments. Although this linkage restriction for template non-type arguments was lifted in C++11, at the time of writing it is still present in the latest version of the VC++ compiler. This problem is shown in the following example:

template <int const& Size>
class test {};
static int Size1 = 10;
namespace
{
  int Size2 = 10;
}
test<Size1> t1;
test<Size2> t2;

In this snippet, the declaration of the t1 variable produces a compiler error with VC++ because the non-type argument expression, Size1, is declared static and has internal linkage. On the other hand, the declaration of the t2 variable is accepted because Size2 is declared in an unnamed namespace. (Note that compiling this snippet with Clang and GCC does not produce an error.)

See also

  • Using inline namespaces for symbol versioning to learn how to version your source code using inline namespaces and conditional compilation
 

Using inline namespaces for symbol versioning

The C++11 standard has introduced a new type of namespace called inline namespaces, which are basically a mechanism that makes declarations from a nested namespace look and act like they were part of the surrounding namespace. Inline namespaces are declared using the inline keyword in the namespace declaration (unnamed namespaces can also be inlined). This is a helpful feature for library versioning, and in this recipe, we will learn how to version source code using inline namespaces and conditional compilation.

Getting ready

In this recipe, we will discuss namespaces and nested namespaces, templates and template specializations, and conditional compilation using preprocessor macros. Familiarity with these concepts is required in order to proceed with this recipe.

How to do it...

To provide multiple versions of a library and let the user decide what version to use, do the following:

  • Define the content of the library inside a namespace.
  • Define each version of the library or parts of it inside an inner inline namespace.
  • Use preprocessor macros and #if directives to enable a particular version of the library.

The following example shows a library that has two versions that clients can use:

namespace modernlib
{
  #ifndef LIB_VERSION_2
  inline namespace version_1
  {
    template<typename T>
    int test(T value) { return 1; }
  }
  #endif
  #ifdef LIB_VERSION_2
  inline namespace version_2
  {
    template<typename T>
    int test(T value) { return 2; }
  }
  #endif
}

How it works...

A member of an inline namespace is treated as if it was a member of the surrounding namespace. Such a member can be partially specialized, explicitly instantiated, or explicitly specialized. This is a transitive property, which means that if a namespace, A, contains an inline namespace, B, that contains an inline namespace, C, then the members of C appear as if they were members of both B and A and the members of B appear as if they were members of A.
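The transitive behavior described here can be sketched with the namespaces A, B, and C from the text. The functions from_b(), from_c(), and sum_via_a() are hypothetical and exist only for this illustration:

```cpp
namespace A
{
  inline namespace B
  {
    inline namespace C
    {
      // Reachable as A::B::C::from_c(), A::B::from_c(), and A::from_c().
      inline int from_c() { return 3; }
    }
    // Reachable as A::B::from_b() and A::from_b().
    inline int from_b() { return 2; }
  }
}

// Both functions are used as if they were declared directly in A.
inline int sum_via_a()
{
  return A::from_b() + A::from_c();
}
```

The fully qualified names remain valid too, so code that names B or C explicitly keeps working.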

To better understand why inline namespaces are helpful, let's consider the case of developing a library that evolves over time from a first version to a second version (and further on). This library defines all its types and functions under a namespace called modernlib. In the first version, this library could look like this:

namespace modernlib
{
  template<typename T>
  int test(T value) { return 1; }
}

A client of the library can make the following call and get back the value 1:

auto x = modernlib::test(42);

However, the client might decide to specialize the template function test() as follows:

struct foo { int a; };
namespace modernlib
{
  template<>
  int test(foo value) { return value.a; }
}
auto y = modernlib::test(foo{ 42 });

In this case, the value of y is no longer 1 but 42 because the user-specialized function gets called.

Everything is working correctly so far, but as a library developer, you decide to create a second version of the library, yet still ship both the first and the second version and let the user control what to use with a macro. In this second version, you provide a new implementation of the test() function that no longer returns 1, but 2. To be able to provide both the first and second implementations, you put them in nested namespaces called version_1 and version_2 and conditionally compile the library using preprocessor macros:

namespace modernlib
{
  namespace version_1
  {
    template<typename T>
    int test(T value) { return 1; }
  }
  #ifndef LIB_VERSION_2
  using namespace version_1;
  #endif
  namespace version_2
  {
    template<typename T>
    int test(T value) { return 2; }
  }
  #ifdef LIB_VERSION_2
  using namespace version_2;
  #endif
}

Suddenly, the client code will break, regardless of whether it uses the first or second version of the library. This is because the test function is now inside a nested namespace, and the specialization for foo is done in the modernlib namespace, when it should actually be done in modernlib::version_1 or modernlib::version_2: the specialization of a template is required to be done in the same namespace where the template was declared.

In this case, the client needs to change the code, like this:

#define LIB_VERSION_2
#include "modernlib.h"
struct foo { int a; };
namespace modernlib
{
  namespace version_2
  {
    template<>
    int test(foo value) { return value.a; }
  }
}

This is a problem because the library leaks implementation details, and the client needs to be aware of those in order to do template specialization. These internal details are hidden with inline namespaces in the manner shown in the How to do it... section of this recipe. With that definition of the modernlib library, the client code with the specialization of the test() function in the modernlib namespace is no longer broken, because either version_1::test() or version_2::test() (depending on what version the client is actually using) acts as being part of the enclosing modernlib namespace when template specialization is done. The details of the implementation are now hidden to the client, who only sees the surrounding namespace, modernlib.
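Putting the library and the client side by side in one file gives a compact sketch of the behavior just described. The library part mirrors the listing from the How to do it... section in a slightly condensed form; client_foo is a hypothetical client type, and LIB_VERSION_2 is assumed not to be defined:

```cpp
// Library side (would normally live in a header such as "modernlib.h").
namespace modernlib
{
#ifndef LIB_VERSION_2
  inline namespace version_1
  {
    template<typename T>
    int test(T /*value*/) { return 1; }
  }
#else
  inline namespace version_2
  {
    template<typename T>
    int test(T /*value*/) { return 2; }
  }
#endif
}

// Client side: a hypothetical type with its own specialization of test().
struct client_foo { int a; };

namespace modernlib
{
  // Specialized in modernlib, without naming version_1 or version_2.
  template<>
  inline int test(client_foo value) { return value.a; }
}
```

The client specialization stays the same whichever version the macro selects, since it never mentions the versioned namespace.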

However, you should keep in mind that the namespace std is reserved for the standard and should never be inlined. Also, a namespace should not be defined inline if it was not inline in its first definition.

See also

  • Using unnamed namespaces instead of static globals to explore anonymous namespaces and learn how they help
  • Conditionally compiling your source code in Chapter 4, Preprocessing and Compilation, to learn the various options for performing conditional compilation
 

Using structured bindings to handle multi-return values

Returning multiple values from a function is very common, yet there is no first-class solution in C++ to make it possible in a straightforward way. Developers have to choose between returning multiple values through reference parameters to a function, defining a structure to contain the multiple values, or returning an std::pair or std::tuple. The first two use named variables, which gives them the advantage that they clearly indicate the meaning of the return value, but have the disadvantage that they have to be explicitly defined. std::pair has its members called first and second, while std::tuple has unnamed members that can only be retrieved with a function call, but can be copied to named variables using std::tie(). None of these solutions are ideal.

C++17 extends the semantic use of std::tie() into a first-class core language feature that enables unpacking the values of a tuple into named variables. This feature is called structured bindings.

Getting ready

For this recipe, you should be familiar with the standard utility types std::pair and std::tuple and the utility function std::tie().

How to do it...

To return multiple values from a function using a compiler that supports C++17, you should do the following:

  1. Use an std::tuple for the return type:
    std::tuple<int, std::string, double> find()
    {
      return std::make_tuple(1, "marius", 1234.5);
    }
    
  2. Use structured bindings to unpack the values of the tuple into named objects:
    auto [id, name, score] = find();
    
  3. Use decomposition declaration to bind the returned values to the variables inside an if statement or switch statement:
    if (auto [id, name, score] = find(); score > 1000)
    {
      std::cout << name << '\n';
    }
    

How it works...

Structured bindings are a language feature that works just like std::tie(), except that we don't have to define named variables for each value that needs to be unpacked explicitly with std::tie(). With structured bindings, we define all the named variables in a single definition using the auto specifier so that the compiler can infer the correct type for each variable.
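The two approaches can be compared side by side in the following sketch, which assumes a C++17 compiler. The find_record() function and the two describe functions are hypothetical names used only here:

```cpp
#include <string>
#include <tuple>

// Hypothetical function that returns several values at once.
inline std::tuple<int, std::string, double> find_record()
{
  return std::make_tuple(1, "marius", 1234.5);
}

// std::tie(): the variables must be declared first and are assigned afterward.
inline std::string describe_with_tie()
{
  int id = 0;
  std::string name;
  double score = 0;
  std::tie(id, name, score) = find_record();
  return name + " #" + std::to_string(id) +
         (score > 1000 ? " (high)" : " (low)");
}

// Structured bindings: the names are declared and initialized in one step.
inline std::string describe_with_bindings()
{
  auto [id, name, score] = find_record();
  return name + " #" + std::to_string(id) +
         (score > 1000 ? " (high)" : " (low)");
}
```

Both functions produce the same string; the second one simply needs no separate variable declarations.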

To exemplify this, let's consider the case of inserting items into an std::map. The insert method returns an std::pair containing an iterator for the inserted element or the element that prevented the insertion, and a Boolean indicating whether the insertion was successful or not. The following code is very explicit and the use of second or first->second makes the code harder to read because you need to constantly figure out what they represent:

std::map<int, std::string> m;
auto result = m.insert({ 1, "one" });
std::cout << "inserted = " << result.second << '\n'
          << "value = " << result.first->second << '\n';

The preceding code can be made more readable with the use of std::tie, which unpacks tuples into individual objects (and works with std::pair because std::tuple has a converting assignment from std::pair):

std::map<int, std::string> m;
std::map<int, std::string>::iterator it;
bool inserted;
std::tie(it, inserted) = m.insert({ 1, "one" });
std::cout << "inserted = " << inserted << '\n'
          << "value = " << it->second << '\n';
std::tie(it, inserted) = m.insert({ 1, "two" });
std::cout << "inserted = " << inserted << '\n'
          << "value = " << it->second << '\n';

The code is not necessarily simpler because it requires defining the objects that the pair is unpacked to in advance. Similarly, the more elements the tuple has, the more objects you need to define, but using named objects makes the code easier to read.

C++17 structured bindings elevate unpacking tuple elements into named objects to the rank of a language feature; they do not require the use of std::tie(), and objects are initialized when declared:

std::map<int, std::string> m;
{
  auto [it, inserted] = m.insert({ 1, "one" });
  std::cout << "inserted = " << inserted << '\n'
            << "value = " << it->second << '\n';
}
{
  auto [it, inserted] = m.insert({ 1, "two" });
  std::cout << "inserted = " << inserted << '\n'
            << "value = " << it->second << '\n';
}

The use of multiple blocks in the preceding example is necessary because variables cannot be redeclared in the same block, and structured bindings imply a declaration using the auto specifier. Therefore, if you need to make multiple calls, as in the preceding example, and use structured bindings, you must either use different variable names or multiple blocks. An alternative to that is to avoid structured bindings and use std::tie(), because it can be called multiple times with the same variables, so you only need to declare them once.

In C++17, it is also possible to declare variables in if and switch statements in the form if(init; condition) and switch(init; condition), respectively. This could be combined with structured bindings to produce simpler code. Let's look at an example:

if(auto [it, inserted] = m.insert({ 1, "two" }); inserted)
{ std::cout << it->second << '\n'; }

In the preceding snippet, we attempted to insert a new value into a map. The result of the call is unpacked into two variables, it and inserted, defined in the scope of the if statement in the initialization part. Then, the condition of the if statement is evaluated from the value of the inserted variable.
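The names introduced in the initialization part are also visible in the else branch, which can be handy for reporting the element that prevented the insertion. The following sketch wraps this in a hypothetical helper function called insert_or_report():

```cpp
#include <map>
#include <string>

// Hypothetical helper: tries to insert and describes what happened.
inline std::string insert_or_report(std::map<int, std::string>& m,
                                    int key, std::string const& value)
{
  if (auto [it, inserted] = m.insert({ key, value }); inserted)
  {
    return "inserted: " + it->second;
  }
  else
  {
    // it and inserted are still in scope here; it refers to the existing element.
    return "already present: " + it->second;
  }
}
```

After the if statement ends, it and inserted go out of scope, so the same names can be reused by a later statement.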

There's more...

Although we focused on binding names to the elements of tuples, structured bindings can be used in a broader scope because they also support binding to array elements or data members of a class. If you want to bind to the elements of an array, you must provide a name for every element of the array; otherwise, the declaration is ill-formed. The following is an example of binding to array elements:

int arr[] = { 1,2 };
auto [a, b] = arr;
auto& [x, y] = arr;
arr[0] += 10;
arr[1] += 10;
std::cout << arr[0] << ' ' << arr[1] << '\n'; // 11 12
std::cout << a << ' ' << b << '\n';           // 1 2
std::cout << x << ' ' << y << '\n';           // 11 12

In this example, arr is an array with two elements. We first bind a and b to its elements, and then we bind the x and y references to its elements. Changes made to the elements of the array are not visible through the variables a and b but are visible through the x and y references, as shown in the comments next to the statements that print these values to the console. This happens because the first binding creates a copy of the array, and a and b are bound to the elements of that copy.
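
The same mechanism works in the other direction: writing through a reference binding modifies the array itself. The following sketch assumes a hypothetical swap_elements() helper that is not part of the recipe:

```cpp
#include <utility>

// Hypothetical helper: swaps the two elements of an array through
// reference bindings.
void swap_elements(int (&arr)[2])
{
   auto& [first, second] = arr;  // first and second name arr[0] and arr[1]
   std::swap(first, second);     // modifies the array passed by the caller
}
```

Had the binding been declared with auto instead of auto&, the swap would have affected only a copy of the array.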

As we already mentioned, it's also possible to bind to data members of a class. The following restrictions apply:

  • Binding is possible only for non-static data members of the class.
  • The class cannot have anonymous union members.
  • The number of identifiers must match the number of non-static data members of the class.

The binding of identifiers occurs in the order of the declaration of the data members, which can include bitfields. An example is shown here:

struct foo
{
   int         id;
   std::string name;
};
foo f{ 42, "john" };
auto [i, n] = f;
auto& [ri, rn] = f;
f.id = 43;
std::cout << f.id << ' ' << f.name << '\n';   // 43 john
std::cout << i << ' ' << n << '\n';           // 42 john
std::cout << ri << ' ' << rn << '\n';         // 43 john

Again, changes to the foo object are not visible through the variables i and n but are visible through ri and rn. This is because each identifier in a structured binding becomes the name of an lvalue that refers to a data member of the class (or, in the case of an array, to an element of the array). With auto, that member belongs to a hidden copy of the object; with auto&, it belongs to the original object. Note, however, that the identifier itself is not a reference: its referenced type is the type of the corresponding data member (or array element).
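
The point about the referenced type can be checked with decltype. The sketch below repeats the foo struct from the example and wraps the declarations in a hypothetical check_binding_types() function; the const binding is an addition made for illustration:

```cpp
#include <string>
#include <type_traits>

struct foo
{
   int         id;
   std::string name;
};

// Hypothetical helper used only to host the declarations.
bool check_binding_types()
{
   foo f{ 42, "john" };

   auto        [i, n]   = f;  // names refer to members of a hidden copy of f
   auto&       [ri, rn] = f;  // names refer to members of f itself
   auto const& [ci, cn] = f;  // read-only names for the members of f

   // decltype applied to a structured binding yields the referenced type:
   // the member type plus the cv-qualifiers of the declaration, and never
   // a reference type, even when the declaration uses auto&.
   static_assert(std::is_same_v<decltype(i),  int>);
   static_assert(std::is_same_v<decltype(ri), int>);
   static_assert(std::is_same_v<decltype(ci), int const>);
   static_assert(std::is_same_v<decltype(rn), std::string>);

   ri = 43;  // writes to f.id; the copy behind i is unaffected
   return f.id == 43 && ci == 43 && i == 42;
}
```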

The C++20 standard introduced a series of improvements to structured bindings, including the following:

  • The static and thread_local storage-class specifiers can be included in the declaration of a structured binding.
  • The [[maybe_unused]] attribute can be applied to the declaration of a structured binding. Some compilers, such as Clang and GCC, already supported this feature.
  • Structured binding identifiers can be captured in lambdas. All identifiers, including those bound to bitfields, can be captured by value. All identifiers except those bound to bitfields can also be captured by reference.

These changes enable us to write the following:

foo f{ 42, "john" };
auto [i, n] = f;
auto l1 = [i] {std::cout << i; };
auto l2 = [=] {std::cout << i; };
auto l3 = [&i] {std::cout << i; };
auto l4 = [&] {std::cout << i; };

These examples show the various ways structured bindings can be captured in lambdas in C++20.
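
The other two C++20 improvements can be sketched in the same spirit. The next_id() and first_of() functions are hypothetical helpers, and the code assumes a compiler with C++20 support (older language modes may accept it only with a warning):

```cpp
#include <utility>

// Hypothetical helper: hands out increasing ids. The pair holding the
// state is created once, on the first call.
int next_id()
{
   // C++20: a structured binding with static storage duration
   static auto [counter, step] = std::pair<int, int>{ 0, 1 };
   counter += step;
   return counter;
}

// Hypothetical helper: only one of the two bound names is used.
int first_of(std::pair<int, int> const& p)
{
   // C++20: the attribute silences warnings about unused names
   [[maybe_unused]] auto const& [first, second] = p;
   return first;
}
```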

See also

  • Using auto whenever possible to understand how automatic type deduction works in C++
  • Using lambdas with standard algorithms in Chapter 3, Exploring Functions to learn how lambdas can be used with standard library general-purpose algorithms
  • Providing metadata to the compiler with attributes in Chapter 4, Preprocessing and Compilation, to learn about providing hints to the compiler with the use of standard attributes
 

Simplifying code with class template argument deduction

Templates are ubiquitous in C++, but having to specify template arguments all the time can be annoying. There are cases when the compiler can actually infer the template arguments from the context. This feature, available in C++17, is called class template argument deduction and enables the compiler to deduce the missing template arguments from the type of the initializer. In this recipe, we will learn how to take advantage of this feature.

How to do it...

In C++17, you can skip specifying template arguments and let the compiler deduce them in the following cases:

  • When you declare a variable or a variable template and initialize it:
    std::pair   p{ 42, "demo" };  // deduces std::pair<int, char const*>
    std::vector v{ 1, 2 };        // deduces std::vector<int>
    std::less   l;                // deduces std::less<void>
    
  • When you create an object using a new expression:
    template <class T>
    struct foo
    {
       foo(T v) :data(v) {}
    private:
       T data;
    };
    auto f = new foo(42);
    
  • When you perform function-like cast expressions:
    std::mutex mx;
    // deduces std::lock_guard<std::mutex>
    auto lock = std::lock_guard(mx);
    std::vector<int> v;
    // deduces std::back_insert_iterator<std::vector<int>>
    std::fill_n(std::back_insert_iterator(v), 5, 42);
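
One way to confirm what the compiler deduces in these cases is to compare the deduced types with static_assert. The following is a minimal sketch; ctad_demo() is a hypothetical function that only hosts the declarations:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper that hosts the declarations and returns the vector.
std::vector<int> ctad_demo()
{
   std::pair   p{ 42, "demo" };
   std::vector v{ 1, 2 };
   std::less   l;

   static_assert(std::is_same_v<decltype(p), std::pair<int, char const*>>);
   static_assert(std::is_same_v<decltype(v), std::vector<int>>);
   static_assert(std::is_same_v<decltype(l), std::less<void>>);

   if (l(v.front(), v.back()))   // 1 < 2
   {
      // function-like cast: std::back_insert_iterator<std::vector<int>>
      std::fill_n(std::back_insert_iterator(v), 3, p.first);
   }
   return v;                     // 1, 2, 42, 42, 42
}
```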
    

How it works...

Prior to C++17, you had to specify all the template arguments when initializing variables, because all of them must be known in order to instantiate the class template, such as in the following example:

std::pair<int, char const*> p{ 42, "demo" };
std::vector<int>            v{ 1, 2 };
foo<int>                    f{ 42 };

The problem of explicitly specifying template arguments could have been avoided with a function template, such as std::make_pair(), which benefits from function template argument deduction, and allows us to write code such as the following:

auto p = std::make_pair(42, "demo");

In the case of the foo class template shown here, we can write the following make_foo() function template to enable the same behavior:

template <typename T>
foo<T> make_foo(T value)
{
   // the type is spelled out, so this does not rely on C++17 deduction
   return foo<T>{ value };
}
auto f = make_foo(42);

In C++17, this is no longer necessary in the cases listed in the How to do it... section. Let's take the following declaration as an example:

std::pair p{ 42, "demo" };

In this context, std::pair is not a type, but acts as a placeholder for a type that activates class template argument deduction. When the compiler encounters it during the declaration of a variable with initialization or a function-style cast, it builds a set of deduction guides. These deduction guides are fictional function templates that represent the constructors of a hypothetical class type. As a user, you can complement this set with user-defined deduction guides. The complete set is then used to perform template argument deduction and overload resolution.

In the case of std::pair, the compiler will build a set of deduction guides that includes the following fictional function templates (but not only these):

template <class T1, class T2>
std::pair<T1, T2> F();
template <class T1, class T2>
std::pair<T1, T2> F(T1 const& x, T2 const& y);
template <class T1, class T2, class U1, class U2>
std::pair<T1, T2> F(U1&& x, U2&& y);

These compiler-generated deduction guides are created from the constructors of the class template; if none are present, a deduction guide is created for a hypothetical default constructor. In addition, a deduction guide for a hypothetical copy constructor is always created.
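
The effect of the guide for the hypothetical copy constructor can be observed with the foo class template from the How to do it... section. In this sketch, the get() accessor and the copy_deduction_demo() function are additions made for illustration:

```cpp
#include <type_traits>

template <class T>
struct foo
{
   foo(T v) : data(v) {}
   T const& get() const { return data; }   // accessor added for this sketch
private:
   T data;
};

// Hypothetical helper that hosts the declarations.
int copy_deduction_demo()
{
   foo f1{ 42 };   // guide generated from foo(T): deduces foo<int>
   foo f2{ f1 };   // copy guide: deduces foo<int>, not foo<foo<int>>

   static_assert(std::is_same_v<decltype(f1), foo<int>>);
   static_assert(std::is_same_v<decltype(f2), foo<int>>);
   return f2.get();
}
```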

User-defined deduction guides are function signatures with a trailing return type and without the auto keyword (since they represent hypothetical constructors, which don't have a return value). They must be declared in the same scope (namespace or enclosing class) as the class template they apply to.

To understand how this works, let's consider the same example with the std::pair object:

std::pair p{ 42, "demo" };

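The type that the compiler is deducing is std::pair<int, char const*>. If we want to instruct the compiler to deduce std::string instead of char const*, then we need several user-defined deduction guides. Note that the standard does not allow a program to declare deduction guides for standard library class templates (the behavior is undefined), so treat the following snippet as an illustration of the syntax and apply the technique to your own class templates: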

namespace std {
   template <class T>
   pair(T&&, char const*)->pair<T, std::string>;
   template <class T>
   pair(char const*, T&&)->pair<std::string, T>;
   pair(char const*, char const*)->pair<std::string, std::string>;
}

These will enable us to perform the following declarations, where the type of the string "demo" is always deduced to be std::string:

std::pair  p1{ 42, "demo" };    // std::pair<int, std::string>
std::pair  p2{ "demo", 42 };    // std::pair<std::string, int>
std::pair  p3{ "42", "demo" };  // std::pair<std::string, std::string>

As you can see from this example, deduction guides do not have to be function templates.
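
The same technique applied to a user-defined class template might look like the following sketch. The demo namespace, the labeled class template, and the guide_demo() function are all hypothetical names chosen for this example:

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace demo   // hypothetical namespace
{
   template <class T>
   struct labeled
   {
      labeled(int i, T l) : id(i), label(std::move(l)) {}
      int id;
      T   label;
   };

   // User-defined guide, declared in the same namespace as the template.
   // It is not a template: it applies only to string literals and other
   // char const* arguments.
   labeled(int, char const*) -> labeled<std::string>;
}

// Hypothetical helper that hosts the declarations.
std::string guide_demo()
{
   demo::labeled a{ 1, "one" };  // user-defined guide: labeled<std::string>
   demo::labeled b{ 2, 3.5 };    // guide from the constructor: labeled<double>

   static_assert(std::is_same_v<decltype(a), demo::labeled<std::string>>);
   static_assert(std::is_same_v<decltype(b), demo::labeled<double>>);
   return a.label + (b.id == 2 ? "" : "?");
}
```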

It is important to note that class template argument deduction does not occur if the template argument list is present, regardless of the number of specified arguments. Examples of this are shown here:

std::pair<>    p1 { 42, "demo" };
std::pair<int> p2 { 42, "demo" };

Because both these declarations specify a template argument list, they are invalid and produce compiler errors.
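
In other words, deduction is all or nothing: either every template argument is written out or none is. A minimal sketch of the two valid forms follows, with all_or_nothing_demo() being a hypothetical function that hosts the declarations:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper that hosts the declarations.
bool all_or_nothing_demo()
{
   // every template argument is specified: no deduction takes place
   std::pair<int, std::string> p1{ 42, "demo" };

   // no template argument list: both arguments are deduced
   std::pair                   p2{ 42, std::string{ "demo" } };

   static_assert(std::is_same_v<decltype(p1), decltype(p2)>);
   return p1 == p2;
}
```

Passing a std::string explicitly, as in the second declaration, is also a way to obtain std::pair<int, std::string> without writing any deduction guide.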

See also

  • Understanding uniform initialization to see how brace-initialization works

About the Author

  • Marius Bancila

    Marius Bancila is a software engineer with almost two decades of experience in developing solutions for the industrial and financial sectors. He is the author of The Modern C++ Challenge and co-author of Learn C# Programming. He works as a software architect and is focused on Microsoft technologies, mainly developing desktop applications with C++ and C#. He is passionate about sharing his technical expertise with others and, for that reason, he has been recognized as a Microsoft MVP for C++ and later developer technologies since 2006. Marius lives in Romania and is active in various online communities.
