Modern C++ Programming Cookbook

4.1 (20 reviews total)
By Marius Bancila
  • Instant online access to over 7,500+ books and videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Learning Modern Core Language Features

About this book

C++ is one of the most widely used programming languages. Fast, efficient, and flexible, it is used to solve many problems. The latest versions of C++ have seen programmers change the way they code, giving up on the old-fashioned C-style programming and adopting modern C++ instead.

Beginning with the modern language features, each recipe addresses a specific problem, with a discussion that explains the solution and offers insight into how it works. You will learn major concepts about the core programming language as well as common tasks faced while building a wide variety of software. You will learn about concepts such as concurrency, performance, meta-programming, lambda expressions, regular expressions, testing, and many more in the form of recipes. These recipes will ensure you can make your applications robust and fast.

By the end of the book, you will understand the newer aspects of C++11/14/17 and will be able to overcome tasks that are time-consuming or would break your stride while developing.

Publication date:
May 2017
Publisher
Packt
Pages
590
ISBN
9781786465184

 

Chapter 1. Learning Modern Core Language Features

The recipes included in this chapter are as follows:

  • Using auto whenever possible
  • Creating type aliases and alias templates
  • Understanding uniform initialization
  • Understanding the various forms of non-static member initialization
  • Controlling and querying object alignment
  • Using scoped enumerations
  • Using override and final for virtual methods
  • Using range-based for loops to iterate on a range
  • Enabling range-based for loops for custom types
  • Using explicit constructors and conversion operators to avoid implicit conversion
  • Using unnamed namespaces instead of static globals
  • Using inline namespaces for symbol versioning
  • Using structured bindings to handle multi-return values
 

Introduction


The C++ language has gone through a major transformation in the past decade with the development and release of C++11 and then later with its newer versions C++14 and C++17. These new standards have introduced new concepts, simplified or extended existing syntax and semantics, and overall transformed the way we write code. C++11 looks like a new language, and code written using the new standards is called modern C++ code.

 

Using auto whenever possible


Automatic type deduction is one of the most important and widely used features in modern C++. The new C++ standards have made it possible to use auto as a placeholder for types in various contexts and let the compiler deduce the actual type. In C++11, auto can be used for declaring local variables and for the return type of a function with a trailing return type. In C++14, auto can be used for the return type of a function without specifying a trailing type and for parameter declarations in lambda expressions. Future standard versions are likely to expand the use of auto to even more cases. The use of auto in these contexts has several important benefits. Developers should be aware of them, and prefer auto whenever possible. An actual term was coined for this by Andrei Alexandrescu and promoted by Herb Sutter--almost always auto (AAA).

How to do it...

Consider using auto as a placeholder for the actual type in the following situations:

  • To declare local variables with the form auto name = expression when you do not want to commit to a specific type:
        auto i = 42;          // int 
        auto d = 42.5;        // double 
        auto s = "text";      // char const * 
        auto v = { 1, 2, 3 }; // std::initializer_list<int> 
  • To declare local variables with the auto name = type-id { expression } form when you need to commit to a specific type:
        auto b  = new char[10]{ 0 };            // char* 
        auto s1 = std::string {"text"};         // std::string
        auto v1 = std::vector<int> { 1, 2, 3 }; // std::vector<int>
        auto p  = std::make_shared<int>(42);    // std::shared_ptr<int>
  • To declare named lambda functions, with the form auto name = lambda-expression, unless the lambda needs to be passed or return to a function:
        auto upper = [](char const c) {return toupper(c); };
  • To declare lambda parameters and return values:
        auto add = [](auto const a, auto const b) {return a + b;};

 

  • To declare function return type when you don't want to commit to a specific type:
        template <typename F, typename T> 
        auto apply(F&& f, T value) 
        { 
          return f(value); 
        }

How it works...

The auto specifier is basically a placeholder for an actual type. When using auto, the compiler deduces the actual type from the following instances:

  • From the type of the expression used to initialize a variable, when auto is used to declare variables.
  • From the trailing return type or the type of the return expression of a function, when auto is used as a placeholder for the return type of a function.

In some cases, it is necessary to commit to a specific type. For instance, in the preceding example, the compiler deduces the type of s to be char const *. If the intention was to have a std::string, then the type must be specified explicitly. Similarly, the type of v was deduced as std::initializer_list<int>. However, the intention could be to have a std::vector<int>. In such cases, the type must be specified explicitly on the right side of the assignment.

There are some important benefits of using the auto specifier instead of actual types; the following is a list of, perhaps, the most important ones:

  • It is not possible to leave a variable uninitialized. This is a common mistake that developers do when declaring variables specifying the actual type, but it is not possible with auto that requires an initialization of the variable in order to deduce the type.
  • Using auto ensures that you always use the correct type and that implicit conversion will not occur. Consider the following example where we retrieve the size of a vector to a local variable. In the first case, the type of the variable is int, though the size() method returns size_t. That means an implicit conversion from size_t to int will occur. However, using auto for the type will deduce the correct type, that is, size_t:
        auto v = std::vector<int>{ 1, 2, 3 }; 
        int size1 = v.size();       
        // implicit conversion, possible loss of data 
        auto size2 = v.size(); 
        auto size3 = int{ v.size() };  // error, narrowing conversion
  • Using auto promotes good object-oriented practices, such as preferring interfaces over implementations. The lesser the number of types specified the more generic the code is and more open to future changes, which is a fundamental principle of object-oriented programming.
  • It means less typing and less concern for actual types that we don't care about anyways. It is very often that even though we explicitly specify the type, we don't actually care about it. A very common case is with iterators, but one can think of many more. When you want to iterate over a range, you don't care about the actual type of the iterator. You are only interested in the iterator itself; so, using auto saves time used for typing possibly long names and helps you focus on actual code and not type names. In the following example, in the first for loop, we explicitly use the type of the iterator. It is a lot of text to type, the long statements can actually make the code less readable, and you also need to know the type name that you actually don't care about. The second loop with the auto specifier looks simpler and saves you from typing and caring about actual types.
        std::map<int, std::string> m; 
        for (std::map<int,std::string>::const_iterator it = m.cbegin();
          it != m.cend(); ++it) 
        { /*...*/ } 

        for (auto it = m.cbegin(); it != m.cend(); ++it)
        { /*...*/ }
  • Declaring variables with auto provides a consistent coding style with the type always in the right-hand side. If you allocate objects dynamically, you need to write the type both on the left and right side of the assignment, for example, int p = new int(42). With auto, the type is specified only once on the right side.

However, there are some gotchas when using auto:

  • The auto specifier is only a placeholder for the type, not for the const/volatile and references specifiers. If you need a const/volatile and/or reference type, then you need to specify them explicitly. In the following example, foo.get() returns a reference to int; when variable x is initialized from the return value, the type deduced by the compiler is int, and not int&. Therefore, any change to x will not propagate to foo.x_. In order to do so, one should use auto&:
        class foo { 
          int x_; 
        public: 
          foo(int const x = 0) :x_{ x } {} 
          int& get() { return x_; } 
        }; 

        foo f(42); 
        auto x = f.get(); 
        x = 100; 
        std::cout << f.get() << std::endl; // prints 42
  • It is not possible to use auto for types that are not moveable:
        auto ai = std::atomic<int>(42); // error
  • It is not possible to use auto for multi-word types, such as long long, long double, or struct foo. However, in the first case, the possible workarounds are to use literals or type aliases; as for the second, using struct/class in that form is only supported in C++ for C compatibility and should be avoided anyways:
        auto l1 = long long{ 42 }; // error 
        auto l2 = llong{ 42 };     // OK 
        auto l3 = 42LL;            // OK
  • If you use the auto specifier but still need to know the type, you can do so in any IDE by putting the cursor over a variable, for instance. If you leave the IDE, however, that is not possible anymore, and the only way to know the actual type is to deduce it yourself from the initialization expression, which could probably mean searching through the code for function return types.

The auto can be used to specify the return type from a function. In C++11, this requires a trailing return type in the function declaration. In C++14, this has been relaxed, and the type of the return value is deduced by the compiler from the return expression. If there are multiple return values they should have the same type:

    // C++11 
    auto func1(int const i) -> int 
    { return 2*i; } 

    // C++14 
    auto func2(int const i) 
    { return 2*i; }

As mentioned earlier, auto does not retain const/volatile and reference qualifiers. This leads to problems with auto as a placeholder for the return type from a function. To explain this, let us consider the preceding example with foo.get(). This time we have a wrapper function called proxy_get() that takes a reference to a foo, calls get(), and returns the value returned by get(), which is an int&. However, the compiler will deduce the return type of proxy_get() as being int, not int&. Trying to assign that value to an int& fails with an error:

    class foo 
    { 
      int x_; 
    public: 
      foo(int const x = 0) :x_{ x } {} 
      int& get() { return x_; } 
    }; 

    auto proxy_get(foo& f) { return f.get(); } 

    auto f = foo{ 42 }; 
    auto& x = proxy_get(f); // cannot convert from 'int' to 'int &'

To fix this, we need to actually return auto&. However, this is a problem with templates and perfect forwarding the return type without knowing whether that is a value or a reference. The solution to this problem in C++14 is decltype(auto) that will correctly deduce the type:

    decltype(auto) proxy_get(foo& f) { return f.get(); } 
    auto f = foo{ 42 }; 
    decltype(auto) x = proxy_get(f);

The last important case where auto can be used is with lambdas. As of C++14, both lambda return type and lambda parameter types can be auto. Such a lambda is called a generic lambda because the closure type defined by the lambda has a templated call operator. The following shows a generic lambda that takes two auto parameters and returns the result of applying operator+ on the actual types:

    auto ladd = [] (auto const a, auto const b) { return a + b; }; 
    struct 
    { 
       template<typename T, typename U> 
       auto operator () (T const a, U const b) const { return a+b; } 
    } L;

This lambda can be used to add anything for which the operator+ is defined. In the following example, we use the lambda to add two integers and to concatenate to std::string objects (using the C++14 user-defined literal operator ""s):

    auto i = ladd(40, 2);            // 42 
    auto s = ladd("forty"s, "two"s); // "fortytwo"s

See also

  • Creating type aliases and alias templates
  • Understanding uniform initialization
 

Creating type aliases and alias templates


In C++, it is possible to create synonyms that can be used instead of a type name. This is achieved by creating a typedef declaration. This is useful in several cases, such as creating shorter or more meaningful names for a type or names for function pointers. However, typedef declarations cannot be used with templates to create template type aliases. An std::vector<T>, for instance, is not a type (std::vector<int> is a type), but a sort of family of all types that can be created when the type placeholder T is replaced with an actual type.

In C++11, a type alias is a name for another already declared type, and an alias template is a name for another already declared template. Both of these types of aliases are introduced with a new using syntax.

How to do it...

  • Create type aliases with the form using identifier = type-id as in the following examples:
        using byte    = unsigned char; 
        using pbyte   = unsigned char *; 
        using array_t = int[10]; 
        using fn      = void(byte, double); 

        void func(byte b, double d) { /*...*/ } 

        byte b {42}; 
        pbyte pb = new byte[10] {0}; 
        array_t a{0,1,2,3,4,5,6,7,8,9}; 
        fn* f = func;
  • Create alias templates with the form template<template-params-list> identifier = type-id as in the following examples:
        template <class T> 
        class custom_allocator { /* ... */}; 

        template <typename T> 
        using vec_t = std::vector<T, custom_allocator<T>>; 

        vec_t<int>           vi; 
        vec_t<std::string>   vs; 

For consistency and readability, you should do the following:

  • Not mix typedef and using declarations for creating aliases.
  • Use the using syntax to create names of function pointer types.

How it works...

A typedef declaration introduces a synonym (or an alias in other words) for a type. It does not introduce another type (like a class, struct, union, or enum declaration). Type names introduced with a typedef declaration follow the same hiding rules as identifier names. They can also be redeclared, but only to refer to the same type (therefore, you can have valid multiple typedef declarations that introduce the same type name synonym in a translation unit as long as it is a synonym for the same type). The following are typical examples of typedef declarations:

    typedef unsigned char   byte; 
    typedef unsigned char * pbyte; 
    typedef int             array_t[10]; 
    typedef void(*fn)(byte, double); 

    template<typename T> 
    class foo { 
      typedef T value_type; 
    }; 

    typedef std::vector<int> vint_t;

A type alias declaration is equivalent to a typedef declaration. It can appear in a block scope, class scope, or namespace scope. According to C++11 paragraph 7.1.3.2:

A typedef-name can also be introduced by an alias-declaration. The identifier following the using keyword becomes a typedef-name and the optional attribute-specifier-seq following the identifier appertains to that typedef-name. It has the same semantics as if it were introduced by the typedef specifier. In particular, it does not define a new type and it shall not appear in the type-id.

An alias-declaration is, however, more readable and more clear about the actual type that is aliased when it comes to creating aliases for array types and function pointer types. In the examples from the How to do it... section, it is easily understandable that array_t is a name for the type array of 10 integers, and fn is a name for a function type that takes two parameters of type byte and double and returns void. That is also consistent with the syntax for declaring std::function objects (for example, std::function<void(byte, double)> f).

The driving purpose of the new syntax is to define alias templates. These are templates which, when specialized, are equivalent to the result of substituting the template arguments of the alias template for the template parameters in the type-id.

It is important to take note of the following things:

  • Alias templates cannot be partially or explicitly specialized.
  • Alias templates are never deduced by template argument deduction when deducing a template parameter.
  • The type produced when specializing an alias template is not allowed to directly or indirectly make use of its own type.
 

Understanding uniform initialization


Brace-initialization is a uniform method for initializing data in C++11. For this reason, it is also called uniform initialization. It is arguably one of the most important features from C++11 that developers should understand and use. It removes previous distinctions between initializing fundamental types, aggregate and non-aggregate types, and arrays and standard containers.

Getting ready

For continuing with this recipe, you need to be familiar with direct initialization that initializes an object from an explicit set of constructor arguments and copy initialization that initializes an object from another object. The following is a simple example of both types of initialization, but for further details, you should see additional resources:

    std::string s1("test");   // direct initialization 
    std::string s2 = "test";  // copy initialization

How to do it...

To uniformly initialize objects regardless of their type, use the brace-initialization form {} that can be used for both direct initialization and copy initialization. When used with brace initialization, these are called direct list and copy list initialization.

    T object {other};   // direct list initialization 
    T object = {other}; // copy list initialization

Examples of uniform initialization are as follows:

  • Standard containers:
        std::vector<int> v { 1, 2, 3 };
        std::map<int, std::string> m { {1, "one"}, { 2, "two" }};
  • Dynamically allocated arrays:
        int* arr2 = new int[3]{ 1, 2, 3 };    
  • Arrays:
        int arr1[3] { 1, 2, 3 }; 
  • Built-in types:
        int i { 42 };
        double d { 1.2 };    
  • User-defined types:
        class foo
        {
          int a_;
          double b_;
        public:
          foo():a_(0), b_(0) {}
          foo(int a, double b = 0.0):a_(a), b_(b) {}
        };

        foo f1{}; 
        foo f2{ 42, 1.2 }; 
        foo f3{ 42 };
  • User-defined POD types:
        struct bar { int a_; double b_;};
        bar b{ 42, 1.2 };

How it works...

Before C++11 objects required different types of initialization based on their type:

  • Fundamental types could be initialized using assignment:
        int a = 42; 
        double b = 1.2;
  • Class objects could also be initialized using assignment from a single value if they had a conversion constructor (prior to C++11, a constructor with a single parameter was called a conversion constructor):
        class foo 
        { 
          int a_; 
        public: 
          foo(int a):a_(a) {} 
        }; 
        foo f1 = 42;
  • Non-aggregate classes could be initialized with parentheses (the functional form) when arguments were provided and only without any parentheses when default initialization was performed (call to the default constructor). In the next example, foo is the structure defined in the How to do it... section:
        foo f1;           // default initialization 
        foo f2(42, 1.2); 
        foo f3(42); 
        foo f4();         // function declaration

 

  • Aggregate and POD types could be initialized with brace-initialization. In the next example, bar is the structure defined in the How to do it... section:
        bar b = {42, 1.2}; 
        int a[] = {1, 2, 3, 4, 5};

Apart from the different methods of initializing the data, there are also some limitations. For instance, the only way to initialize a standard container was to first declare an object and then insert elements into it; vector was an exception because it is possible to assign values from an array that can be prior initialized using aggregate initialization. On the other hand, however, dynamically allocated aggregates could not be initialized directly.

All the examples in the How to do it... section use direct initialization, but copy initialization is also possible with brace-initialization. The two forms, direct and copy initialization, may be equivalent in most cases, but copy initialization is less permissive because it does not consider explicit constructors in its implicit conversion sequence that must produce an object directly from the initializer, whereas direct initialization expects an implicit conversion from the initializer to an argument of the constructor. Dynamically allocated arrays can only be initialized using direct initialization.

Of the classes shown in the preceding examples, foo is the one class that has both a default constructor and a constructor with parameters. To use the default constructor to perform default initialization, we need to use empty braces, that is, {}. To use the constructor with parameters, we need to provide the values for all the arguments in braces {}. Unlike non-aggregate types where default initialization means invoking the default constructor, for aggregate types, default initialization means initializing with zeros.

Initialization of standard containers, such as the vector and the map also shown above, is possible because all standard containers have an additional constructor in C++11 that takes an argument of type std::initializer_list<T>. This is basically a lightweight proxy over an array of elements of type T const. These constructors then initialize the internal data from the values in the initializer list.

The way the initialization using std::initializer_list works is the following:

  • The compiler resolves the types of the elements in the initialization list (all elements must have the same type).
  • The compiler creates an array with the elements in the initializer list.
  • The compiler creates an std::initializer_list<T> object to wrap the previously created array.
  • The std::initializer_list<T> object is passed as an argument to the constructor.

An initializer list always takes precedence over other constructors where brace-initialization is used. If such a constructor exists for a class, it will be called when brace-initialization is performed:

    class foo 
    { 
      int a_; 
      int b_; 
    public: 
      foo() :a_(0), b_(0) {} 

      foo(int a, int b = 0) :a_(a), b_(b) {} 
      foo(std::initializer_list<int> l) {} 
    }; 

    foo f{ 1, 2 }; // calls constructor with initializer_list<int>

The precedence rule applies to any function, not just constructors. In the following example, two overloads of the same function exist. Calling the function with an initializer list resolves to a call to the overload with an std::initializer_list:

    void func(int const a, int const b, int const c) 
    { 
      std::cout << a << b << c << std::endl; 
    } 

    void func(std::initializer_list<int> const l) 
    { 
      for (auto const & e : l) 
        std::cout << e << std::endl; 
    } 

    func({ 1,2,3 }); // calls second overload

This, however, has the potential of leading to bugs. Let's take, for example, the vector type. Among the constructors of the vector, there is one that has a single argument representing the initial number of elements to be allocated and another one that has an std::initializer_list as an argument. If the intention is to create a vector with a preallocated size, using the brace-initialization will not work, as the constructor with the std::initializer_list will be the best overload to be called:

    std::vector<int> v {5};

The preceding code does not create a vector with five elements, but a vector with one element with a value 5. To be able to actually create a vector with five elements, initialization with the parentheses form must be used:

    std::vector<int> v (5);

Another thing to note is that brace-initialization does not allow narrowing conversion. According to the C++ standard (refer to paragraph 8.5.4 of the standard), a narrowing conversion is an implicit conversion:

- From a floating-point type to an integer type - From long double to double or float, or from double to float, except where the source is a constant expression and the actual value after conversion is within the range of values that can be represented (even if it cannot be represented exactly) - From an integer type or unscoped enumeration type to a floating-point type, except where the source is a constant expression and the actual value after conversion will fit into the target type and will produce the original value when converted to its original type - From an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type, except where the source is a constant expression and the actual value after conversion will fit into the target type and will produce the original value when converted to its original type.

The following declarations trigger compiler errors because they require a narrowing conversion:

    int i{ 1.2 };           // error 

    double d = 47 / 13; 
    float f1{ d };          // error 
    float f2{47/13};        // OK

To fix the error, an explicit conversion must be done:

    int i{ static_cast<int>(1.2) }; 

    double d = 47 / 13; 
    float f1{ static_cast<float>(d) };

Note

A brace-initialization list is not an expression and does not have a type. Therefore, decltype cannot be used on a brace-init list, and template type deduction cannot deduce the type that matches a brace-init list.

There's more

The following sample shows several examples of direct-list-initialization and copy-list-initialization. In C++11, the deduced type of all these expressions is std::initializer_list<int>.

auto a = {42};   // std::initializer_list<int>
auto b {42};     // std::initializer_list<int>
auto c = {4, 2}; // std::initializer_list<int>
auto d {4, 2};   // std::initializer_list<int>

C++17 has changed the rules for list initialization, differentiating between the direct- and copy-list-initialization. The new rules for type deduction are as follows:

  • for copy list initialization auto deduction will deduce a std::initializer_list<T> if all elements in the list have the same type, or be ill-formed.
  • for direct list initialization auto deduction will deduce a T if the list has a single element, or be ill-formed if there is more than one element.

Base on the new rules, the previous examples would change as follows: a and c are deduced as std::initializer_list<int>; b is deduced as an int; d, which uses direct initialization and has more than one value in the brace-init-list, triggers a compiler error.

auto a = {42};   // std::initializer_list<int>
auto b {42};     // int
auto c = {4, 2}; // std::initializer_list<int>
auto d {4, 2};   // error, too many

See also

  • Using auto whenever possible
  • Understanding the various forms of non-static member initialization

 

 

Understanding the various forms of non-static member initialization


Constructors are a place where non-static class member initialization is done. Many developers prefer assignments in the constructor body. Aside from the several exceptional cases when that is actually necessary, initialization of non-static members should be done in the constructor's initializer list or, as of C++11, using default member initialization when they are declared in the class. Prior to C++11, constants and non-constant non-static data members of a class had to be initialized in the constructor. Initialization on declaration in a class was only possible for static constants. As we will see further, this limitation was removed in C++11 that allows initialization of non-statics in the class declaration. This initialization is called default member initialization and is explained in the next sections.

This recipe will explore the ways the non-static member initialization should be done.

How to do it...

To initialize non-static members of a class you should:

  • Use default member initialization for providing default values for members of classes with multiple constructors that would use a common initializer for those members (see [3] and [4] in the following code).
  • Use default member initialization for constants, both static and non-static (see [1] and [2] in the following code).
  • Use the constructor initializer list to initialize members that don't have default values, but depend on constructor parameters (see [5] and [6] in the following code).
  • Use assignment in constructors when the other options are not possible (examples include initializing data members with pointer this, checking constructor parameter values, and throwing exceptions prior to initializing members with those values or self-references of two non-static data members).

The following example shows these forms of initialization:

    struct Control 
    { 
      const int DefaultHeigh = 14;                  // [1]
      const int DefaultWidth = 80;                  // [2]

      TextVAligment valign = TextVAligment::Middle; // [3]
      TextHAligment halign = TextHAligment::Left;   // [4]

      std::string text; 

      Control(std::string const & t) : text(t)       // [5]
      {} 

      Control(std::string const & t, 
        TextVerticalAligment const va, 
        TextHorizontalAligment const ha):  
      text(t), valign(va), halign(ha)                 // [6]
      {} 
    };

How it works...

Non-static data members are supposed to be initialized in the constructor's initializer list as shown in the following example:

    struct Point 
    { 
      double X, Y; 
      Point(double const x = 0.0, double const y = 0.0) : X(x), Y(y)  {} 
    };

Many developers, however, do not use the initializer list, but prefer assignments in the constructor's body, or even mix assignments and the initializer list. That could be for several reasons--for larger classes with many members, the constructor assignments may look easier to read than long initializer lists, perhaps split on many lines, or it could be because they are familiar with other programming languages that don't have an initializer list or because, unfortunately, for various reasons they don't even know about it.

Note

It is important to note that the order in which non-static data members are initialized is the order in which they were declared in the class definition, and not the order of their initialization in a constructor initializer list. On the other hand, the order in which non-static data members are destroyed is the reversed order of construction.

Using assignments in the constructor is not efficient, as this can create temporary objects that are later discarded. If not initialized in the initializer list, non-static members are initialized via their default constructor and then, when assigned a value in the constructor's body, the assignment operator is invoked. This can lead to inefficient work if the default constructor allocates a resource (such as memory or a file) and that has to be deallocated and reallocated in the assignment operator:

    struct foo 
    { 
      foo()  
      { std::cout << "default constructor" << std::endl; } 
      foo(std::string const & text)  
      { std::cout << "constructor '" << text << "'" << std::endl; } 
      foo(foo const & other)
      { std::cout << "copy constructor" << std::endl; } 
      foo(foo&& other)  
      { std::cout << "move constructor" << std::endl; }; 
      foo& operator=(foo const & other)  
      { std::cout << "assignment" << std::endl; return *this; } 
      foo& operator=(foo&& other)  
      { std::cout << "move assignment" << std::endl; return *this;} 
      ~foo()  
      { std::cout << "destructor" << std::endl; } 
    }; 

    struct bar 
    { 
      foo f; 

      bar(foo const & value) 
      { 
        f = value; 
      } 
    }; 

    foo f; 
    bar b(f);

The preceding code produces the following output showing how data member f is first default initialized and then assigned a new value:

default constructor 
default constructor 
assignment 
destructor 
destructor

Changing the initialization from the assignment in the constructor body to the initializer list replaces the calls to the default constructor plus assignment operator with a call to the copy constructor:

    bar(foo const & value) : f(value) { }

Adding the preceding line of code produces the following output:

default constructor 
copy constructor 
destructor 
destructor

For those reasons, at least for other types than the built-in types (such as bool, char, int, float, double or pointers), you should prefer the constructor initializer list. However, to be consistent with your initialization style, you should always prefer the constructor initializer list when that is possible. There are several situations when using the initializer list is not possible; these include the following cases (but the list could be expanded with other cases):

  • If a member has to be initialized with a pointer or reference to the object that contains it, using the this pointer in the initialization list may trigger a warning with some compilers that it is used before the object is constructed.
  • If you have two data members that must contain references to each other.
  • If you want to test an input parameter and throw an exception before initializing a non-static data member with the value of the parameter.

Starting with C++11, non-static data members can be initialized when declared in the class. This is called default member initialization because it is supposed to represent initialization with default values. Default member initialization is intended for constants and for members that are not initialized based on constructor parameters (in other words members whose value does not depend on the way the object is constructed):

    enum class TextFlow { LeftToRight, RightToLeft }; 

    struct Control 
    { 
      const int DefaultHeight = 20; 
      const int DefaultWidth = 100; 

      TextFlow textFlow = TextFlow::LeftToRight; 
      std::string text; 

      Control(std::string t) : text(t) 
      {} 
    };

In the preceding example, DefaultHeight and DefaultWidth are both constants; therefore, the values do not depend on the way the object is constructed, so they are initialized when declared. The textFlow object is a non-constant non-static data member whose value also does not depend on the way the object is initialized (it could be changed via another member function), therefore, it is also initialized using default member initialization when it is declared. text, on the other hand, is also a non-constant non-static data member, but its initial value depends on the way the object is constructed and therefore it is initialized in the constructor's initializer list using a value passed as an argument to the constructor.

If a data member is initialized both with the default member initialization and constructor initializer list, the latter takes precedence and the default value is discarded. To exemplify this, let's again consider the foo class earlier and the following bar class that uses it:

    struct bar 
    { 
      foo f{"default value"}; 

      bar() : f{"constructor initializer"} 
      { 
      } 
    }; 

    bar b;

The output differs, in this case, as follows, because the value from the default initializer list is discarded, and the object is not initialized twice:

constructor
constructor initializer
destructor

Note

Using the appropriate initialization method for each member leads not only to more efficient code but also to better organized and more readable code.

 

Controlling and querying object alignment


C++11 provides standardized methods for specifying and querying the alignment requirements of a type (something that was previously possible only through compiler-specific methods). Controlling the alignment is important in order to boost performance on different processors and enable the use of some instructions that only work with data on particular alignments. For example, Intel SSE and Intel SSE2 require 16 bytes alignment of data, whereas for Intel Advanced Vector Extensions (or Intel AVX), it is highly recommended to use 32 bytes alignment. This recipe explores the alignas specifier for controlling the alignment requirements and the alignof operator that retrieves the alignment requirements of a type.

Getting ready

You should be familiar with what data alignment is and the way the compiler performs default data alignment. However, basic information about the latter is provided in the How it works... section.

How to do it...

  • To control the alignment of a type (both at the class level or data member level) or an object, use the alignas specifier:
        struct alignas(4) foo 
        { 
          char a; 
          char b; 
        }; 
        struct bar 
        { 
          alignas(2) char a; 
          alignas(8) int  b; 
        }; 
        alignas(8)   int a; 
        alignas(256) long b[4];
  • To query the alignment of a type, use the alignof operator:
        auto align = alignof(foo);

How it works...

Processors do not access memory one byte at a time, but in larger chunks of powers of twos (2, 4, 8, 16, 32, and so on). Owing to this, it is important that compilers align data in memory so that it can be easily accessed by the processor. Should this data be misaligned, the compiler has to do extra work for accessing data; it has to read multiple chunks of data, shift, and discard unnecessary bytes and combine the rest together.

C++ compilers align variables based on the size of their data type: 1 byte for bool and char, 2 bytes for short, 4 bytes for int, long and float, 8 bytes for double and long long, and so on. When it comes to structures or unions, the alignment must match the size of the largest member in order to avoid performance issues. To exemplify, let's consider the following data structures:

    struct foo1    // size = 1, alignment = 1 
    { 
      char a; 
    }; 

    struct foo2    // size = 2, alignment = 1 
    { 
      char a; 
      char b; 
    }; 

    struct foo3    // size = 8, alignment = 4 
    { 
      char a; 
      int  b; 
    };

foo1 and foo2 have different sizes, but the alignment is the same--that is, 1--because all data members are of type char, which has a size of 1. In structure foo3, the second member is an integer, whose size is 4. As a result, the alignment of members of this structure is done at addresses that are multiples of 4. To achieve that, the compiler introduces padding bytes. The structure foo3 is actually transformed into the following:

    struct foo3_ 
    { 
      char a;        // 1 byte 
      char _pad0[3]; // 3 bytes padding to put b on a 4-byte boundary 
      int  b;        // 4 bytes 
    };

Similarly, the following structure has a size of 32 bytes and an alignment of 8; that is because the largest member is a double whose size is 8. This structure, however, requires padding in several places to make sure that all members can be accessed at addresses that are multiples of 8:

    struct foo4 
    { 
      int a; 
      char b; 
      float c; 
      double d; 
      bool e; 
    };

The equivalent structure created by the compiler is as follows:

    struct foo4_ 
    { 
      int a;         // 4 bytes 
      char b;        // 1 byte 
      char _pad0[3]; // 3 bytes padding to put c on a 8-byte boundary  
      float c;       // 4 bytes 
      char _pad1[4]; // 4 bytes padding to put d on a 8-byte boundary 
      double d;      // 8 bytes 
      bool e;        // 1 byte 
      char _pad2[7]; // 7 bytes padding to make sizeof struct multiple of 8 
    };

In C++11, specifying the alignment of an object or type is done using the alignas specifier. This can take either an expression (an integral constant expression that evaluates to 0 or a valid value for an alignment), a type-id, or a parameter pack. The alignas specifier can be applied to the declaration of a variable or a class data member that does not represent a bit field, or to the declaration of a class, union, or enumeration. The type or object on which an alignas specification is applied will have the alignment requirement equal to the largest, greater than zero, expression of all alignas specifications used in the declaration.

There are several restrictions when using the alignas specifier:

  • The only valid alignments are the powers of two ( 1, 2, 4, 8, 16, 32, and so on). Any other values are illegal, and the program is considered ill-formed; that doesn't necessarily have to produce an error, as the compiler may choose to ignore the specification.
  • An alignment of 0 is always ignored.
  • If the largest alignas on a declaration is smaller than the natural alignment without any alignas specifier, then the program is also considered ill-formed.

In the following example, the alignas specifier is applied on a class declaration. The natural alignment without the alignas specifier would have been 1, but with alignas(4) it becomes 4:

    struct alignas(4) foo 
    { 
      char a; 
      char b; 
    };

In other words, the compiler transforms the preceding class into the following:

    struct foo 
    { 
      char a; 
      char b; 
      char _pad0[2]; 
    };

The alignas specifier can be applied both on the class declaration and the member data declarations. In this case, the strictest (that is, largest) value wins. In the following example, member a has a natural size of 1 and requires an alignment of 2; member b has a natural size of 4 and requires an alignment of 8, therefore, the strictest alignment would be 8. The alignment requirement of the entire class is 4, which is weaker (that is, smaller) than the strictest required alignment and therefore it will be ignored, though the compiler will produce a warning:

    struct alignas(4) foo 
    { 
      alignas(2) char a; 
      alignas(8) int  b; 
    };

The result is a structure that looks like this:

    struct foo 
    { 
      char a; 
      char _pad0[7]; 
      int b; 
      char _pad1[4]; 
    };

The alignas specifier can also be applied on variables. In the next example, variable a, that is an integer, is required to be placed in memory at a multiple of 8. The next variable, the array of 4 a, that is an integer, is required to be placed in memory at a multiple of 8. The next variable, the array of 4 longs, is required to be placed in memory at a multiple of 256. As a result, the compiler will introduce up to 244 bytes of padding between the two variables (depending on where in memory, at an address multiple of 8, the variable a is located):

    alignas(8)   int a;   
    alignas(256) long b[4]; 

    printf("%pn", &a); // eg. 0000006C0D9EF908 
    printf("%pn", &b); // eg. 0000006C0D9EFA00

Looking at the addresses, we can see that the address of a is indeed a multiple of 8, and the address of b is a multiple of 256 (hexadecimal 100).

To query the alignment of a type, we use the alignof operator. Unlike sizeof, this operator can only be applied to type-ids, and not on variables or class data members. The types on which it can be applied can be complete types, an array type, or a reference type. For arrays, the value returned is the alignment of the element type; for references, the value returned is the alignment of the referenced type. Here are several examples:

Expression

Evaluation

alignof(char)

1, because the natural alignment of char is 1

alignof(int)

4, because the natural alignment of int is 4

alignof(int*)

4 on 32-bit, 8 on 64-bit, the alignment for pointers

alignof(int[4])

4, because the natural alignment of the element type is 4

alignof(foo&)

8, because the specified alignment for class foo that is the referred type (as shown in the last example) was 8

 

Using scoped enumerations


Enumeration is a basic type in C++ that defines a collection of values, always of an integral underlying type. Their named values, that are constant, are called enumerators. Enumerations declared with keyword enum are called unscoped enumerations and enumerations declared with enum class or enum struct are called scoped enumerations. The latter ones were introduced in C++11 and are intended to solve several problems of the unscoped enumerations.

How to do it...

  • Prefer to use scoped enumerations instead of unscoped ones.
  • In order to use scoped enumerations, you should declare enumerations using enum class or enum struct:
        enum class Status { Unknown, Created, Connected };
        Status s = Status::Created;

Note

The enum class and enum struct declarations are equivalent, and throughout this recipe and the rest of the book, we will use enum class.

How it works...

Unscoped enumerations have several issues that are creating problems for developers:

  • They export their enumerators to the surrounding scope (for which reason, they are called unscoped enumerations), and that has the following two drawbacks: it can lead to name clashes if two enumerations in the same namespace have enumerators with the same name, and it's not possible to use an enumerator using its fully qualified name:
        enum Status {Unknown, Created, Connected};
        enum Codes {OK, Failure, Unknown};   // error 
        auto status = Status::Created;       // error
  • Prior to C++ 11, they could not specify the underlying type that is required to be an integral type. This type must not be larger than int, unless the enumerator value cannot fit a signed or unsigned integer. Owing to this, forward declaration of enumerations was not possible. The reason was that the size of the enumeration was not known since the underlying type was not known until values of the enumerators were defined so that the compiler could pick the appropriate integer type. This has been fixed in C++11.
  • Values of enumerators implicitly convert to int. That means you can intentionally or accidentally mix enumerations that have a certain meaning and integers (that may not even be related to the meaning of the enumeration) and the compiler will not be able to warn you:
        enum Codes { OK, Failure }; 
        void include_offset(int pixels) {/*...*/} 
        include_offset(Failure);

The scoped enumerations are basically strongly typed enumerations that behave differently than the unscoped enumerations:

  • They do not export their enumerators to the surrounding scope. The two enumerations shown earlier would change to the following, no longer generating a name collision and being possible to fully qualify the names of the enumerators:
        enum class Status { Unknown, Created, Connected }; 
        enum class Codes { OK, Failure, Unknown }; // OK 
        Codes code = Codes::Unknown;               // OK

 

  • You can specify the underlying type. The same rules for underlying types of unscoped enumerations apply to scoped enumerations too, except that the user can specify explicitly the underlying type. This also solves the problem with forward declarations since the underlying type can be known before the definition is available:
        enum class Codes : unsigned int; 

        void print_code(Codes const code) {} 

        enum class Codes : unsigned int 
        {  
           OK = 0,  
           Failure = 1,  
           Unknown = 0xFFFF0000U 
        };
  • Values of scoped enumerations no longer convert implicitly to int. Assigning the value of an enum class to an integer variable would trigger a compiler error unless an explicit cast is specified:
        Codes c1 = Codes::OK;                       // OK 
        int c2 = Codes::Failure;                    // error 
        int c3 = static_cast<int>(Codes::Failure);  // OK
 

Using override and final for virtual methods


Unlike other similar programming languages, C++ does not have a specific syntax for declaring interfaces (that are basically classes with pure virtual methods only) and also has some deficiencies related to how virtual methods are declared. In C++, the virtual methods are introduced with the virtual keyword. However, the keyword virtual is optional for declaring overrides in derived classes that can lead to confusion when dealing with large classes or hierarchies. You may need to navigate throughout the hierarchy up to the base to figure out whether a function is virtual or not. On the other hand, sometimes, it is useful to make sure that a virtual function or even a derived class can no longer be overridden or derived further. In this recipe, we will see how to use C++11 special identifiers override and final to declare virtual functions or classes.

Getting ready

You should be familiar with inheritance and polymorphism in C++ and concepts, such as abstract classes, pure specifiers, virtual, and overridden methods.

How to do it...

To ensure correct declaration of virtual methods both in base and derived classes, but also increase readability, do the following:

  • Always use the virtual keyword when declaring virtual functions in derived classes that are supposed to override virtual functions from a base class, and
  • Always use the override special identifier after the declarator part of a virtual function declaration or definition.
        class Base 
        { 
          virtual void foo() = 0;
          virtual void bar() {} 
          virtual void foobar() = 0; 
        };

        void Base::foobar() {}

        class Derived1 : public Base 
        { 
          virtual void foo() override = 0;
          virtual void bar() override {}
          virtual void foobar() override {} 
        }; 

        class Derived2 : public Derived1 
        { 
          virtual void foo() override {} 
        };

Note

The declarator is the part of the type of a function that excludes the return type.

To ensure that functions cannot be overridden further or classes cannot be derived any more, use the final special identifier:

  • After the declarator part of a virtual function declaration or definition to prevent further overrides in a derived class:
        class Derived2 : public Derived1 
        { 
          virtual void foo() final {} 
        };
  • After the name of a class in the declaration of the class to prevent further derivations of the class:
        class Derived4 final : public Derived1 
        { 
          virtual void foo() override {} 
        };

How it works...

The way override works is very simple; in a virtual function declaration or definition, it ensures that the function is actually overriding a base class function, otherwise, the compiler will trigger an error.

It should be noted that both override and final keywords are special identifiers having a meaning only in a member function declaration or definition. They are not reserved keywords and can still be used elsewhere in a program as user-defined identifiers.

Using the override special identifier helps the compiler to detect situations when a virtual method does not override another one like shown in the following example:

    class Base 
    { 
    public: 
      virtual void foo() {}
      virtual void bar() {}
    }; 

    class Derived1 : public Base 
    { 
    public:    
      void foo() override {}
      // for readability use the virtual keyword    

      virtual void bar(char const c) override {}
      // error, no Base::bar(char const) 
    };

The other special identifier, final, is used in a member function declaration or definition to indicate that the function is virtual and cannot be overridden in a derived class. If a derived class attempts to override the virtual function, the compiler triggers an error:

    class Derived2 : public Derived1 
    { 
      virtual void foo() final {} 
    }; 

    class Derived3 : public Derived2 
    { 
      virtual void foo() override {} // error 
    };

The final specifier can also be used in a class declaration to indicate that it cannot be derived:

    class Derived4 final : public Derived1 
    { 
      virtual void foo() override {} 
    };

    class Derived5 : public Derived4 // error 
    { 
    };

Since both override and final have this special meaning when used in the defined context and are not in fact reserved keywords, you can still use them anywhere elsewhere in the C++ code. This ensured that existing code written before C++11 did not break because of the use of these names for identifiers:

    class foo 
    { 
      int final = 0; 
      void override() {} 
    };

 

 

Using range-based for loops to iterate on a range


Many programming languages support a variant of a for loop called for each, that is, repeating a group of statements over the elements of a collection. C++ did not have core language support for this until C++11. The closest feature was the general purpose algorithm from the standard library called std::for_each, that applies a function to all the elements in a range. C++11 brought language support for for each that is actually called range-based for loops. The new C++17 standard provides several improvements to the original language feature.

Getting ready

In C++11, a range-based for loop has the following general syntax:

    for ( range_declaration : range_expression ) loop_statement

To exemplify the various ways of using a range-based for loops, we will use the following functions that return sequences of elements:

    std::vector<int> getRates() 
    { 
      return std::vector<int> {1, 1, 2, 3, 5, 8, 13}; 
    } 

    std::multimap<int, bool> getRates2() 
    { 
      return std::multimap<int, bool> { 
        { 1, true }, 
        { 1, true }, 
        { 2, false }, 
        { 3, true }, 
        { 5, true }, 
        { 8, false }, 
        { 13, true } 
      }; 
    }

 

 

How to do it...

Range-based for loops can be used in various ways:

  • By committing to a specific type for the elements of the sequence:
        auto rates = getRates();
        for (int rate : rates) 
          std::cout << rate << std::endl; 
        for (int& rate : rates) 
          rate *= 2;
  • By not specifying a type and letting the compiler deduce it:
        for (auto&& rate : getRates()) 
          std::cout << rate << std::endl; 

        for (auto & rate : rates) 
          rate *= 2; 

        for (auto const & rate : rates) 
          std::cout << rate << std::endl;
  • By using structured bindings and decomposition declaration in C++17:
        for (auto&& [rate, flag] : getRates2()) 
          std::cout << rate << std::endl;

How it works...

The expression for the range-based for loops shown earlier in the How to do it... section is basically syntactic sugar as the compiler transforms it into something else. Before C++17, the code generated by the compiler used to be the following:

    { 
      auto && __range = range_expression; 
      for (auto __begin = begin_expr, __end = end_expr; 
      __begin != __end; ++__begin) { 
        range_declaration = *__begin; 
        loop_statement 
      } 
    }

What begin_expr and end_expr are in this code depends on the type of the range:

  • For C-like arrays, __range and __bound are the number of elements in the array.
  • For a class type with begin() and end() members (regardless of their type and accessibility): __range.begin() and __range.end().
  • For others it is begin(__range) and end(__range) that are determined via argument dependent lookup.

It is important to note that if a class contains any members (function, data member, or enumerators) called begin or end, regardless of their type and accessibility, they will be picked for begin_expr and end_expr. Therefore, such a class type cannot be used in range-based for loops.

In C++17, the code generated by the compiler is slightly different:

    { 
      auto && __range = range_expression; 
      auto __begin = begin_expr; 
      auto __end = end_expr; 
      for (; __begin != __end; ++__begin) { 
        range_declaration = *__begin; 
        loop_statement 
      } 
    }

The new standard has removed the constraint that the begin expression and end expression must have the same type. The end expression does not need to be an actual iterator, but it has to be able to be compared for inequality with an iterator. A benefit of this is that the range can be delimited by a predicate.

See also

  • Enabling range-based for loops for custom types

 

 

Enabling range-based for loops for custom types


As we have seen in the preceding recipe, the range-based for loops, known as for each in other programming languages, allows you to iterate over the elements of a range, providing a simplified syntax over the standard for loops and making the code more readable in many situations. However, range-based for loops do not work out of the box with any type representing a range, but require the presence of a begin() and end() function (for non-array types) either as a member or free function. In this recipe, we will see how to enable a custom type to be used in range-based for loops.

Getting ready

It is recommended that you read the recipe Using range-based for loops to iterate on a range before continuing with this one if you need to understand how range-based for loops work and what is the code the compiler generates for such a loop.

To show how we can enable range-based for loops for custom types representing sequences, we will use the following implementation of a simple array:

    template <typename T, size_t const Size> 
    class dummy_array 
    { 
      T data[Size] = {}; 

    public: 
      T const & GetAt(size_t const index) const 
      { 
        if (index < Size) return data[index]; 
        throw std::out_of_range("index out of range"); 
      } 

      void SetAt(size_t const index, T const & value) 
      { 
        if (index < Size) data[index] = value; 
        else throw std::out_of_range("index out of range"); 
      } 

      size_t GetSize() const { return Size; } 
    };

The purpose of this recipe is to enable writing code like the following:

    dummy_array<int, 3> arr; 
    arr.SetAt(0, 1); 
    arr.SetAt(1, 2); 
    arr.SetAt(2, 3); 

    for(auto&& e : arr) 
    {  
      std::cout << e << std::endl; 
    }

How to do it...

To enable a custom type to be used in range-based for loops, you need to do the following:

  • Create mutable and constant iterators for the type that must implement the following operators:
    • operator++ for incrementing the iterator.
    • operator* for dereferencing the iterator and accessing the actual element pointed by the iterator.
    • operator!= for comparing with another iterator for inequality.
  • Provide free begin() and end() functions for the type.

Given the earlier example of a simple range, we need to provide the following:

  1. The following minimal implementation of an iterator class:
        template <typename T, typename C, size_t const Size> 
        class dummy_array_iterator_type 
        { 
        public: 
          dummy_array_iterator_type(C& collection,  
                                    size_t const index) : 
          index(index), collection(collection) 
          { } 

        bool operator!= (dummy_array_iterator_type const & other) const 
        { 
          return index != other.index; 
        } 

        T const & operator* () const 
        { 
          return collection.GetAt(index); 
        } 

        dummy_array_iterator_type const & operator++ () 
        { 
          ++index; 
          return *this; 
        } 

        private: 
          size_t   index; 
          C&       collection; 
        };
  1. Alias templates for mutable and constant iterators:
        template <typename T, size_t const Size> 
        using dummy_array_iterator =  
           dummy_array_iterator_type< 
             T, dummy_array<T, Size>, Size>; 

        template <typename T, size_t const Size> 
        using dummy_array_const_iterator =  
           dummy_array_iterator_type< 
             T, dummy_array<T, Size> const, Size>;
  1. Free begin() and end() functions that return the corresponding begin and end iterators, with overloads for both alias templates:
        template <typename T, size_t const Size> 
        inline dummy_array_iterator<T, Size> begin(
          dummy_array<T, Size>& collection) 
        { 
          return dummy_array_iterator<T, Size>(collection, 0); 
        } 

        template <typename T, size_t const Size> 
        inline dummy_array_iterator<T, Size> end(
          dummy_array<T, Size>& collection) 
        { 
          return dummy_array_iterator<T, Size>(
            collection, collection.GetSize()); 
        } 

        template <typename T, size_t const Size> 
        inline dummy_array_const_iterator<T, Size> begin( 
          dummy_array<T, Size> const & collection) 
        { 
          return dummy_array_const_iterator<T, Size>( 
            collection, 0); 
        } 

        template <typename T, size_t const Size> 
        inline dummy_array_const_iterator<T, Size> end( 
          dummy_array<T, Size> const & collection) 
        { 
          return dummy_array_const_iterator<T, Size>( 
            collection, collection.GetSize()); 
        }

How it works...

Having this implementation available, the range-based for loop shown earlier compiles and executes as expected. When performing argument dependent lookup, the compiler will identify the two begin() and end() functions that we wrote (that take a reference to a dummy_array) and therefore the code it generates becomes valid.

In the preceding example, we have defined one iterator class template and two alias templates, called dummy_array_iterator and dummy_array_const_iterator. The begin() and end() functions both have two overloads for these two types of iterators. This is necessary so that the container we have considered could be used in range-based for loops with both constant and non-constant instances:

    template <typename T, const size_t Size> 
    void print_dummy_array(dummy_array<T, Size> const & arr) 
    { 
      for (auto && e : arr) 
      { 
        std::cout << e << std::endl; 
      } 
    }

A possible alternative to enable range-based for loops for the simple range class we considered for this recipe is to provide member begin() and end() functions. In general, that could make sense only if you own and can modify the source code. On the other hand, the solution shown in this recipe works in all cases and should be preferred to other alternatives.

See also

  • Creating type aliases and alias templates
 

Using explicit constructors and conversion operators to avoid implicit conversion


Before C++11, a constructor with a single parameter was considered a converting constructor. With C++11, every constructor without the explicit specifier is considered a converting constructor. Such a constructor defines an implicit conversion from the type or types of its arguments to the type of the class. Classes can also define converting operators that convert the type of the class to another specified type. All these are useful in some cases, but can create problems in other cases. In this recipe, we will see how to use explicit constructors and conversion operators.

Getting ready

For this recipe, you need to be familiar with converting constructors and converting operators. In this recipe, you will learn how to write explicit constructors and conversion operators to avoid implicit conversions to and from a type. The use of explicit constructors and conversion operators (called user-defined conversion functions) enables the compiler to yield errors--that in some cases are coding errors--and allow developers to spot those errors quickly and fix them.

How to do it...

To declare explicit constructors and conversion operators (regardless of whether they are functions or function templates), use the explicit specifier in the declaration.

The following example shows both an explicit constructor and a converting operator:

    struct handle_t 
    { 
      explicit handle_t(int const h) : handle(h) {} 

      explicit operator bool() const { return handle != 0; }; 
    private: 
      int handle; 
    };

How it works...

To understand why explicit constructors are necessary and how they work, we will first look at converting constructors. The following class has three constructors: a default constructor (without parameters), a constructor that takes an int, and a constructor that takes two parameters, an int and a double. They don't do anything, except printing a message. As of C++11, these are all considered converting constructors. The class also has a conversion operator that converts the type to a bool:

    struct foo 
    { 
      foo()
      { std::cout << "foo" << std::endl; }
      foo(int const a)
      { std::cout << "foo(a)" << std::endl; }
      foo(int const a, double const b)
      { std::cout << "foo(a, b)" << std::endl; } 

      operator bool() const { return true; } 
    };

Based on this, the following definitions of objects are possible (note that the comments represent the console output):

    foo f1;              // foo
    foo f2 {};           // foo

    foo f3(1);           // foo(a)
    foo f4 = 1;          // foo(a)
    foo f5 { 1 };        // foo(a)
    foo f6 = { 1 };      // foo(a)

    foo f7(1, 2.0);      // foo(a, b)
    foo f8 { 1, 2.0 };   // foo(a, b)
    foo f9 = { 1, 2.0 }; // foo(a, b)

f1 and f2 invoke the default constructor. f3, f4, f5, and f6 invoke the constructor that takes an int. Note that all the definitions of these objects are equivalent, even if they look different (f3 is initialized using the functional form, f4 and f6 are copy initialized, and f5 is directly initialized using brace-init-list). Similarly, f7, f8, and f9 invoke the constructor with two parameters.

It may be important to note that if foo defines a constructor that takes an std::initializer_list, then all the initializations using {} would resolve to that constructor:

    foo(std::initializer_list<int> l)  
    { std::cout << "foo(l)" << std::endl; }

In this case, f5 and f6 will print foo(l), while f8 and f9 will generate compiler errors because all elements of the initializer list should be integers.

These may all look right, but the implicit conversion constructors enable scenarios where the implicit conversion may not be what we wanted:

    void bar(foo const f) 
    { 
    } 

    bar({});             // foo()
    bar(1);              // foo(a)
    bar({ 1, 2.0 });     // foo(a, b)

The conversion operator to bool in the example above also enables us to use foo objects where boolean values are expected:

    bool flag = f1; 
    if(f2) {} 
    std::cout << f3 + f4 << std::endl; 
    if(f5 == f6) {}

The first two are examples where foo is expected to be used as boolean but the last two with addition and test for equality are probably incorrect, as we most likely expect to add foo objects and test foo objects for equality, not the booleans they implicitly convert to.

Perhaps a more realistic example to understand where problems could arise would be to consider a string buffer implementation. This would be a class that contains an internal buffer of characters. The class may provide several conversion constructors: a default constructor, a constructor that takes a size_t parameter representing the size of the buffer to preallocate, and a constructor that takes a pointer to char that should be used to allocate and initialize the internal buffer. Succinctly, such a string buffer could look like this:

    class string_buffer 
    { 
    public: 
      string_buffer() {} 

      string_buffer(size_t const size) {} 

      string_buffer(char const * const ptr) {} 

      size_t size() const { return ...; } 
      operator bool() const { return ...; } 
      operator char * const () const { return ...; } 
    };

Based on this definition, we could construct the following objects:

    std::shared_ptr<char> str; 
    string_buffer sb1;             // empty buffer 
    string_buffer sb2(20);         // buffer of 20 characters 
    string_buffer sb3(str.get());   
    // buffer initialized from input parameter

sb1 is created using the default constructor and thus has an empty buffer; sb2 is initialized using the constructor with a single parameter and the value of the parameter represents the size in characters of the internal buffer; sb3 is initialized with an existing buffer and that is used to define the size of the internal buffer and to copy its value into the internal buffer. However, the same definition also enables the following object definitions:

    enum ItemSizes {DefaultHeight, Large, MaxSize}; 

    string_buffer b4 = 'a'; 
    string_buffer b5 = MaxSize;

In this case, b4 is initialized with a char. Since an implicit conversion to size_t exists, the constructor with a single parameter will be called. The intention here is not necessarily clear; perhaps it should have been "a" instead of 'a', in which case the third constructor would have been called. However, b5 is most likely an error, because MaxSize is an enumerator representing an ItemSizes and should have nothing to do with a string buffer size. These erroneous situations are not flagged by the compiler in any way.

Using the explicit specifier in the declaration of a constructor, that constructor becomes an explicit constructor and no longer allows implicit constructions of objects of a class type. To exemplify this, we will slightly change the string_buffer class earlier to declare all constructors explicit:

    class string_buffer 
    { 
    public: 
      explicit string_buffer() {} 

      explicit string_buffer(size_t const size) {} 

      explicit string_buffer(char const * const ptr) {} 

      explicit operator bool() const { return ...; } 
      explicit operator char * const () const { return ...; } 
    };

The change is minimal, but the definitions of b4 and b5 in the earlier example no longer work, and are incorrect, since the implicit conversion from char or int to size_t are no longer available during overload resolution to figure out what constructor should be called. The result is compiler errors for both b4 and b5. Note that b1, b2, and b3 are still valid definitions even if the constructors are explicit.

The only way to fix the problem, in this case, is to provide an explicit cast from char or int to string_buffer:

    string_buffer b4 = string_buffer('a'); 
    string_buffer b5 = static_cast<string_buffer>(MaxSize); 
    string_buffer b6 = string_buffer{ "a" };

With explicit constructors, the compiler is able to immediately flag erroneous situations and developers can react accordingly, either fixing the initialization with a correct value or providing an explicit cast.

Note

This is only the case when initialization is done with copy initialization and not when using the functional or universal initialization.

The following definitions are still possible (and wrong) with explicit constructors:

    string_buffer b7{ 'a' }; 
    string_buffer b8('a');

Similar to constructors, conversion operators can be declared explicit (as shown earlier). In this case, the implicit conversions from the object type to the type specified by the conversion operator are no longer possible and require an explicit cast. Considering b1 and b2, the string_buffer objects defined earlier, the following are no longer possible with explicit conversion operator bool:

    std::cout << b1 + b2 << std::endl; 
    if(b1 == b2) {}

Instead, they require explicit conversion to bool:

    std::cout << static_cast<bool>(b1) + static_cast<bool>(b2);
    if(static_cast<bool>(b1) == static_cast<bool>(b2)) {}

See also

  • Understanding uniform initialization

 

 

Using unnamed namespaces instead of static globals


The larger a program the greater the chances are you could run into name collisions with file locals when your program is linked. Functions or variables that are declared in a source file and are supposed to be local to the translation unit may collide with other similar functions or variables declared in another translation unit. That is because all symbols that are not declared static have external linkage and their names must be unique throughout the program. The typical C solution for this problem is to declare those symbols static, changing their linkage from external to internal and therefore making them local to a translation unit. In this recipe, we will look at the C++ solution for this problem.

Getting ready

In this recipe, we will discuss concepts such as global functions, static functions, and variables, namespaces, and translation units. Apart from these, it is required that you understand the difference between internal and external linkage; that is key for this recipe.

How to do it...

When you are in a situation where you need to declare global symbols as statics to avoid linkage problems, prefer to use unnamed namespaces:

  1. Declare a namespace without a name in your source file.
  2. Put the definition of the global function or variable in the unnamed namespace without making them static.

The following example shows two functions called print() in two different translation units; each of them is defined in an unnamed namespace:

    // file1.cpp 
    namespace 
    { 
      void print(std::string message) 
      { 
        std::cout << "[file1] " << message << std::endl; 
      } 
    } 

    void file1_run() 
    { 
      print("run"); 
    } 

    // file2.cpp 
    namespace 
    { 
      void print(std::string message) 
      { 
        std::cout << "[file2] " << message << std::endl; 
      } 
    } 

    void file2_run() 
    { 
      print("run"); 
    }

How it works...

When a function is declared in a translation unit, it has external linkage. That means two functions with the same name from two different translation units would generate a linkage error because it is not possible to have two symbols with the same name. The way this problem is solved in C, and by some in C++ also, is to declare the function or variable static and change its linkage from external to internal. In this case, its name is no longer exported outside the translation unit, and the linkage problem is avoided.

The proper solution in C++ is to use unnamed namespaces. When you define a namespace like the ones shown above, the compiler transforms it to the following:

    // file1.cpp 
    namespace _unique_name_ {} 
    using namespace _unique_name_; 
    namespace _unique_name_ 
    { 
      void print(std::string message) 
      { 
        std::cout << "[file1] " << message << std::endl; 
      } 
    } 

    void file1_run() 
    { 
      print("run"); 
    }

First of all, it declares a namespace with a unique name (what the name is and how it generates that name is a compiler implementation detail and should not be a concern). At this point, the namespace is empty, and the purpose of this line is to basically establish the namespace. Second, a using directive brings everything from the _unique_name_ namespace into the current namespace. Third, the namespace, with the compiler-generated name, is defined as it was in the original source code (when it had no name).

By defining the translation unit local print() functions in an unnamed namespace, they have local visibility only, yet their external linkage no longer produces linkage errors since they now have external unique names.

Unnamed namespaces are also working in a perhaps more obscure situation involving templates. Template arguments cannot be names with internal linkage, so using static variables is not possible. On the other hand, symbols in an unnamed namespace have external linkage and can be used as template arguments. This problem is shown in the following example where declaring t1 produces a compiler error because the non-type argument expression has internal linkage. However, t2 is correct because Size2 has external linkage:

    template <int const& Size> 
    class test {}; 

    static int Size1 = 10; 

    namespace 
    { 
      int Size2 = 10; 
    } 

    test<Size1> t1; 
    test<Size2> t2;

See also

  • Using inline namespaces for symbol versioning

 

 

Using inline namespaces for symbol versioning


The C++11 standard has introduced a new type of namespace called inline namespaces that are basically a mechanism that makes declarations from a nested namespace look and act like they were part of the surrounding namespace. Inline namespaces are declared using the inline keyword in the namespace declaration (unnamed namespaces can also be inlined). This is a helpful feature for library versioning, and in this recipe, we will see how inline namespaces can be used for versioning symbols. From this recipe, you will learn how to version your source code using inline namespaces and conditional compilation.

Getting ready

In this recipe, we will discuss namespaces and nested namespaces, templates and template specializations, and conditional compilation using preprocessor macros. Familiarity with these concepts is required in order to proceed with the recipe.

How to do it...

To provide multiple versions of a library and let the user decide what version to use, do the following:

  • Define the content of the library inside a namespace.
  • Define each version of the library or parts of it inside an inner inline namespace.
  • Use preprocessor macros and #if directives to enable a particular version of the library.

The following example shows a library that has two versions that clients can use:

    namespace modernlib 
    { 
      #ifndef LIB_VERSION_2 
      inline namespace version_1 
      { 
        template<typename T> 
        int test(T value) { return 1; } 
      } 
      #endif 

      #ifdef LIB_VERSION_2 
      inline namespace version_2 
      { 
        template<typename T> 
        int test(T value) { return 2; } 
      } 
      #endif 
    }

How it works...

A member of an inline namespace is treated as if it was a member of the surrounding namespace. Such a member can be partially specialized, explicitly instantiated, or explicitly specialized. This is a transitive property, which means that if a namespace A contains an inline namespace B that contains an inline namespace C, then the members of C appear as they were members of both B and A and the members of B appear as they were members of A.

To better understand why inline namespaces are helpful, let's consider the case of developing a library that evolves over time from a first version to a second version (and further on). This library defines all its types and functions under a namespace called modernlib. In the first version, this library could look like this:

    namespace modernlib 
    { 
      template<typename T> 
      int test(T value) { return 1; } 
    }

A client of the library can make the following call and get back the value 1:

    auto x = modernlib::test(42);

However, the client might decide to specialize the template function test() as the following:

    struct foo { int a; }; 

    namespace modernlib 
    { 
      template<> 
      int test(foo value) { return value.a; } 
    } 

    auto y = modernlib::test(foo{ 42 });

In this case, the value of y is no longer 1, but 42 because the user-specialized function gets called.

Everything is working correctly so far, but as a library developer you decide to create a second version of the library, yet still ship both the first and the second version and let the user control what to use with a macro. In this second version, you provide a new implementation of the test() function that no longer returns 1, but 2. To be able to provide both the first and second implementations, you put them in nested namespaces called version_1 and version_2 and conditionally compile the library using preprocessor macros:

    namespace modernlib 
    { 
      namespace version_1 
      { 
        template<typename T> 
        int test(T value) { return 1; } 
      } 

      #ifndef LIB_VERSION_2 
      using namespace version_1; 
      #endif 

      namespace version_2  
      { 
        template<typename T> 
        int test(T value) { return 2; } 
      } 

      #ifdef LIB_VERSION_2 
      using namespace version_2; 
      #endif 
    }

Suddenly, the client code will break, regardless of whether it uses the first or second version of the library. That is because the test function is now inside a nested namespace, and the specialization for foo is done in the modernlib namespace, when it should actually be done in modernlib::version_1 or modernlib::version_2. This is because the specialization of a template is required to be done in the same namespace where the template was declared. In this case, the client needs to change the code like this:

    #define LIB_VERSION_2 

    #include "modernlib.h" 

    struct foo { int a; }; 

    namespace modernlib 
    { 
      namespace version_2  
      { 
        template<> 
        int test(foo value) { return value.a; } 
      } 
    }

This is a problem because the library leaks implementation details, and the client needs to be aware of those in order do the template specialization. These internal details are hidden with inline namespaces in the manner shown in the How to do it... section of this recipe. With that definition of the modernlib library, the client code with the specialization of the test() function in the modernlib namespace is no longer broken, because either version_1::test() or version_2::test() (depending on what version the client is actually using) acts as being part of the enclosing modernlib namespace when template specialization is done. The details of the implementation are now hidden to the client that only sees the surrounding namespace modernlib.

However, you should keep in mind that:

  • The namespace std is reserved for the standard and should never be inlined.
  • A namespace should not be defined inline if it was not inline in its first definition.

See also

  • Using unnamed namespaces instead of static globals
  • Conditionally compiling your source code recipe of Chapter 4, Preprocessor and Compilation

 

 

Using structured bindings to handle multi-return values


Returning multiple values from a function is something very common, yet there is no first-class solution in C++ to enable it directly. Developers have to choose between returning multiple values through reference parameters to a function, defining a structure to contain the multiple values or returning a std::pair or std::tuple. The first two use named variables that have the advantage that they clearly indicate the meaning of the return value, but have the disadvantage that they have to be explicitly defined. std::pair has its members called first and second, and std::tuple has unnamed members that can only be retrieved with a function call, but can be copied to named variables using std::tie(). None of these solutions is ideal.

C++17 extends the semantic use of std::tie() into a first-class core language feature that enables unpacking the values of a tuple to named variables. This feature is called structured bindings.

Getting ready

For this recipe, you should be familiar with the standard utility types std::pair and std::tuple and the utility function std::tie().

How to do it...

To return multiple values from a function using a compiler that supports C++17 you should do the following:

  1. Use an std::tuple for the return type.
        std::tuple<int, std::string, double> find() 
        { 
          return std::make_tuple(1, "marius", 1234.5); 
        }
  1. Use structured bindings to unpack the values of the tuple into named objects.
        auto [id, name, score] = find();

 

  1. Use decomposition declaration to bind the returned values to variables inside an if statement or switch statement.
        if (auto [id, name, score] = find(); score > 1000) 
        { 
          std::cout << name << std::endl; 
        }

How it works...

Structured bindings are a language feature that works just like std::tie(), except that we don't have to define named variables for each value that needs to be unpacked explicitly with std::tie(). With structured bindings, we define all named variables in a single definition using the auto specifier so that the compiler can infer the correct type for each variable.

To exemplify this, let's consider the case of inserting items in a std::map. The insert method returns a std::pair containing an iterator to the inserted element or the element that prevented the insertion, and a boolean indicating whether the insertion was successful or not. The following code is very explicit and the use of second or first->second makes the code harder to read because you need to constantly figure out what they represent:

    std::map<int, std::string> m; 

    auto result = m.insert({ 1, "one" }); 
    std::cout << "inserted = " << result.second << std::endl 
              << "value = " << result.first->second << std::endl;

The preceding code can be made more readable with the use of std::tie, that unpacks tuples into individual objects (and works with std::pair because std::tuple has a converting assignment from std::pair):

    std::map<int, std::string> m; 
    std::map<int, std::string>::iterator it; 
    bool inserted; 

    std::tie(it, inserted) = m.insert({ 1, "one" }); 
    std::cout << "inserted = " << inserted << std::endl 
              << "value = " << it->second << std::endl; 

    std::tie(it, inserted) = m.insert({ 1, "two" }); 
    std::cout << "inserted = " << inserted << std::endl 
              << "value = " << it->second << std::endl;

The code is not necessarily simpler because it requires defining in advance the objects that the pair is unpacked to. Similarly, the more elements the tuple has the more objects you need to define, but using named objects makes the code easier to read.

C++17 structured bindings elevate the unpacking of tuple elements into named objects to the rank of a language feature; it does not require the use of std::tie(), and objects are initialized when declared:

    std::map<int, std::string> m; 
    { 
      auto[it, inserted] = m.insert({ 1, "one" }); 
      std::cout << "inserted = " << inserted << std::endl 
                << "value = " << it->second << std::endl; 
    } 

    { 
      auto[it, inserted] = m.insert({ 1, "two" }); 
      std::cout << "inserted = " << inserted << std::endl 
                << "value = " << it->second << std::endl; 
    }

The use of multiple blocks in the above example is necessary because variables cannot be redeclared in the same block, and structured bindings imply a declaration using the auto specifier. Therefore, if you need to make multiple calls like in the example above and use structured bindings you must either use different variable names or multiple blocks as shown above. An alternative to that is to avoid structured bindings and use std::tie(), because it can be called multiple times with the same variables, therefore you only need to declare them once.

In C++17, it is also possible to declare variables in if and switch statements with the form if(init; condition) and switch(init; condition). This could be combined with structured bindings to produce simpler code. In the following example, we attempt to insert a new value into a map. The result of the call is unpacked into two variables, it and inserted, defined in the scope of the if statement in the initialization part. The condition of the if statement is evaluated from the value of the inserted object:

    if(auto [it, inserted] = m.insert({ 1, "two" }); inserted)
    { std::cout << it->second << std::endl; }

About the Author

  • Marius Bancila

    Marius Bancila is a software engineer with 15 years of experience in developing solutions for the industrial and financial sectors. He is the author of Modern C++ Programming Cookbook. He focuses on Microsoft technologies and mainly develops desktop applications with C++ and C#.

    Marius is passionate about sharing his technical expertise with others, and for that reason, he used to be recognized as a Microsoft MVP for more than a decade. Marius can be found on Twitter at @mariusbancila.

    Browse publications by this author

Latest Reviews

(20 reviews total)
Well written, well organize.
Comme j'ai mentionné plus haut, je n'ai pas vraiment eu le temps de bien évaluer le contenu de ce livre. Selon ce que j'ai pu voir déjà, c'est excellent. Pour un programmeur qui connait déjà C++ mais qui veut apprendre les éléments plus nouveaux du langage, ce "livre de recette" est exactement ce qu'il me faut. Mais comme j'ai mentionné plus haut, j'ai trouvé assez aberrant qu'une fois transférés sur ma tablette (Kobo), le livre était complètement illisible.
Price is very affordable! Concepts are clearly explained without having to read too much text and only simple codes are provided to point out the basic principles involved.

Recommended For You

C++17 STL Cookbook

Over 90 recipes that leverage the powerful features of the Standard Library in C++17

By Jacek Galowicz
Mastering C++ Multithreading

Master multithreading and concurrent processing with C++

By Maya Posch
Hands-On Design Patterns with C++

A comprehensive guide with extensive coverage on concepts such as OOP, functional programming, generic programming, and STL along with the latest features of C++

By Fedor G. Pikus
Boost C++ Application Development Cookbook - Second Edition

Learn to build applications faster and better by leveraging the real power of Boost and C++

By Antony Polukhin