Modern C++: Efficient and Scalable Application Development

By Richard Grimes, Marius Bancila

About this book

C++ is one of the most widely used programming languages. It is fast, flexible, and used to solve many programming problems.

This Learning Path gives you an in-depth and hands-on experience of working with C++, using the latest recipes and understanding most recent developments. You will explore C++ programming constructs by learning about language structures, functions, and classes, which will help you identify the execution flow through code. You will also understand the importance of the C++ standard library as well as memory allocation for writing better and faster programs.

Modern C++: Efficient and Scalable Application Development deals with the challenges faced with advanced C++ programming. You will work through advanced topics such as multithreading, networking, concurrency, lambda expressions, and many more recipes.

By the end of this Learning Path, you will have all the skills to become a master C++ programmer.

This Learning Path includes content from the following Packt products:

  • Beginning C++ Programming by Richard Grimes
  • Modern C++ Programming Cookbook by Marius Bancila
  • The Modern C++ Challenge by Marius Bancila

Publication date: December 2018
Publisher: Packt
Pages: 702
ISBN: 9781789951738

Chapter 1. Understanding Language Features

In this chapter, you will take a deep dive into the language features that control the flow of execution through your code.

Writing C++


C++ is a very flexible language when it comes to formatting and writing code. It is also a strongly typed language, meaning there are rules about declaring the types of variables, which you can use to your advantage by making the compiler help you write better code. In this section, we will cover how to format C++ code and rules on declaring and scoping variables.

Using whitespace

Other than string literals, you have free usage of white space (spaces, tabs, newlines), and are able to use as much or as little as you like. C++ statements are delimited by semicolons, so in the following code there are three statements, which will compile and run:

    int i = 4; 
    i = i / 2; 
    std::cout << "The result is " << i << std::endl;

The entire code could be written as follows:

    int i=4;i=i/2; std::cout<<"The result is "<<i<<std::endl;

There are some cases where whitespace is needed (for example, when declaring a variable you must have white space between the type and the variable name), but the convention is to be as judicious as possible to make the code readable. And while it is perfectly correct, language-wise, to put all the statements on one line (like JavaScript), it makes the code almost completely unreadable.

Note

If you are interested in some of the more creative ways of making code unreadable, have a look at the entries for the annual International Obfuscated C Code Contest (http://www.ioccc.org/). Since C is the progenitor of C++, many of the lessons in C shown at IOCCC apply to C++ code too.

Bear in mind that, if the code you write is viable, it may be in use for decades. You may have to come back to the code years after you wrote it, and other people will have to support it too. Making your code readable is not only a courtesy to other developers; unreadable code is also a likely target for replacement.

Formatting Code

Inevitably, whoever you are writing code for will dictate how you format code. Sometimes it makes sense, for example, if you use some form of preprocessing to extract code and definitions to create documentation for the code. In many cases, the style that is imposed on you is the personal preference of someone else.

Note

Visual C++ allows you to place XML comments in your code. To do this you use a three-slash comment (///) and then compile the source file with the /doc switch. This creates an intermediate XML file called an xdc file with a <doc> root element and containing all the three-slash comments. The Visual C++ documentation defines standard XML tags (for example, <param> and <returns>) to document the parameters and return value of a function. The intermediate file is compiled to the final document XML file with the xdcmake utility.

There are two broad styles in C++: K&R and Allman.

Kernighan and Ritchie (K&R) wrote the first, and most influential, book about C (Dennis Ritchie was the author of the C language). The term K&R style describes the formatting used in that book. In general, K&R places the opening brace of a code block on the same line as the statement that introduces it. If your code has nested statements (and typically, it will) then this style can get a bit confusing:

    if (/* some test */) { 
        // the test is true  
        if (/* some other test */) { 
            // second test is true  
        } else { 
            // second test is false    
        } 
    } else { 
        // the test is false  
    }

This style is typically used in Unix (and Unix-like) code.

The Allman style (named after the developer Eric Allman) places the opening brace on a new line, so the nested example looks as follows:

        if (/* some test */)  
        { 
            // the test is true  
            if (/* some other test */)  
            { 
                // second test is true   
            }  
            else  
            { 
                // second test is false     
            } 
        }  
        else  
        { 
           // the test is false  
        }

The Allman style is typically used by Microsoft.

Remember that your code is unlikely to be presented on paper, so the fact that K&R is more compact will save no trees. If you have the choice, you should choose the style that is the most readable; the decision of this author, for this book, is that Allman is more readable.

If you have multiple nested blocks, the indents can give you an idea of which block the code resides in. However, comments can help. In particular, if a code block has a large amount of code, it is often helpful to comment the reason for the code block. For example, in an if statement, it is helpful to put the result of the test in the code block so you know what the variable values are in that block. It is also useful to put a comment on the closing brace of the test:

    if (x < 0)  
    { 
       // x < 0 
       /* lots of code */ 
    }  // if (x < 0) 

    else  
    { 
       // x >= 0 
       /* lots of code */ 
    }  // if (x < 0)

If you put the test as a comment on a closing brace, it means that you have a search term that you can use to find the test that resulted in the code block. The preceding lines make this commenting redundant, but when you have code blocks with many tens of lines of code, and with many levels of nesting, comments like this can be very helpful.

Writing Statements

A statement can be a declaration of a variable, an expression that evaluates to a value, or it can be a definition of a type. A statement may also be a control structure to affect the flow of the execution through your code.

A statement ends with a semicolon. Other than that, there are few rules about how to format statements. You can even use a semicolon on its own, and this is called a null statement. A null statement does nothing, so having too many semicolons is usually benign.

Working with Expressions

An expression is a sequence of operators and operands (variables or literals) that results in some value. Consider the following:

    int i; 
    i = 6 * 7;

On the right side 6 * 7 is an expression, and the assignment (from i on the left-hand side to the semicolon on the right) is a statement.

Every expression is either an lvalue or an rvalue. You are most likely to see these terms used in error descriptions. In effect, an lvalue is an expression that refers to some memory location. Items on the left-hand side of an assignment must be lvalues. However, an lvalue can appear on the left- or right-hand side of an assignment. All variables are lvalues. An rvalue is a temporary item that does not exist longer than the expression that uses it; it will have a value, but cannot have a value assigned to it, so it can only exist on the right-hand side of an assignment. Literals are rvalues. The following shows a simple example of lvalues and rvalues:

    int i; 
    i = 6 * 7;

In the second line, i is an lvalue, and the expression 6 * 7 results in an rvalue (42). The following will not compile because there is an rvalue on the left:

    6 * 7 = i;

Broadly speaking, an expression becomes a statement when you append a semicolon. For example, the following are both statements:

    42;
    std::sqrt(2);

The first line is an rvalue of 42, but since it is temporary it has no effect. A C++ compiler will optimize it away. The second line calls the standard library function to calculate the square root of 2. Again, the result is an rvalue and the value is not used, so the compiler will optimize this away. However, it illustrates that a function can be called without using its return value. Although it is not the case with std::sqrt, many functions have a lasting effect other than their return value. Indeed, the whole point of a function is usually to do something, and the return value is often used merely to indicate if the function was successful; often developers assume that a function will succeed and ignore the return value.

Using the Comma Operator

Operators will be covered later in this chapter; however, it is useful to introduce the comma operator here. You can have a sequence of expressions separated by a comma as a single statement. For example, the following code is legal in C++:

    int a = 9;
    int b = 4;
    int c;
    c = a + 8, b + 1;

The writer intended to type c = a + 8 / b + 1; but pressed the comma key instead of /. The intention was for c to be assigned 9 + 2 + 1, or 12. This code will compile and run, and the variable c will be assigned a value of 17 (a + 8). The reason is that the comma separates the statement into two expressions, and only the first one, which contains a + 8, is used to assign c. Later in this chapter, we will look at operator precedence. However, it is worth saying here that the comma has the lowest precedence and + has a higher precedence than =, so the statement is executed in this order: the addition, the assignment, and then the comma operator (with the result of b + 1 thrown away).

You can change the precedence using parentheses to group expressions. For example, the mistyped code could have been as follows:

    c = (a + 8, b + 1);

The result of this statement is that variable c is assigned 5 (b + 1). The reason is that with the comma operator expressions are evaluated from left to right, so the value of the group of expressions is the right-most one. There are some cases, for example, in the initialization or loop expression of a for loop, where you will find the comma operator useful, but as you can see here, even used intentionally, the comma operator produces hard-to-read code.

Using Types and Variables

It is useful to give basic information here. C++ is a strongly typed language, which means that you have to declare the type of the variables that you use. The reason for this is that the compiler needs to know how much memory to allocate for the variable, and it can determine this by the type of the variable. In addition, the compiler needs to know how to initialize a variable, if it has not been explicitly initialized, and to perform this initialization the compiler needs to know the type of the variable.

Note

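C++11 provides the auto keyword, which lets the compiler deduce the type of a variable from its initializer so that you do not have to write the type yourself. The variable still has a single, fixed type, and the type checking of the compiler is so important that you should take advantage of it as much as possible.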

C++ variables can be declared anywhere in your code as long as they are declared before they are used. Where you declare a variable determines how you use it (this is called the scope of the variable). In general, it is best to declare the variable as close as possible to where you will use it, and within the most restrictive scope. This prevents name clashes, where you will have to add additional information to disambiguate two or more variables.

You may, and should, give your variables descriptive names. This makes your code much more readable and easier to understand. C++ names must start with an alphabetic character or an underscore. After that, they can contain alphanumeric characters and underscores, but not spaces. So, the following are valid names:

    numberOfCustomers 
    NumberOfCustomers 
    number_of_customers

C++ names are case-sensitive, and in Visual C++ the first 2,048 characters are significant. You can start a variable name with an underscore, but you cannot use two underscores, nor can you use an underscore followed by a capital letter (these are reserved by C++). C++ also reserves keywords (for example, while and if), and clearly you cannot use type names as variable names, neither built-in type names (int, long, and so on) nor your own custom types.

You declare a variable in a statement, ending with a semicolon. The basic syntax of declaring a variable is that you specify the type, then the name, and, optionally, any initialization of the variable.

Built-in types must be initialized before you use them:

    int i; 
    i++;           // C4700 uninitialized local variable 'i' used 
    std::cout << i;

There are essentially three ways to initialize variables. You can assign a value, you can call the type constructor (constructors for classes will be defined in Chapter 4, Classes) or you can initialize a variable using function syntax:

    int i = 1; 
    int j = int(2); 
    int k(3);

These three are all legal C++, but stylistically the first is the best because it is the most obvious: the variable is an integer, it is called i, and it is assigned a value of 1. The third looks confusing; it looks like the declaration of a function when it is actually declaring a variable.

Chapter 4, Classes will cover classes, your own custom types. A custom type may be defined to have a default value, which means that you may decide not to initialize a variable of a custom type before using it. However, this will result in poorer performance, because the compiler will initialize the variable with the default value and subsequently your code will assign a value, resulting in an assignment being performed twice.

Using constants and literals

Each type will have a literal representation. An integer will be a numeric represented without a decimal point and, if it is a signed integer, the literal can also use the plus or minus symbol to indicate the sign. Similarly, a real number can have a literal value that contains a decimal point, and you may even use the scientific (or engineering) format including an exponent. C++ has various rules to use when specifying literals in code. Some examples of literals are shown here:

    int pos = +1; 
    int neg = -1; 
    double micro = 1e-6; 
    double unit = 1.; 
    std::string name = "Richard";

Note that for the unit variable, the compiler knows that the literal is a real number because the value has a decimal point. For integers, you can provide a hexadecimal literal in your code by prefixing the number with 0x, so 0x100 is 256 in decimal. By default, the output stream will print numeric values in base 10; however, you can insert a manipulator into an output stream to tell it to use a different number base. The default behavior is std::dec, which means the numbers should be displayed as base 10, std::oct means display as octal (base 8), and std::hex means display as hexadecimal (base 16). If you prefer to see the prefix printed, then you use the stream manipulator std::showbase (more details will be given in Chapter 5, Using the Standard Library Containers).

C++ defines some literals. For bool, the logic type, there are true and false constants, where false is zero and true is 1. There is also the nullptr constant, again, zero, which is used as an invalid value for any pointer type.

Defining constants

In some cases, you will want to provide constant values that can be used throughout your code. For example, you may decide to declare a constant for π. You should not allow this value to be changed because it will change the underlying logic in your code. This means that you should mark the variable as being constant. When you do this, the compiler will check the use of the variable and if it is used in code that changes the value of the variable the compiler will issue an error:

    const double pi = 3.1415; 
    double radius = 5.0; 
    double circumference = 2 * pi * radius;

In this case the symbol pi is declared as being constant, so it cannot change. If you subsequently decide to change the constant, the compiler will issue an error:

    // add more precision, generates error C3892 
    pi += 0.00009265359;

Once you have declared a constant, you can be assured that the compiler will make sure it remains so. You can assign a constant with an expression as follows:

    #include <cmath> 
    const double sqrtOf2 = std::sqrt(2);

In this code, a global constant called sqrtOf2 is declared and assigned with a value using the std::sqrt function. Since this constant is declared outside a function, it is global to the file and can be used throughout the file.

Another way to define a constant is to use a preprocessor #define symbol. The problem with this approach is that the preprocessor does a simple text replacement. With constants declared with const, the C++ compiler will perform type checking to ensure that the constant is being used appropriately.

You can also use const to declare a constant that will be used as a constant expression. For example, you can declare an array using the square bracket syntax (more details will be given in Chapter 2, Working with Memory, Arrays, and Pointers):

    int values[5];

This declares an array of five integers on the stack and these items are accessed through the values array variable. The 5 here is a constant expression. When you declare an array on the stack, you have to provide the compiler with a constant expression so it knows how much memory to allocate and this means the size of the array must be known at compile time. (You can allocate an array with a size known only at runtime, but this requires dynamic memory allocation, explained in Chapter 2, Working with Memory, Arrays, and Pointers.) In C++, you can declare a constant to do the following:

    const int size = 5;  
    int values[size];

Elsewhere in your code, when you access the values array, you can use the size constant to make sure that you do not access items past the end of the array. Since the size variable is declared in just one place, if you need to change the size of the array at a later stage, you have just one place to make this change. The const keyword can also be used on pointers and references (see Chapter 2, Working with Memory, Arrays, and Pointers) and on objects (see Chapter 4, Classes); often, you'll see it used on parameters to functions (see Chapter 3, Using Functions). This is used to get the compiler to help ensure that pointers, references, and objects are used appropriately, as you intended.

Using Constant Expressions

C++11 introduces a keyword called constexpr. This is applied to an expression, and indicates that the expression should be evaluated at compile time rather than at runtime:

    constexpr double pi = 3.1415; 
    constexpr double twopi = 2 * pi;

This is similar to initializing a constant declared with the const keyword. However, the constexpr keyword can also be applied to functions that return a value that can be evaluated at compile time, and so this allows the compiler to optimize the code:

    constexpr int triang(int i) 
    { 
       return (i == 0) ? 0 : triang(i - 1) + i;
    }

In this example, the function triang calculates triangular numbers recursively. The code uses the conditional operator. In the parentheses, the function parameter is tested to see if it is zero, and if so the function returns zero, in effect ending the recursion and returning the function to the original caller. If the parameter is not zero, then the return value is the sum of the parameter and the return value of triang called with the parameter decremented.

This function, when called with a literal in your code, can be evaluated at compile time. The constexpr is an indication to the compiler to check the usage of the function to see if it can determine the parameter at compile time. If this is the case, the compiler can evaluate the return value and produce code more efficiently than by calling the function at runtime. If the compiler cannot determine the parameter at compile time, the function will be called as normal. In C++11, the body of a function marked with the constexpr keyword must consist of a single return statement (hence the use of the conditional operator ?: in the triang function); C++14 relaxes this restriction.

Using Enumerations

A final way to provide constants is to use an enum variable. In effect, an enum is a group of named constants, which means that you can use an enum as a parameter to a function. For example:

    enum suits {clubs, diamonds, hearts, spades};

This defines an enumeration called suits, with named values for the suits in a deck of cards. An enumeration is an integer type and by default the compiler will assume an int, but you can change this by specifying the integer type in the declaration. Since there are just four possible values for card suits, it is a waste of memory to use int (usually 4 bytes) and instead, we can use char (a single byte):

    enum suits : char {clubs, diamonds, hearts, spades};

When you use an enumerated value, you can use just the name; however, it is usual to scope it with the name of the enumeration, making the code more readable:

    suits card1 = diamonds; 
    suits card2 = suits::diamonds;

Both forms are allowed, but the latter makes it more explicit that the value is taken from an enumeration. To force developers to specify the scope, you can apply the keyword class:

    enum class suits : char {clubs, diamonds, hearts, spades};

With this definition and the preceding code, the line declaring card2 will compile, but the line declaring card1 will not. With a scoped enum, the compiler treats the enumeration as a new type and has no inbuilt conversion from your new type to an integer variable. For example:

    suits card = suits::diamonds; 
    char c = card + 10; // errors C2784 and C2676

The enum type is based on char but when you define the suits enumeration as being scoped (with class) the second line will not compile. If the enumeration is defined as not being scoped (without class) then there is an inbuilt conversion between the enumerated value and char.

By default, the compiler will give the first enumerator a value of 0 and then increment the value for the subsequent enumerators. Thus suits::diamonds will have a value of 1 because it is the second value in suits. You can assign values yourself:

    enum ports {ftp=21, ssh, telnet, smtp=25, http=80};

In this case, ports::ftp has a value of 21, ports::ssh has a value of 22 (21 incremented), ports::telnet is 23, ports::smtp is 25, and ports::http is 80.

Note

Often the point of enumerations is to provide named symbols within your code and their values are unimportant. Does it matter what value is assigned to suits::hearts? The intention is usually to ensure that it is different from the other values. In other cases, the values are important because they are a way to provide values to other functions.

Enumerations are useful in a switch statement (see later) because the named value makes it clearer than using just an integer. You can also use an enumeration as a parameter to a function and hence restrict the values passed via that parameter:

    void stack(suits card) 
    { 
        // we know that card is only one of four values 
    }

Declaring Pointers

Since we are covering the use of variables, it is worth explaining the syntax used to define pointers and arrays because there are some potential pitfalls. Chapter 2, Working with Memory, Arrays, and Pointers, covers this in more detail, so we will just introduce the syntax so that you are familiar with it.

In C++, you will access memory using a typed pointer. The type indicates the type of the data that is held in the memory that is pointed to. So, if the pointer is a pointer to a (4-byte) integer, it will point to four bytes that can be used as an integer. If the integer pointer is incremented, then it will point to the next four bytes, which can be used as an integer.

Note

Don't worry if you find pointers confusing at this point. Chapter 2, Working with Memory, Arrays, and Pointers, will explain this in more detail. The purpose of introducing pointers at this time is to make you aware of the syntax.

In C++, pointers are declared using the * symbol and you obtain the memory address of a variable with the & operator:

    int *p; 
    int i = 42; 
    p = &i;

The first line declares a variable, p, which will be used to hold the memory address of an integer. The second line declares an integer and assigns it a value. The third line assigns a value to the pointer p to be the address of the integer variable just declared. It is important to stress that the value of p is not 42; it will be a memory address where the value of 42 is stored.

Note how the declaration has the * next to the variable name. This is a common convention. The reason is that if you declare several variables in one statement, the * applies only to the immediate variable. So, for example:

    int* p1, p2;

Initially, this looks like you are declaring two integer pointers. However, this line does not do this; it declares just one pointer to integer called p1. The second variable is an integer called p2. The preceding line is equivalent to the following:

    int *p1;  
    int p2;

If you wish to declare two integer pointers in one statement, then you should do it as follows:

    int *p1, *p2;

Using Namespaces

Namespaces give you one mechanism to modularize code. A namespace allows you to label your types, functions, and variables with a unique name so that, using the scope resolution operator, you can give a fully qualified name. The advantage is that you know exactly which item will be called. The disadvantage is that using a fully qualified name you are in effect switching off C++'s argument-dependent lookup mechanism for overloaded functions where the compiler will choose the function that has the best fit according to the arguments passed to the function.

Defining a namespace is simple: you decorate the types, functions, and global variables with the namespace keyword and the name you give to it. In the following example, two functions are defined in the utilities namespace:

    namespace utilities 
    { 
        bool poll_data() 
        { 
            // code that returns a bool 
        } 
        int get_data() 
        { 
            // code that returns an integer 
        } 
    }

Note

Do not put a semicolon after the closing brace of a namespace.

Now when you use these symbols, you need to qualify the name with the namespace:

    if (utilities::poll_data()) 
    { 
        int i = utilities::get_data(); 
        // use i here... 
    }

The namespace declaration may just declare the functions, in which case the actual functions would have to be defined elsewhere, and you will need to use a qualified name:

    namespace utilities 
    { 
        // declare the functions 
        bool poll_data(); 
        int get_data(); 
    } 

    //define the functions 
    bool utilities::poll_data() 
    { 
        // code that returns a bool 
    } 

    int utilities::get_data() 
    { 
       // code that returns an integer 
    }

One use of namespaces is to version your code. The first version of your code may have a side-effect that is not in your functional specification and is technically a bug, but some callers will use it and depend on it. When you update your code to fix the bug, you may decide to allow your callers the option to use the old version so that their code does not break. You can do this with a namespace:

    namespace utilities 
    { 
        bool poll_data(); 
        int get_data(); 

        namespace V2 
        { 
            bool poll_data(); 
            int get_data(); 
            int new_feature(); 
        } 
    }

Now callers who want a specific version can call the fully qualified names, for example, callers could use utilities::V2::poll_data to use the newer version and utilities::poll_data to use the older version. When an item in a specific namespace calls an item in the same namespace, it does not have to use a qualified name. So, if the new_feature function calls get_data, it will be utilities::V2::get_data that is called. It is important to note that, before C++17, to declare a nested namespace you have to do the nesting manually (as shown here); you cannot simply declare a namespace called utilities::V2. C++17 adds the shorter namespace utilities::V2 { ... } form.

The preceding example has been written so that code written against the first version will continue to call it through the utilities namespace. C++11 provides a facility called an inline namespace that allows you to define a nested namespace, but allows the compiler to treat the items as being members of the parent namespace when it looks up names (including argument-dependent lookup):

    namespace utilities 
    { 
        inline namespace V1 
        { 
            bool poll_data(); 
            int get_data(); 
        } 

        namespace V2 
        { 
            bool poll_data(); 
            int get_data(); 
            int new_feature(); 
        } 
    }

Now to call the first version of get_data, you can use utilities::get_data or utilities::V1::get_data.

Fully qualified names can make the code difficult to read, especially if your code will only use one namespace. To help here you have several options. You can place a using statement to indicate that symbols declared in the specified namespace can be used without a fully qualified name:

    using namespace utilities; 
    int i = get_data(); 
    int j = V2::get_data();

You can still use fully qualified names, but this statement allows you to ease the requirement. Note that a nested namespace is a member of a namespace, so the preceding using statement means that you can call the second version of get_data with either utilities::V2::get_data or V2::get_data. If you use the unqualified name, then it means that you will call utilities::get_data.

A namespace can contain many items, and you may decide that you only want to relax the use of fully qualified names with just a few of them. To do this, use using and give the name of the item:

    using std::cout; 
    using std::endl; 
    cout << "Hello, World!" << endl;

This code says that, whenever cout is used, it refers to std::cout. You can use using within a function, or you can put it at file scope and make the intention global to the file.

You do not have to declare a namespace in one place, you can declare it over several files. The following could be in a different file to the previous declaration of utilities:

    namespace utilities 
    { 
        namespace V2 
        { 
            void print_data(); 
        } 
    }

The print_data function is still part of the utilities::V2 namespace.

You can also put an #include in a namespace, in which case the items declared in the header file will now be part of the namespace. The standard library header files that have a prefix of c (for example, cmath, cstdlib, and ctime) give access to the C runtime functions by including the appropriate C header in the std namespace.

The great advantage of a namespace is to be able to define your items with names that may be common, but are hidden from other code that does not know the name of the namespace. The namespace means that the items are still available to your code via the fully qualified name. However, this only works if you use a unique namespace name, and the likelihood is that, the longer the namespace name, the more unique it is likely to be. Java developers often name their classes using a URI, and you could decide to do the same thing:

    namespace com_packtpub_richard_grimes 
    { 
        int get_data(); 
    }

The problem is that the fully qualified name becomes quite long:

    int i = com_packtpub_richard_grimes::get_data();

You can get around this issue using an alias:

    namespace packtRG = com_packtpub_richard_grimes; 
    int i = packtRG::get_data();

C++ allows you to define a namespace without a name, an anonymous namespace. As mentioned previously, namespaces allow you to prevent name clashes between code defined in several files. If you intend to use such a name in only one file you could define a unique namespace name. However, this could get tedious if you had to do it for several files. A namespace without a name has the special meaning that it has internal linkage, that is, the items can only be used in the current translation unit, the current file, and not in any other file.
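A minimal sketch of an unnamed namespace is shown here. The names scale_factor and scaled are hypothetical and are used only to illustrate the idea that a helper can be kept private to one source file:

```cpp
// helper.cpp (hypothetical file name)

namespace
{
    // Internal linkage: visible only in this translation unit.
    int scale_factor()
    {
        return 3;
    }
}

// External linkage: other files can call this if they declare it.
int scaled(int value)
{
    // No qualification is needed to reach a member of the unnamed namespace.
    return value * scale_factor();
}
```

Another source file could define its own scale_factor in its own unnamed namespace without causing a clash at link time.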

Code that is not declared in a namespace will be a member of the global namespace. You can call the code without a namespace name, but you may want to explicitly indicate that the item is in the global namespace using the scope resolution operator without a namespace name:

    int version = 42; 

    void print_version() 
    { 
        std::cout << "Version = " << ::version << std::endl; 
    }

C++ Scoping of Variables

The compiler will compile your source files as individual items called translation units. The compiler will determine the objects and variables you declare and the types and functions you define, and once declared you can use any of these in the subsequent code within the scope of the declaration. At its very broadest, you can declare an item at the global scope by declaring it in a header file that will be used by all of the source files in your project. If you do not use a namespace it is often wise when you use such global variables to name them as being part of the global namespace:

    // in version.h 
    extern int version; 

    // in version.cpp 
    #include "version.h"  
    int version = 17; 

    // print.cpp 
    #include <iostream> 
    #include "version.h" 
    void print_version() 
    { 
        std::cout << "Version = " << ::version << std::endl; 
    }

This code has the C++ for two source files (version.cpp and print.cpp) and a header file (version.h) included by both source files. The header file declares the global variable version, which can be used by both source files; it declares the variable, but does not define it. The actual variable is defined and initialized in version.cpp; it is here that the compiler will allocate memory for the variable. The extern keyword used on the declaration in the header indicates to the compiler that version has external linkage, that is, the name is visible in files other than where the variable is defined. The version variable is used in the print.cpp source file. In this file, the scope resolution operator (::) is used without a namespace name and hence indicates that the variable version is in the global namespace.

You can also declare items that will only be used within the current translation unit, by declaring them within the source file before they are used (usually at the top of the file). This produces a level of modularity and allows you to hide implementation details from code in other source files. For example:

    // in print.h 
    void usage(); 

    // print.cpp 
    #include "version.h" 
    std::string app_name = "My Utility"; 
    void print_version() 
    { 
       std::cout << "Version = " << ::version << std::endl; 
    } 

    void usage() 
    { 
       std::cout << app_name << " "; 
       print_version(); 
    }

The print.h header contains the interface for the code in the file print.cpp. Only those functions declared in the header will be callable by other source files. The caller does not need to know about the implementation of the usage function, and as you can see here it is implemented using a call to a function called print_version that is only available to code in print.cpp. The variable app_name is declared at file scope, so it will only be accessible to code in print.cpp.

If another source file declares a variable at file scope that is also called app_name and is also a std::string, the file will compile, but the linker will complain when it tries to link the object files. The reason is that the linker will see the same variable defined in two places and it will not know which one to use.

A function also defines a scope; variables defined within the function can only be accessed by name within that function. The parameters of the function are also included as variables within the function, so when you declare other variables, you have to use different names. If a parameter is not marked as const then you can alter the value of the parameter in your function.

You can declare variables anywhere within a function as long as you declare them before you use them. Curly braces ({}) are used to define code blocks, and they also define local scope; if you declare a variable within a code block then you can only use it there. This means that you can declare variables with the same name outside the code block and the compiler will use the variable declared in the innermost scope that encloses the point where it is accessed.
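The effect of a code block on name lookup can be sketched as follows. The function name shadow_demo is hypothetical; it returns a number built from the value seen inside the block and the value seen after it:

```cpp
int shadow_demo()
{
    int value = 1;          // outer variable
    int seen_inside = 0;
    {
        int value = 2;      // a different variable that hides the outer one
        seen_inside = value;
    }
    // The inner variable no longer exists; value refers to the outer one again.
    return seen_inside * 10 + value;
}
```

Reusing a name like this is legal, but it is easy to misread, so it is usually clearer to give the inner variable a different name.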

Before finishing this section, it is important to mention one aspect of the C++ storage class. A variable declared in a function means that the compiler will allocate memory for the variable on the stack frame created for the function. When the function finishes, the stack frame is torn down and the memory recycled. This means that, after a function returns, the values in any local variables are lost; when the function is called again, the variable is created anew and initialized again.

C++ provides the static keyword to change this behavior. The static keyword means that the variable is allocated when the program starts just like variables declared at global scope. Applying static to a variable declared in a function means that the variable keeps its value between calls, while its name remains visible only within that function:

    int inc(int i) 
    { 
        static int value; 
        value += i; 
        return value; 
    } 

    int main() 
    { 
        std::cout << inc(10) << std::endl; 
        std::cout << inc(5) << std::endl; 
    }

By default, the compiler will initialize a static variable to 0, but you can provide an initialization value, and this will be used when the variable is first allocated. When this program starts, the value variable will be initialized to 0 before the main function is called. The first time the inc function is called, the value variable is incremented to 10, which is returned by the function and printed to the console. When the inc function returns the value variable is retained, so that when the inc function is called again, the value variable is incremented by 5 to a value of 15.

 

Using Operators


Operators are used to compute a value from one or more operands. The following table groups all of the operators with equal precedence and lists their associativity. The higher in the table, the higher precedence of execution the operator has in an expression. If you have several operators in an expression, the compiler will perform the higher-precedence operators before the lower-precedence operators. If an expression contains operators of equal precedence, then the compiler will use the associativity to decide whether an operand is grouped with the operator to its left or right.

Note

There are some ambiguities in this table. A pair of parentheses can mean a function call or a cast and in the table these are listed as function() and cast(); in your code you will simply use (). The + and - symbols are either used to indicate sign (unary plus and unary minus, given in the table as +x and -x), or addition and subtraction (given in the table as + and -). The & symbol means either "take the address of" (listed in the table as &x) or bitwise AND (listed in the table as &). Finally, the postfix increment and decrement operators (listed in the table as x++ and x--) have a higher precedence than the prefix equivalents (listed as ++x and --x).

    Precedence and Associativity        Operators
    1: No associativity                 ::
    2: Left to right associativity      . or -> [] function() {} x++ x-- typeid const_cast dynamic_cast reinterpret_cast static_cast
    3: Right to left associativity      sizeof ++x --x ~ ! -x +x &x * new delete cast()
    4: Left to right associativity      .* or ->*
    5: Left to right associativity      * / %
    6: Left to right associativity      + -
    7: Left to right associativity      << >>
    8: Left to right associativity      < > <= >=
    9: Left to right associativity      == !=
    10: Left to right associativity     &
    11: Left to right associativity     ^
    12: Left to right associativity     |
    13: Left to right associativity     &&
    14: Left to right associativity     ||
    15: Right to left associativity     ? :
    16: Right to left associativity     = *= /= %= += -= <<= >>= &= |= ^=
    17: Right to left associativity     throw
    18: Left to right associativity     ,

 

For example, take a look at the following code:

    int a = b + c * d;

This is interpreted as the multiplication being performed first, and then the addition. A clearer way to write the same code is:

    int a = b + (c * d);

The reason is that * has a higher precedence than + so that the multiplication is carried out first, and then the addition is performed. Now consider an expression in which the operators have the same precedence:

    int a = b + c + d;

In this case, the + operators have the same precedence, which is higher than the precedence of assignment. Since + has left to right associativity the statement is interpreted as follows:

    int a = ((b + c) + d);

That is, the first action is the addition of b and c, and the result is added to d and it is this result that is used to assign a. This may not seem important, but bear in mind that the addition could be between function calls (a function call has a higher precedence than +):

    int a = b() + c() + d();

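Associativity determines how the return values are combined: the results of b and c are added first, and then the result of d is added. It does not determine the order in which the three functions are called; C++ leaves the order in which the operands of + are evaluated unspecified, so the compiler is free to call b, c, and d in any order. This may be important because d may depend on global data altered by the other two functions.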
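If the order of the calls matters, one way to make it explicit is to call each function in its own statement. The following is a sketch only: the functions b, c, and d are hypothetical stand-ins that record when they run, and call_log and sum_in_order are names invented for this illustration:

```cpp
#include <vector>

// Hypothetical functions that record the order in which they run.
std::vector<char> call_log;

int b() { call_log.push_back('b'); return 1; }
int c() { call_log.push_back('c'); return 2; }
int d() { call_log.push_back('d'); return 3; }

int sum_in_order()
{
    // Each statement completes before the next one starts,
    // so the calls happen in the order b, c, d.
    int rb = b();
    int rc = c();
    int rd = d();
    return rb + rc + rd;
}
```

The sum is the same as for b() + c() + d(), but the sequence of calls no longer depends on how the compiler chooses to evaluate the operands.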

It makes your code more readable and easier to understand if you explicitly specify the precedence by grouping expressions with parentheses. Writing b + (c * d) makes it immediately clear which expression is executed first, whereas b + c * d means you have to know the precedence of each operator.

The built-in operators are overloaded, that is, the same syntax is used regardless of which built-in type is used for the operands. The operands must be the same type; if different types are used, the compiler will perform some default conversions, but in other cases (in particular, when operating on types of different sizes), you will have to perform a cast to indicate explicitly what you mean. 

Exploring the Built-in Operators

C++ comes with a wide range of built-in operators; most are arithmetic or logic operators, which will be covered in this section. The memory operators will be covered in Chapter 2, Working with Memory, Arrays, and Pointers, and the object-related operators in Chapter 4, Classes.

Arithmetic Operators

The arithmetic operators +, -, /, *, and % need little explanation other than perhaps the division and modulus operators. All of these operators act upon integer and real numeric types except for %, which can only be used with integer types. If you mix the types (say, add an integer to a floating-point number) then the compiler will perform an automatic conversion. The division operator / behaves as you expect for floating-point variables: it produces the result of the division of the two operands. When you perform the division between two integers, a / b, the result is the whole number of times that the divisor (b) goes into the dividend (a); any fractional part is discarded. The remainder of the division is obtained by the modulus operator %. So, for any integer b (other than zero), an integer a can be expressed as follows:

    (a / b) * b + (a % b)

Note that the modulus operator can only be used with integers. If you want to get the remainder of a floating-point division, use the standard function std::remainder.
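The relationship between / and % can be checked with a short sketch. The helper names below are hypothetical. Note that std::fmod is not mentioned in the text above; it is included here as an assumption that you may want a floating-point remainder that truncates the quotient in the same way as integer division, whereas std::remainder rounds the quotient to the nearest integer:

```cpp
#include <cmath>

// Rebuilds a from its quotient and remainder; b must not be zero.
int rebuild(int a, int b)
{
    return (a / b) * b + (a % b);
}

// Remainder with the same sign as x (the quotient is truncated).
double remainder_truncated(double x, double y)
{
    return std::fmod(x, y);
}

// Remainder where the quotient is rounded to the nearest integer.
double remainder_nearest(double x, double y)
{
    return std::remainder(x, y);
}
```

For example, dividing 7.5 by 2.0 gives a quotient of 3.75, so remainder_truncated returns 1.5 (7.5 minus 3 times 2.0) and remainder_nearest returns -0.5 (7.5 minus 4 times 2.0).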

Be careful when using division with integers, since fractional parts are discarded. If you need the fractional parts, then you may need to explicitly convert the numbers into real numbers. For example:

    int height = 480; 
    int width = 640; 
    float aspect_ratio = width / height;

This gives an aspect ratio of 1 when it should be 1.3333 (or 4 : 3). To ensure that floating-point division is performed, rather than integer division, you can cast either (or both) the dividend or divisor to a floating-point number.
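A minimal sketch of the cast is shown here, using static_cast on the dividend. The function names are hypothetical; the second function reproduces the truncating behavior for comparison:

```cpp
float aspect_ratio(int width, int height)
{
    // Converting one operand is enough: the other is then converted as well,
    // so the division is carried out in floating point.
    return static_cast<float>(width) / height;
}

float truncated_ratio(int width, int height)
{
    // Integer division happens first; the fraction is already lost
    // by the time the result is converted to float.
    return width / height;
}
```

With a width of 640 and a height of 480, aspect_ratio gives approximately 1.3333 and truncated_ratio gives 1.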

Increment and Decrement Operators

There are two versions of these operators, prefix and postfix. As the name suggests, prefix means that the operator is placed on the left of the operand (for example, ++i), and a postfix operator is placed to the right (i++). The ++ operator will increment the operand and the -- operator will decrement it. The prefix operator means "return the value after the operation," and the postfix operator means "return the value before the operation." So the following code will increment one variable and use it to assign another:

    a = ++b;

Here, the prefix operator is used so the variable b is incremented and the variable a is assigned to the value after b has been incremented. Another way of expressing this is:

    a = (b = b + 1);

The following code assigns a value using the postfix operator:

    a = b++;

This means that the variable b is incremented, but the variable a is assigned to the value before b has been incremented. Another way of expressing this is:

    int t; 
    a = (t = b, b = b + 1, t);

Note

Note that this statement uses the comma operator, so a is assigned the value of the temporary variable t in the right-most expression.

The increment and decrement operators can be applied to both integer and floating-point numbers. The operators can also be applied to pointers, where they have a special meaning. When you increment a pointer variable, the pointer is incremented by the size of the type that it points to.
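The following sketch illustrates what incrementing a pointer does. The function names are hypothetical; the first measures how far an int pointer moves in bytes, and the second shows that the incremented pointer refers to the next element of an array:

```cpp
#include <cstddef>

// Returns how many bytes a pointer to int moves when it is incremented once.
std::size_t int_pointer_step()
{
    int values[2] = { 10, 20 };
    int* p = values;      // points to values[0]
    int* start = p;
    ++p;                  // now points to values[1]

    // Compare the addresses as byte pointers to see the distance in bytes.
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(p) - reinterpret_cast<char*>(start));
}

int second_element()
{
    int values[2] = { 10, 20 };
    int* p = values;
    ++p;
    return *p;            // the second element
}
```

The step is sizeof(int) bytes, whatever that size happens to be on your platform.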

Bitwise Operators

Integers can be regarded as a series of bits, 0 or 1. Bitwise operators act upon these bits compared to the bit in the same position in the other operand. Signed integers use a bit to indicate the sign, but bitwise operators act on every bit in an integer, so it is usually only sensible to use them on unsigned integers. In the following, all the types are marked as unsigned, so they are treated as not having a sign bit.

The & operator is bitwise AND, which means that each bit in the left-hand operand is compared with the bit in the right-hand operand in the same position. If both are 1, the resultant bit in the same position will be 1; otherwise, the resultant bit is zero:

    unsigned int a = 0x0a0a; // this is the binary 0000101000001010 
    unsigned int b = 0x00ff; // this is the binary 0000000011111111 
    unsigned int c = a & b;  // this is the binary 0000000000001010 
    std::cout << std::hex << std::showbase << c << std::endl;

In this example, using bitwise & with 0x00ff has the same effect as providing a mask that masks out all but the lowest byte.

The bitwise OR operator | will return a value of 1 if either or both bits in the same position are 1, and a value of 0 only if both are 0:

    unsigned int a = 0x0a0a; // this is the binary 0000101000001010 
    unsigned int b = 0x00ff; // this is the binary 0000000011111111 
    unsigned int c = a | b;  // this is the binary 0000101011111111 
    std::cout << std::hex << std::showbase << c << std::endl;

One use of the & operator is to find if a particular bit (or a specific collection of bits) is set:

    unsigned int flags = 0x0a0a; // 0000101000001010 
    unsigned int test = 0x00ff;  // 0000000011111111 

    // 0000000000001010 is (flags & test) 
    if ((flags & test) == flags)  
    { 
        // code for when all the flags bits are set in test 
    } 
    if ((flags & test) != 0) 
    { 
        // code for when some or all the flag bits are set in test  
    }

The flags variable has the bits we require, and the test variable is a value that we are examining. The value (flags & test) will have only those bits in the test variable that are also set in flags. Thus, if the result is non-zero, it means that at least one bit in test is also set in flags; if the result is exactly the same as the flags variable then all the bits in flags are set in test.

The exclusive OR operator ^ is used to test when the bits are different; the resultant bit is 1 if the bits in the operands are different, and 0 if they are the same. Exclusive OR can be used to flip specific bits:

    int value = 0xf1; 
    int flags = 0x02; 
    int result = value ^ flags; // 0xf3 
    std::cout << std::hex << result << std::endl;

The final bitwise operator is the bitwise complement ~. This operator is applied to a single integer operand and returns a value where every bit is the complement of the corresponding bit in the operand; so if the operand bit is 1, the bit in the result is 0, and if the bit in the operand is 0, the bit in the result is 1. Note that all bits are examined, so you need to be aware of the size of the integer.
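The effect of the integer size can be sketched with fixed-width types from the cstdint header. The function names are hypothetical. The casts are there because a small integer operand is promoted to int before ~ is applied, so the result has more bits than the original type:

```cpp
#include <cstdint>

std::uint8_t complement8(std::uint8_t v)
{
    // The operand is promoted to int before ~ is applied, so the result is
    // cast back to keep only the low 8 bits.
    return static_cast<std::uint8_t>(~v);
}

std::uint16_t complement16(std::uint16_t v)
{
    // Same operation, but 16 bits are kept.
    return static_cast<std::uint16_t>(~v);
}
```

The same input, 0x0f, gives 0xf0 as an 8-bit value and 0xfff0 as a 16-bit value.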

Boolean Operators

The == operator tests whether two values are exactly the same. If you test two integers then the test is obvious; for example, if x is 2 and y is 3, then x == y is obviously false. However, two real numbers may not be the same even when you think so:

    double x = 1.000001 * 1000000000000; 
    double y = 1000001000000; 
    if (x == y) std::cout << "numbers are the same";

The double type is a floating-point type held in 8 bytes, but this is not enough for the precision being used here; the value stored in the x variable is 1000000999999.9999 (to four decimal places).
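One common way to deal with this is to test whether two values are close enough rather than exactly equal. The following is a sketch under that assumption; the helper name nearly_equal and the default tolerance of 1e-9 are hypothetical choices, and a suitable tolerance depends on your data:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true if x and y differ by no more than a small
// fraction (rel_tol) of the larger magnitude.
bool nearly_equal(double x, double y, double rel_tol = 1e-9)
{
    double scale = std::max(std::fabs(x), std::fabs(y));
    return std::fabs(x - y) <= rel_tol * scale;
}
```

With this helper, nearly_equal(x, y) is true for the two values in the preceding example, even if the == test is not.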

The != operator tests if two values are not equal. The operators > and < test two values to see if the left-hand operand is greater than, or less than, the right-hand operand, the >= operator tests if the left-hand operand is greater than or equal to the right-hand operand, and the <= operator tests if the left-hand operand is less than or equal to the right-hand operand. These operators can be used in the if statement similar to how == is used in the preceding example. The expressions using the operators return a value of type bool and so you can use them to assign values to Boolean variables:

    int x = 10; 
    int y = 11; 
    bool b = (x > y); 
    if (b) std::cout << "x is greater than y"; 
    else   std::cout << "x is not greater than y";

The assignment operator (=) has a lower precedence than the greater than (>) operator, so the comparison is evaluated first even without parentheses; we have used the parentheses to make it explicit that the value is tested before being used to assign the variable. You can use the ! operator to negate a logical value. So, using the value of b obtained previously, you can write the following:

    if (!b) std::cout << "x is not greater than y"; 
    else    std::cout << "x is greater than y";

You can combine two logical expressions using the && (AND) and || (OR) operators. An expression with the && operator is true only if both operands are true, whereas an expression with the || operator is true if either, or both, operands are true:

    int x = 10, y = 10, z = 9; 
    if ((x == y) || (y < z)) 
        std::cout << "one or both are true";

This code involves three tests; the first tests if the x and y variables have the same value, the second tests if the variable y is less than z, and then there is a test to see if either or both of the first two tests are true.

In a || expression such as this, where the first operand (x==y) is true, the total logical expression will be true regardless of the value of the right operand (here, y < z). So there is no point in testing the second expression. Correspondingly, in an && expression, if the first operand is false then the entire expression must be false, and so the right-hand part of the expression need not be tested.

 

The compiler will provide code to perform this short-circuiting for you:

    if ((x != 0) && (0.5 > 1.0 / x))  
    { 
        // reciprocal is less than 0.5 
    }

This code tests to see if the reciprocal of x is less than 0.5 (or, conversely, that x is greater than 2). If the x variable has value 0 then the test 1/x is an error but, in this case, the expression will never be executed because the left operand to && is false.
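Short-circuiting can be observed by counting how often the right-hand operand is evaluated. This is a sketch only; evaluations, reciprocal_below_half, and check are hypothetical names introduced for the illustration:

```cpp
int evaluations = 0;   // counts how often the right-hand test runs

bool reciprocal_below_half(double x)
{
    ++evaluations;
    return 0.5 > 1.0 / x;
}

bool check(double x)
{
    // When x is 0 the left operand is false, so reciprocal_below_half
    // is never called and the division never happens.
    return (x != 0) && reciprocal_below_half(x);
}
```

Calling check(0.0) returns false and leaves evaluations unchanged; calling check(4.0) evaluates the right-hand operand once and returns true.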

Bitwise Shift Operators

Bitwise shift operators shift the bits in the left-hand operand integer the specified number of bits given in the right-hand operand, in the specified direction. A shift by one bit left multiplies the number by two, a shift one bit to the right divides by 2. In the following a 2-byte integer is bit-shifted:

    unsigned short s1 = 0x0010; 
    unsigned short s2 = s1 << 8; 
    std::cout << std::hex << std::showbase; 
    std::cout << s2 << std::endl; 
    // 0x1000  
    s2 = s2 << 3; 
    std::cout << s2 << std::endl; 
    // 0x8000

In this example, the s1 variable has the fifth bit set (0x0010 or 16). The s2 variable has this value, shifted left by 8 bits, so the single bit is shifted to the 13th bit, and the bottom 8 bits are all set to 0 (0x1000 or 4,096). This means that 0x0010 has been multiplied by 2 to the power of 8, or 256, to give 0x1000. Next, the value is shifted left by another 3 bits, and the result is 0x8000; the top bit is set.

The operator discards any bits that overflow, so if you have the top bit set and shift the integer one bit left, that top bit will be discarded:

    s2 = s2 << 1; 
    std::cout << s2 << std::endl; 
    // 0

A final shift left by one bit results in a value 0.

It is important to remember that, when used with a stream, the operator << means insert into the stream, and when used with integers, it means bitwise shift.

Assignment Operators

The assignment operator = assigns an lvalue (a variable) on the left with the result of the rvalue (a variable or expression) on the right:

    int x = 10; 
    x = x + 10;

The first line declares an integer and initializes it to 10. The second line alters the variable by adding another 10 to it, so now the variable x has a value of 20. This is the assignment. C++ allows you to change the value of a variable based on the variable's value using an abbreviated syntax. The previous lines can be written as follows:

    int x = 10; 
    x += 10;

An increment operator such as this (and the decrement operator) can be applied to integers and floating-point types. If the operator is applied to a pointer, then the operand indicates how many whole items the pointer address is changed by. For example, if an int is 4 bytes and you add 10 to an int pointer, the actual pointer value is incremented by 40 (10 times 4 bytes).

In addition to the increment (+=) and decrement (-=) assignments, you can have assignments for multiply (*=), divide (/=), and remainder (%=). All of these except for the last one (%=) can be used for both floating-point types and integers. The remainder assignment can only be used on integers.

You can also perform bitwise assignment operations on integers: left shift (<<=), right shift (>>=), bitwise AND (&=), bitwise OR (|=), and bitwise exclusive OR (^=). It usually only makes sense to apply these to unsigned integers. So, multiplying by eight can be carried out by both of these two lines:

    i *= 8; 
    i <<= 3;
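A typical use of the bitwise assignment operators is to set, clear, and flip individual bits in a set of flags. The following is a sketch; the flag names and values are hypothetical and chosen only so that each flag occupies one bit:

```cpp
// Hypothetical flag values, one bit each.
const unsigned int flag_read  = 0x01;
const unsigned int flag_write = 0x02;
const unsigned int flag_exec  = 0x04;

unsigned int update_flags(unsigned int flags)
{
    flags |= flag_write;   // set the write bit
    flags &= ~flag_read;   // clear the read bit
    flags ^= flag_exec;    // flip the exec bit
    return flags;
}
```

Starting from 0x01 (only the read bit set), the function returns 0x06: the write bit has been set, the read bit cleared, and the exec bit flipped on.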
 

Controlling Execution Flow


C++ provides many ways to test values and loop through code.

Using Conditional Statements

The most frequently used conditional statement is if. In its simplest form, the if statement takes a logical expression in a pair of parentheses and is immediately followed by the statement that is executed if the condition is true:

    int i; 
    std::cin >> i; 
    if (i > 10) std::cout << "much too high!" << std::endl;

You can also use the else statement to catch occasions when the condition is false:

    int i; 
    std::cin >> i; 
    if (i > 10) std::cout << "much too high!" << std::endl; 
    else        std::cout << "within range" << std::endl;

If you want to execute several statements, you can use braces ({}) to define a code block.

The condition is a logical expression and C++ will convert from numeric types to a bool, where 0 is false and anything not 0 is true. If you are not careful, this can be a source of an error that is not only difficult to notice, but also can have an unexpected side-effect. Consider the following code, which asks for input from the console and then tests to see if the user enters -1:

    int i; 
    std::cin >> i; 
    if (i == -1) std::cout << "typed -1" << std::endl; 
    std::cout << "i = " << i << std::endl;

This is contrived, but you may be asking for values in a loop and then performing actions on those values, except when the user enters -1, at which point the loop finishes. If you mistype, you may end up with the following code:

    int i; 
    std::cin >> i; 
    if (i = -1) std::cout << "typed -1" << std::endl; 
    std::cout << "i = " << i << std::endl;

In this case, the assignment operator (=) is used instead of the equality operator (==). There is just one character difference, but this code is still correct C++ and the compiler is happy to compile it.

The result is that, regardless of what you type at the console, the variable i is assigned to -1, and since -1 is not zero, the condition in the if statement is true, hence the true clause of the statement is executed. Since the variable has been assigned to -1, this may alter logic further on in your code. The way to avoid this bug is to take advantage of the requirement that in an assignment the left-hand side must be an lvalue. Perform your test as follows:

    if (-1 == i) std::cout << "typed -1" << std::endl;

Here, the logical expression is (-1 == i), and since the == operator is commutative (the order of the operands does not matter; you get the same result), this is exactly the same as you intended in the preceding test. However, if you mistype the operator, you get the following:

    if (-1 = i) std::cout << "typed -1" << std::endl;

In this case, the assignment has an rvalue on the left-hand side, and this will cause the compiler to issue an error (in Visual C++ this is C2106 '=' : left operand must be l-value).

You are allowed to declare a variable in an if statement, and the scope of the variable is in the statement blocks. For example, a function that returns an integer can be called as follows:

    if (int i = getValue()) {    
        // i != 0    // can use i here  
    } else {    
        // i == 0    // can use i here  
    }

While this is perfectly legal C++, there are few reasons why you would want to do this.

In some cases, the conditional operator ?: can be used instead of an if statement. The operator executes the expression to the left of the ? operator and, if the conditional expression is true, it executes the expression to the right of the ?. If the conditional expression is false, it executes the expression to the right of the :. The expression that the operator executes provides the return value of the conditional operator.

For example, the following code determines the maximum of two variables, a and b:

    int max; 
    if (a > b) max = a; 
    else       max = b;

This can be expressed with the following single statement:

    int max = (a > b) ? a : b;

The main choice is whichever is most readable in the code. Clearly, if the assignment expressions are large it may well be best to split them over lines in an if statement. However, it is useful to use the conditional operator within other statements. For example:

    int number;  
    std::cin  >> number; 
    std::cout << "there " 
              << ((number == 1) ? "is " : "are ")  
              << number << " item"            
              << ((number == 1) ? "" : "s") 
              << std::endl;

This code determines if the variable number is 1 and if so it prints on the console there is 1 item. This is because in both conditionals, if the value of the number variable is 1, the test is true and the first expression is used. Note that there is a pair of parentheses around the entire operator. The reason is that the stream << operator is overloaded, and you want the compiler to choose the version that takes a string, which is the type returned by the operator rather than bool, which is the type of the expression (number == 1).

If the value returned by the conditional operator is an lvalue then you can use it on the left-hand side of an assignment. This means that you can write the following, rather odd, code:

    int i = 10, j = 0; 
    ((i < j) ? i : j) = 7; 
    // i is 10, j is 7 

    i = 0, j = 10; 
    ((i < j) ? i : j) = 7; 
    // i is 7, j is 10

The conditional operator checks to see if i is less than j and if so it assigns a value to i; otherwise, it assigns j with that value. This code is terse, but it lacks readability. It is far better in this case to use an if statement.

Selecting

If you want to test to see if a variable is one of several values, using multiple if statements becomes cumbersome. The C++ switch statement fulfills this purpose much better. The basic syntax is shown here:

    int i; 
    std::cin >> i; 
    switch(i) 
    { 
        case 1:  
            std::cout << "one" << std::endl; 
            break; 
        case 2:  
            std::cout << "two" << std::endl; 
            break; 
        default: 
            std::cout << "other" << std::endl; 
    }

Each case is essentially a label as to the specific code to be run if the selected variable is the specified value. The default clause is for values where there exists no case. You do not have to have a default clause, which means that you are testing only for specified cases. The default clause could be for the most common case (in which case, the cases filter out the less likely values) or it could be for exceptional values (in which case, the cases handle the most likely values).

A switch statement can only test integer types (which includes enum), and you can only test for constants. The char type is an integer, and this means that you can use characters in the case items, but only individual characters; you cannot use strings:

    char c; 
    std::cin >> c; 
    switch(c) 
    { 
        case 'a':  
            std::cout << "character a" << std::endl; 
            break; 
        case 'z':   
            std::cout << "character z" << std::endl; 
            break; 
        default: 
            std::cout << "other character" << std::endl; 
    }

The break statement indicates the end of the statements executed for a case. If you do not specify it, execution will fall through and the following case statements will be executed even though they have been specified for a different case:

    switch(i) 
    { 
        case 1:  
            std::cout << "one" << std::endl; 
            // fall thru 
        case 2:  
            std::cout << "less than three" << std::endl; 
            break; 
        case 3:  
            std::cout << "three" << std::endl; 
            break; 
        case 4: 
            break; 
        default: 
            std::cout << "other" << std::endl; 
    }

This code shows the importance of the break statement. A value of 1 will print both one and less than three to the console, because execution falls through to the following case, even though that case is for another value.

It is usual to have different code for different cases, so you will most often finish a case with break. It is easy to miss out a break by mistake, and this will lead to unusual behavior. It is good practice to document your code when deliberately missing out the break statement so that you know that if a break is missing, it is likely to be a mistake.

You can provide zero or more statements for each case. If there is more than one statement, they are all executed for that specific case. If you provide no statements other than break (as for case 4 in this example) then it means that nothing will be done for that value, not even the statements in the default clause.

The break statement means break out of this code block, and it behaves like this in the loop statements while and for as well. There are other ways that you can break out of a switch. A case could call return to finish the function where the switch is declared; it can call goto to jump to a label, or it can call throw to throw an exception that will be caught by an exception handler outside the switch, or even outside the function.

So far, the cases are in numeric order. This is not a requirement, but it does make the code more readable, and clearly, if you want to fall through the case statements (as in case 1 here), you should pay attention to the order of the case items.
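
Since C++17, a deliberate fall through can also be marked with the standard [[fallthrough]] attribute, so that compilers that warn about a missing break stay quiet for the intended case. The following is a minimal sketch, assuming a C++17 compiler; the function name describe is hypothetical, and it returns the text rather than printing it so that the behavior is easy to check:

```cpp
#include <string>

// Hypothetical helper: builds the text that a switch like the one in the
// text would print, so the fall through can be observed in the result.
std::string describe(int i)
{
    std::string text;
    switch (i)
    {
        case 1:
            text += "one, ";
            [[fallthrough]]; // C++17: the missing break is deliberate
        case 2:
            text += "less than three";
            break;
        case 3:
            text += "three";
            break;
        default:
            text += "other";
    }
    return text;
}
```

Calling describe(1) runs the statements for case 1 and then those for case 2, whereas describe(2) runs only the statements for case 2.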

If you need to declare a temporary variable in a case handler then you must define a code block using braces, and this will make the scope of the variable localized to just that code block. You can, of course, use any variable declared outside of the switch statement in any of the case handlers.
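
As a minimal sketch of this rule, the following hypothetical function declares a temporary variable in two case handlers. The function name, the parameters, and the meaning of the shape values are assumptions made only for illustration:

```cpp
#include <string>

// Hypothetical helper showing variables that are local to one case handler.
std::string area_label(int shape, int size)
{
    switch (shape)
    {
        case 0: // treated here as a square
        {
            // The braces create a block, so area exists only inside this case.
            int area = size * size;
            return "square area " + std::to_string(area);
        }
        case 1: // treated here as a right-angled isosceles triangle
        {
            // A second, unrelated variable with the same name is allowed,
            // because each block has its own scope.
            int area = size * size / 2;
            return "triangle area " + std::to_string(area);
        }
        default:
            return "unknown shape";
    }
}
```

Without the braces, the compiler would reject the code, because a jump to case 1 or default would bypass the initialization of the variable declared under case 0.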

Since enumerated constants are integers, you can test an enum in a switch statement:

    enum suits { clubs, diamonds, hearts, spades }; 

    void print_name(suits card) 
    { 
        switch(card) 
        { 
            case suits::clubs: 
                std::cout << "card is a club"; 
                break; 
            default: 
                std::cout << "card is not a club"; 
        } 
    }

Because the enum here is not scoped (it is neither enum class nor enum struct), it is not required to specify the scope of the value in the case, but doing so makes it more obvious what the constant refers to.
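
For comparison, with a scoped enumeration the qualified name is mandatory in every case. The following is a minimal sketch; the names suit and suit_name are hypothetical, and the function returns the name rather than printing it:

```cpp
#include <string>

// A scoped enumeration: the enumerators are only visible as suit::clubs, etc.
enum class suit { clubs, diamonds, hearts, spades };

// Hypothetical helper that returns the name of a suit.
std::string suit_name(suit card)
{
    switch (card)
    {
        case suit::clubs:    return "club";
        case suit::diamonds: return "diamond";
        case suit::hearts:   return "heart";
        case suit::spades:   return "spade";
    }
    return "unknown"; // reached only if card holds a value with no enumerator
}
```

Because every enumerator has its own case, there is no default clause here; many compilers can then warn if a new enumerator is added without a matching case.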

Looping

Most programs will need to loop through some code. C++ provides several ways to do this, either by iterating with an indexed value or testing a logical condition.

Looping with Iteration

There are two versions of the for statement, iteration and range-based. The latter was introduced in C++11. The iteration version has the following format:

    for (init_expression; condition; loop_expression) 
        loop_statement;

You can provide one or more loop statements, and for more than one statement, you should provide a code block using braces. The purpose of the loop may be served by the loop expression, in which case you may not want a loop statement to be executed; here, you use the null statement, ; which means do nothing.

Within the parentheses are three expressions separated by semicolons. The first expression allows you to declare and initialize a loop variable. This variable is scoped to the for statement, so you can only use it in the for expressions or in the loop statements that follow. If you want more than one loop variable, you can declare them in this expression by separating them with commas, provided that they share the same type.
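
As a minimal sketch of a for statement with two loop variables, the following hypothetical function reverses a string by moving one index up from the start and another down from the end. The name reversed is an assumption for illustration only:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: reverses a string using two loop variables.
std::string reversed(std::string text)
{
    if (text.empty()) return text; // avoids computing size() - 1 on an empty string

    // Both variables are declared in the first expression and share one type.
    // The final expression updates both, separated by the comma operator.
    for (std::size_t left = 0, right = text.size() - 1; left < right; ++left, --right)
    {
        std::swap(text[left], text[right]);
    }
    return text;
}
```

Both left and right are scoped to the for statement, so neither can be used after the closing brace of the loop.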

The for statement will loop while the condition expression is true; so if you are using a loop variable, you can use this expression to check the value of the loop variable. The third expression is called at the end of the loop, after the loop statement has been called; following this, the condition expression is called to see if the loop should continue. This final expression is often used to update the value of the loop variable. For example:

    for (int i = 0; i < 10; ++i)   
    { 
        std::cout << i; 
    }

In this code, the loop variable is i and it is initialized to zero. Next, the condition is checked, and since i is less than 10, the statement will be executed (printing the value to the console). The next action is that the loop expression, ++i, is called, which increments the loop variable, i; then the condition is checked again, and so on. Since the condition is i < 10, this loop will run ten times with values of i from 0 to 9 (so you will see 0123456789 on the console).

The loop expression can be any expression you like, but often it increments or decrements a value. You do not have to change the loop variable value by 1; for example, you can use i -= 5 as the loop expression to decrease the variable by 5 on each loop. The loop variable can be any type you like; it does not have to be integer, it does not even have to be numeric (for example, it could be a pointer, or an iterator object described in Chapter 5, Using the Standard Library Containers), and the condition and loop expression do not have to use the loop variable. In fact, you do not have to declare a loop variable at all!

If you do not provide a loop condition then the loop will be infinite, unless you provide a check in the loop:

for (int i = 0; ; ++i)  
{ 
   std::cout << i << std::endl; 
   if (i == 10) break; 
}

This uses the break statement introduced earlier with the switch statement. It indicates that execution exits the for loop, and you can also use return, goto, or throw. You will rarely see a loop that finishes using goto; however, you may see the following:

for (;;)  
{ 
   // code 
}

In this case, there is no loop variable, no loop expression, and no conditional. This is an everlasting loop, and the code within the loop determines when the loop finishes.

The third expression in the for statement, the loop expression, can be anything you like; the only requirement is that it is executed at the end of each loop. You may choose to change another variable in this expression, or you can even provide several expressions separated by the comma operator. For example, if you have two functions, one called poll_data that returns true if there is more data available and false when there is no more data, and a function called get_data that returns the next available data item, you could use for as follows (bear in mind that this is a contrived example, to make a point):

for (int i = -1; poll_data(); i = get_data()) 
{ 
   if (i != -1) std::cout << i << std::endl; 
}

When poll_data returns a false value, the loop will end. The if statement is needed because the first time the loop is called, get_data has not yet been called. A better version is as follows:

for (; poll_data() ;) 
{ 
   int i = get_data();  
   std::cout << i << std::endl; 
}

Keep this example in mind for the following section.

There is one other keyword that you can use in a for loop. In many cases, your for loop will have many lines of code and at some point, you may decide that the current loop has completed and you want to start the next loop (or, more specifically, execute the loop expression and then test the condition). To do this, you can call continue:

for (float divisor = 0.f; divisor < 10.f; ++divisor)  
{ 
   std::cout << divisor; 
   if (divisor == 0)  
   {  
      std::cout << std::endl; 
      continue; 
   } 
   std::cout << " " << (1 / divisor) << std::endl; 
}

In this code, we print the reciprocal of the numbers 0 to 9 (0.f is a 4-byte floating-point literal). The first line in the for loop prints the loop variable, and the next line checks to see if the variable is zero. If it is, it prints a new line and continues, that is, the last line in the for loop is not executed. The reason is that the last line prints the reciprocal and it would be an error to divide any number by zero.

C++11 introduces another way to use the for loop, which is intended to be used with containers. The C++ standard library contains templates for container classes. These classes contain collections of objects, and provide access to those items in a standard way. The standard way is to iterate through collections using an iterator object. More details about how to do this will be given in Chapter 5, Using the Standard Library Containers; the syntax requires an understanding of pointers and iterators, so we will not cover them here. The range-based for loop gives a simple mechanism to access items in a container without explicitly using iterators.

The syntax is simple:

for (for_declaration : expression) loop_statement;

The first thing to point out is that there are only two expressions and they are separated by a colon (:). The first expression is used to declare the loop variable, which is of the type of the items in the collection being iterated through. The second expression gives access to the collection.

Note

In C++ terms, the collections that can be used are those that define a begin and end function that gives access to iterators, and also to stack-based arrays (that the compiler knows the size of).

The Standard Library defines a container object called a vector. The vector template is a class that contains items of the type specified in the angle brackets (<>); in the following code, the vector is initialized in a special way that is new to C++11, called list initialization. This syntax allows you to specify the initial values of the vector in a list between curly braces. The following code creates and initializes a vector, and then uses an iteration for loop to print out all the values:

using namespace std; 
vector<string> beatles = { "John", "Paul", "George", "Ringo" }; 

for (int i = 0; i < beatles.size(); ++i)  
{ 
   cout << beatles.at(i) << endl; 
}

Note

Here a using statement is used so that the classes vector and string do not have to be used with fully qualified names.

The vector class has a member function called size (called through the . operator, which means "call this function on this object") that returns the number of items in the vector. Each item is accessed using the at function passing the item's index. The one big problem with this code is that it uses random access, that is, it accesses each item using its index. This is a property of vector, but other Standard Library container types do not have random access. The following uses the range-based for:

vector<string> beatles = { "John", "Paul", "George", "Ringo" }; 

for (string musician : beatles)  
{ 
   cout << musician << endl; 
}

This syntax works with any of the standard container types and for arrays allocated on the stack:

int birth_years[] = { 1940, 1942, 1943, 1940 }; 

for (int birth_year : birth_years)  
{ 
   cout << birth_year << endl; 
}

In this case, the compiler knows the size of the array (because the compiler has allocated the array) and so it can determine the range. The range-based for loop will iterate through all the items in the container, but as with the previous version you can leave the for loop using break, return, throw, or goto, and you can indicate that the next loop should be executed using the continue statement.
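
The loop variable in a range-based for can also be declared as a reference, which avoids copying each item and, when the reference is not const, lets the loop change the items in the container. The following is a minimal sketch; the function names total_length and add_to_each are hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reads each item through a const reference, so no string is copied.
std::size_t total_length(const std::vector<std::string>& names)
{
    std::size_t total = 0;
    for (const std::string& name : names)
    {
        total += name.size();
    }
    return total;
}

// Uses a non-const reference so that the loop can change the stored items.
void add_to_each(std::vector<int>& values, int amount)
{
    for (int& value : values)
    {
        value += amount;
    }
}
```

Declaring the loop variable by value, as in for (string musician : beatles), makes a copy of each item, which is harmless for a short list but can be costly for large objects.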

Conditional Loops

In the previous section we gave a contrived example, where the condition in the for loop polled for data:

for (; poll_data() ;) 
{ 
   int i = get_data();  
   std::cout << i << std::endl; 
}

In this example, there is no loop variable used in the condition. This is a candidate for the while conditional loop:

while (poll_data()) 
{ 
   int i = get_data();  
   std::cout << i << std::endl; 
}

The statement will continue to loop until the expression (poll_data in this case) has a value of false. As with for, you can exit the while loop with break, return, throw, or goto, and you can indicate that the next loop should be executed using the continue statement.
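
The text does not define poll_data or get_data, so the following sketch supplies hypothetical stand-ins that walk through a fixed list of values; this makes it possible to see the while loop run to completion. The names source, next_item, and read_all, and the values in the list, are assumptions for illustration only:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical data source: a fixed list and the position of the next item.
static const std::vector<int> source = { 3, 1, 4 };
static std::size_t next_item = 0;

// Stand-in for poll_data: true while there are items left to read.
bool poll_data() { return next_item < source.size(); }

// Stand-in for get_data: returns the next item and moves on.
int get_data() { return source[next_item++]; }

// Collects every available item using the while loop from the text.
std::vector<int> read_all()
{
    std::vector<int> items;
    while (poll_data())
    {
        int i = get_data();
        items.push_back(i);
    }
    return items;
}
```

Once read_all returns, poll_data reports false, which is the condition that ended the loop.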

The first time the while statement is called, the condition is tested before the loop is executed; in some cases you may want the loop executed at least once, and then test the condition (most likely dependent upon the action in the loop) to see if the loop should be repeated. The way to do this is to use the do-while loop:

int i = 5; 
do 
{ 
   std::cout << i-- << std::endl; 
} while (i > 0);

Note the semicolon after the while clause. This is required.

This loop will print 5 to 1 in reverse order. The reason is that the loop starts with i initialized to 5. The statement in the loop decrements the variable through a postfix operator, which means the value before the decrement is passed to the stream. At the end of the loop, the while clause tests to see if the variable is greater than zero. If this test is true, the loop is repeated. When the loop is called with i assigned to 1, the value of 1 is printed to the console and the variable decremented to zero, and the while clause will test an expression that is false and the looping will finish.

The difference between the two types of loop is that the condition is tested before the loop is executed in the while loop, and so the loop may not be executed. In a do-while loop, the condition is called after the loop, which means that, with a do-while loop, the loop statements are always called at least once.
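
This difference can be made visible by counting how many times each loop body runs. The following is a minimal sketch; the function names runs_with_while and runs_with_do_while are hypothetical:

```cpp
// Counts how many times a while loop body runs when counting down from start.
int runs_with_while(int start)
{
    int runs = 0;
    int i = start;
    while (i > 0) // tested before the body, so the body may never run
    {
        ++runs;
        --i;
    }
    return runs;
}

// The same countdown with do-while: the body runs before the first test.
int runs_with_do_while(int start)
{
    int runs = 0;
    int i = start;
    do
    {
        ++runs;
        --i;
    } while (i > 0); // tested after the body, so the body runs at least once
    return runs;
}
```

For a starting value of 5 both functions report five runs, but for a starting value of 0 the while version reports none and the do-while version reports one.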

Jumping

C++ supports jumps, and in most cases, there are better ways to branch code; however, for completeness, we will cover the mechanism here. There are two parts to a jump: a labeled statement to jump to and the goto statement. A label has the same naming rules as a variable; it is declared suffixed with a colon, and it must be before a statement. The goto statement is called using the label's name:

    int main() 
    { 
        for (int i = 0; i < 10; ++i) 
        { 
            std::cout << i << std::endl; 
            if (i == 5) goto end; 
        } 

    end:
        std::cout << "end"; 
    }

The label must be in the same function as the calling goto.

Jumps are rarely used, because they encourage you to write non-structured code. However, if you have a routine with highly nested loops or if statements, it may make more sense and be more readable to use a goto to jump to clean-up code.
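
As a sketch of the nested-loop case, the following hypothetical function uses goto to leave two loops at once as soon as a value is found, something that a single break cannot do. The name contains and its parameters are assumptions for illustration:

```cpp
#include <vector>

// Hypothetical helper: searches a grid of numbers for a target value.
bool contains(const std::vector<std::vector<int>>& grid, int target)
{
    bool found = false;
    for (const std::vector<int>& row : grid)
    {
        for (int value : row)
        {
            if (value == target)
            {
                found = true;
                goto done; // break would only leave the inner loop
            }
        }
    }

done:
    return found;
}
```

In a function this small, returning true directly from the inner loop would be simpler; the jump becomes more attractive when code after the label has to run on every path.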

 

Using C++ language features


Let's now use the features you have learned in this chapter to write an application. This example is a simple command-line calculator; you type an expression such as 6 * 7, and the application parses the input and performs the calculation.

Start Visual C++ and click the File menu, and then New, and finally, click on the File... option to get the New File dialog. In the left-hand pane, click on Visual C++, and in the middle pane, click on C++ File (.cpp), and then click on the Open button. Before you do anything else, save this file. Using a Visual C++ console (a command line, which has the Visual C++ environment), navigate to the Beginning_C++ folder and create a new folder called Chapter_02. Now, in Visual C++, on the File menu, click Save Source1.cpp As... and in the Save File As dialog locate the Chapter_02 folder you just created. In the File name box, type calc.cpp and click on the Save button.

The application will use std::cout and std::string; so at the top of the file, add the headers that define these and, so that you do not have to use fully qualified names, add a using statement:

    #include <iostream> 
    #include <string> 

    using namespace std;

You will pass the expression via the command-line, so add a main function that takes command line parameters at the bottom of the file:

    int main(int argc, char *argv[]) 
    { 
    }

The application handles expressions in the form arg1 op arg2 where op is an operator and arg1 and arg2 are the arguments. This means that, when the application is called, it must have four parameters; the first is the command used to start the application and the last three are the expression. The first code in the main function should ensure that the right number of parameters is provided, so at the top of this function add a condition, as follows:

    if (argc != 4) 
    { 
        usage(); 
        return 1; 
    }

If the command is called with more or fewer than four parameters, a function called usage is called, and then the main function returns, stopping the application.

Add the usage function before the main function, as follows:

    void usage() 
    { 
        cout << endl; 
        cout << "calc arg1 op arg2" << endl; 
        cout << "arg1 and arg2 are the arguments" << endl; 
        cout << "op is an operator, one of + - / or *" << endl; 
    }

This simply explains how to use the command and explains the parameters. At this point, you can compile the application. Since you are using the C++ Standard Library, you will need to compile with support for C++ exceptions, so type the following at the command-line:

C:\Beginning_C++\Chapter_02>cl /EHsc calc.cpp

If you typed in the code without any mistakes, the file should compile. If you get any errors from the compiler, check the source file to see if the code is exactly as given in the preceding code. You may get the following error:

'cl' is not recognized as an internal or external command,  
operable program or batch file.

This means that the console is not set up with the Visual C++ environment, so either close it down and start the console via the Windows Start menu, or run the vcvarsall.bat batch file. 

Once the code has compiled you may run it. Start by running it with the correct number of parameters (for example, calc 6 * 7), and then try it with an incorrect number of parameters (for example, calc 6 * 7 / 3). Note that the space between the parameters is important:

C:\Beginning_C++\Chapter_02>calc 6 * 7 

C:\Beginning_C++\Chapter_02>calc 6 * 7 / 3 

calc arg1 op arg2 
arg1 and arg2 are the arguments 
op is an operator, one of + - / or *

In the first case, the application does nothing, so all you see is a blank line. In the second example, the code has determined that the wrong number of parameters was passed (six rather than four), and so it prints the usage information to the console.

Next, you need to do some simple parsing of the parameters to check that the user has passed valid values. At the bottom of the main function, add the following:

    string opArg = argv[2]; 
    if (opArg.length() > 1) 
    { 
        cout << endl << "operator should be a single character" << endl; 
        usage(); 
        return 1; 
    }

The first line initializes a C++ std::string object with the third command-line parameter, which should be the operator in the expression. This simple example only allows a single character for the operator, so the subsequent lines check to make sure that the operator is a single character. The C++ std::string class has a member function called length that returns the number of characters in the string.

The argv[2] parameter will have a length of at least one character (a parameter with no length will not be treated as a command-line parameter!), so we only have to check whether the user typed an operator longer than one character.

Next you need to test to ensure that the parameter is one of the restricted set allowed and, if the user types another operator, print an error and stop the processing. At the bottom of the main function, add the following:

    char op = opArg.at(0); 
    if (op == 44 || op == 46 || op < 42 || op > 47) 
    { 
        cout << endl << "operator not recognized" << endl; 
        usage(); 
        return 1; 
    }

The tests are going to be made on a character, so you need to extract this character from the string object. This code uses the at function, which is passed the index of the character you need. (Chapter 5, Using the Standard Library Containers, will give more details about the members of the std::string class.) The next line checks to see if the character is not supported. The code relies on the following values for the characters that we support:

Character    Value
*            42
+            43
-            45
/            47

 

As you can see, if the character is less than 42 or greater than 47 it will be incorrect, but between 42 and 47 there are two characters that we also want to reject: , (44) and . (46). This is why we have the preceding conditional: "if the character is less than 42 or greater than 47, or it is 44 or 46, then reject it."

The char data type is an integer, which is why the test uses integer literals. You could have used character literals, so the following change is just as valid:

    if (op == ',' || op == '.' || op < '*' || op > '/')
    { 
        cout << endl << "operator not recognized" << endl; 
        usage(); 
        return 1; 
    }

You should use whichever you find the most readable. Since it makes less sense to check whether one character is greater than another, this book will use the former.
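
An alternative that does not depend on the numeric order of the characters is to name each accepted operator explicitly. This is not the approach used in this example, only a sketch of another option; the function name is_supported_operator is hypothetical:

```cpp
// Hypothetical helper: accepts only the four operators the calculator supports.
bool is_supported_operator(char op)
{
    switch (op)
    {
        case '+':
        case '-':
        case '*':
        case '/':
            return true;  // the four cases deliberately share one handler
        default:
            return false; // everything else, including ',' and '.'
    }
}
```

Here the empty cases fall through to the shared return statement, so adding another operator later only needs one more case line.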

At this point, you can compile the code and test it. First try with an operator that is more than one character (for example, **) and confirm that you get the message that the operator should be a single character. Secondly, test with a character that is not a recognized operator; try any character other than +, *, -, or /, but it is also worth trying . and ,.

Bear in mind that the command prompt has special actions for some symbols, such as "&" and "|", and it may report an error of its own while parsing the command line, before your code is even called.

The next thing to do is to convert the arguments into a form that the code can use. The command-line parameters are passed to the program in an array of strings; however, we are interpreting some of those parameters as floating-point numbers (in fact, double-precision floating-point numbers). The C runtime provides a function called atof, which is available through the C++ Standard Library. The C++ standard declares atof in <cstdlib>; in this case, with Visual C++, the declaration is also reached through the files that <iostream> includes.

Note

It is a bit counter-intuitive to get access to a conversion function such as atof through including a file associated with stream input and output, and code that relies on such indirect includes may not compile with another compiler. If this makes you uneasy, you can add a line after the include lines to include the <cstdlib> file. The C++ Standard Library headers have been written to ensure that a header file is only included once, so including <cstdlib> more than once has no ill effect. This was not done in the preceding code, because the declaration is already available through the files that the <iostream> and <string> headers include.

Add the following lines to the bottom of the main function. The first two lines convert the second and fourth parameters (remember, C++ arrays are zero-based indexed) to double values. The final line declares a variable to hold the result:

    double arg1 = atof(argv[1]); 
    double arg2 = atof(argv[3]); 
    double result = 0;
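
One limitation of atof is that it returns 0.0 when the text cannot be converted, so an argument such as six is silently treated as zero. The following sketch shows one way to detect this with std::strtod from <cstdlib>; it is not part of the example application, and the function name parse_double is hypothetical:

```cpp
#include <cstdlib>

// Hypothetical helper: converts text to a double and reports whether the
// whole string was a valid number. std::strtod sets end to point at the
// first character that it could not convert.
bool parse_double(const char* text, double& value)
{
    char* end = nullptr;
    value = std::strtod(text, &end);
    // Valid only if something was converted and nothing was left over.
    return end != text && *end == '\0';
}
```

With this helper, the application could print the usage information when either argument is not a number, instead of calculating with zero.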

Now we need to determine which operator was passed and perform the requested action. We will do this with a switch statement. We know that the op variable will be valid, and so we do not have to provide a default clause to catch the values we have not tested for. Add a switch statement to the bottom of the function:

    double arg1 = atof(argv[1]); 
    double arg2 = atof(argv[3]); 
    double result = 0; 

    switch(op) 
    { 
    }

The first three cases, +, -, and *, are straightforward:

    switch (op) 
    { 
        case '+':
            result = arg1 + arg2;
            break;
        case '-':
            result = arg1 - arg2;
            break;
        case '*':
            result = arg1 * arg2;
            break;
    }

Again, since char is an integer type, you can use it in a switch statement, and C++ allows you to write the case values as character literals. In this case, using characters rather than numbers makes the code much more readable.

After the switch, add the final code to print out the result:

    cout << endl; 
    cout << arg1 << " " << op << " " << arg2; 
    cout << " = " << result << endl;

You can now compile the code and test it with calculations that involve +, -, and *.

Division is a problem, because it is invalid to divide by zero. To test this out, add the following lines to the bottom of the switch:

    case '/':
        result = arg1 / arg2;
        break;

Compile and run the code, passing zero as the final parameter:

C:\Beginning_C++\Chapter_02>calc 1 / 0
1 / 0 = inf

The code ran successfully, and printed out the expression, but it says that the result is an odd value of inf. What is happening here?

On platforms that use IEEE 754 floating-point arithmetic, as Visual C++ does, dividing a non-zero number by zero does not stop the program; it sets result to infinity. This special value is available as the INFINITY macro defined in <math.h> (included via <cmath>), and the double overload of the insertion operator for the cout object prints it as the string inf. (Dividing zero by zero gives a different special value, NAN, which means "not a number.") In our application, we can test for a zero divisor, and we treat the user action of passing a zero as being an error. Thus, change the code so that it reads as follows:

    case '/': 
        if (arg2 == 0) {
            cout << endl << "divide by zero!" << endl;
            return 1;
        } else {
            result = arg1 / arg2; 
        }
        break;

Now when the user passes zero as a divisor, you will get a divide by zero! message.
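
If you want to see what a floating-point division by zero actually produces, the special values can be told apart with std::isinf and std::isnan from <cmath>. The following is a minimal sketch, assuming IEEE 754 arithmetic; the function name classify_quotient is hypothetical:

```cpp
#include <cmath>
#include <string>

// Hypothetical helper: names the kind of value that a division produced.
// Assumes IEEE 754 arithmetic, where dividing by zero does not stop the program.
std::string classify_quotient(double numerator, double denominator)
{
    double result = numerator / denominator;
    if (std::isnan(result)) return "nan";  // for example, zero divided by zero
    if (std::isinf(result)) return result > 0 ? "inf" : "-inf";
    return "finite";
}
```

Under that assumption, a non-zero numerator with a zero denominator is classified as inf or -inf, and only zero divided by zero is classified as nan.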

You can now compile the full example and test it out. The application supports floating-point arithmetic using the +, -, *, and / operators, and will handle the case of dividing by zero.
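
The arithmetic can also be collected into a single function, which makes it possible to test without the command line. This is only a sketch of one possible refactoring, not the structure used in the example; the function name calculate and its parameters are hypothetical:

```cpp
// Hypothetical refactoring of the calculator's core: performs one operation
// and reports failure (unknown operator or zero divisor) through the return value.
bool calculate(double arg1, char op, double arg2, double& result)
{
    switch (op)
    {
        case '+': result = arg1 + arg2; return true;
        case '-': result = arg1 - arg2; return true;
        case '*': result = arg1 * arg2; return true;
        case '/':
            if (arg2 == 0) return false; // divide by zero is treated as an error
            result = arg1 / arg2;
            return true;
        default:
            return false; // operator not recognized
    }
}
```

The main function would then only need to parse the arguments, call calculate, and print either the result or an error message.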

 

Summary


In this chapter, you have learned how to format your code, and how to identify expressions and statements. You have learned how to identify the scope of variables, and how to group collections of functions and variables into namespaces so that you can prevent name clashes. You have also learned the basic plumbing in C++ of looping and branching code, and how the built-in operators work. Finally, you put all of this together in a simple application that allows you to perform simple calculations at the command line.

In the following chapter, you will learn about working with memory, arrays, and pointers.

About the Authors

  • Richard Grimes

    Richard Grimes has been programming in C++ for 25 years, working on projects as diverse as scientific control and analysis and finance analysis to remote objects for the automotive manufacturing industry. He has spoken at 70 international conferences on Microsoft technologies (including C++ and C#) and has written 8 books, 150 articles for programming journals, and 5 training courses for Microsoft. Richard was awarded Microsoft MVP for 10 years (1998-2007). He has a reputation for his deep understanding of the .NET framework and C++ and the frank way in which he assesses new technology.

  • Marius Bancila

    Marius Bancila is a software engineer with 15 years of experience in developing solutions for the industrial and financial sectors. He is the author of Modern C++ Programming Cookbook. He focuses on Microsoft technologies and mainly develops desktop applications with C++ and C#.

    Marius is passionate about sharing his technical expertise with others, and for that reason, he used to be recognized as a Microsoft MVP for more than a decade. Marius can be found on Twitter at @mariusbancila.
