You're reading from Expert C++ - Second Edition

Product type Book

Published in Aug 2023

Publisher Packt

ISBN-13 9781804617830

Pages 604 pages

Edition 2nd Edition

Languages

Concepts

Application Development

Authors (5):

Marcelo Guerra Hahn

Araks Tigranyan

John Asatryan

Vardan Grigoryan

Shunguang Wu

View More author details

Table of Contents (24) Chapters

Preface

1. Part 1:Under the Hood of C++ Programming

2. Chapter 1: Building C++ Applications

3. Chapter 2: Beyond Object-Oriented Programming

4. Chapter 3: Understanding and Designing Templates

5. Chapter 4: Template Meta Programming

6. Chapter 5: Memory Management and Smart Pointers

7. Part 2: Designing Robust and Efficient Applications

8. Chapter 6: Digging into Data Structures and Algorithms in STL

9. Chapter 7: Advanced Data Structures

10. Chapter 8: Functional Programming

11. Chapter 9: Concurrency and Multithreading

12. Chapter 10: Designing Concurrent Data Structures

13. Chapter 11: Designing World-Ready Applications

14. Chapter 12: Incorporating Design Patterns in C++ Applications

15. Chapter 13: Networking and Security

16. Chapter 14: Debugging and Testing

17. Chapter 15: Large-Scale Application Design

18. Part 3:C++ in the AI World

19. Chapter 16: Understanding and Using C++ in Machine Learning Tasks

20. Chapter 17: Using C++ in Data Science

21. Chapter 18: Designing and Implementing a Data Analysis Framework

22. Index

Why subscribe?

23. Other Books You May Enjoy

Advanced Data Structures

In the previous chapter, we discussed the importance of knowing what data structures and algorithms are and how to use them in everyday problems. In this chapter, we are going to dive even deeper into what data structures there are, some of which you may have never heard about before.

Knowing about basic data structures is one thing but knowing and understanding how some of the advanced data structures work is a goal every programmer should strive to achieve. But what are advanced data structures and how are they considered to be advanced? We talked briefly about trees and graphs in the previous chapter. Even looking back at their names brings thoughts about those data structures being of an advanced type. They sound so serious; they even look like something solid. And to answer the question you may now have: yes, they are considered to be advanced data structures. Should we just say that, for example, trees are advanced data structures and stop at that...

Technical requirements

The g++ compiler with the -std=c++20 option is used to compile the examples throughout the chapter. You can find the source files used in this chapter in the GitHub repository for this book at https://github.com/PacktPublishing/Expert-C-2nd-edition.

B-trees

A B-tree is a self-balancing tree data structure through which you can organize search, insertion, deletion, and sequential access in logarithmic time. For some operations, the time is not always logarithmic. In the previous chapter, we learned about the time complexity of the std::vector container’s push_back() function. When calculating it, we mentioned that it was amortized, O(1). The same happens for B-trees. Performing deletion and insertion on a B-tree takes amortized O(log n) time. The B-tree is a generalization of the binary search tree that allows nodes to have multiple children. The number of children and keys that a node of a B-tree can hold depends on what order it is in. According to Knuth’s definition, a B-tree of order m is a tree that satisfies the following properties:

Every node has at most m children
Every internal node has at least {m/2} children
Every non-leaf node has at least two children
All leaves appear on the same...

Searching

Let us start with the find function. The idea behind the strategy of finding an element lies in looking at a root node and then, based on the results, going to the left subtree or the right subtree. As you can see in the preceding code, we also have Boolean functions, which will help us implement the find function based on the type of node we are dealing with. A node of a 2-3 tree can be a leaf, a 2-node, or a 3-node and as those nodes have a different number of keys, the implementations of a search function should slightly differ for each type of node.

Before the implementation of the find function, let us first look at a simple illustrative example of looking for an element with a value of 554 in a 2-3 tree:

Figure 7.5 – A 2-3 tree example

First, we look at the root node. We see that its value is 17 and it is not equal to 554. So, we continue looking for the number:

Figure 7.6 – First step to find...

Implementation details of std::unordered_map

In the previous chapter, we discussed std::unordered_map very briefly, saying only that it is the representation of the hash table data structure in C++. At first, along with other hash containers, std::unordered_map wasn’t in the original STL. It was introduced to C++ users only with the TR1 library extension.

std::unordered_map is an associative container that is part of the STL library. It holds key-value pairs with distinctive keys. The time complexity of search, insertion, and removal of items is amortized O(1). This means that all the operations listed are performed in a constant time almost always. In an unordered map, the key value often serves as a means of uniquely identifying each element, and the mapped value is an object connected to that key. Key-value and mapped value types may vary.

Internally, the components are arranged into buckets rather than being sorted in any specific sequence. The hash of an element&...

How std::unordered_map organizes element storing and how elements are inserted into or searched in std::unordered_map

std::unordered_map is organized into buckets. Imagine having an array where every cell is a bucket that contains elements. A question might arise from these words: “that contains elements.” Are we talking about giving an array as the second parameter in the following code?

#include <unordered_map>#include <vector>
#include <string>
int main() {
  std::unordered_map<std::string, std::vector<int>> table;
  table["Word"] = { 45,6,2,6 };
}

In this case, we see clearly that there is more than one value that has to be stored. Although this example might seem reasonable and a logical motivation to have buckets, it is not the case. The bucket an element is placed into depends entirely on the hash of its key. Two different keys with different values could generate the same hash (bucket). The bucket...

Hash functions and strategies that are used to implement them

The hash function is used to convert keys into indexes of an array. The hash function should, in theory, map every potential key to a distinct slot index, but in reality, this is challenging to do.

A hash function that takes a set of items as input and maps each one to a distinct slot is called a perfect hash function. It is feasible to create a perfect hash function if we know the items and the collection won’t change. Unfortunately, there is no methodical way to build a perfect hash function given a random set of elements. Thankfully, we can still gain performance efficiency, even if the hash algorithm is not perfect.

The size of the hash table can be increased in order to include all possible values for the element range, which is one technique to ensure that the hash function is always perfect. This ensures that every component will have a unique slot. Although this is doable for small numbers of items,...

Heaps and their applications

The term heap is definitely familiar to you but, most probably, the heap we are going to talk about has nothing to do with the heap you know. When studying computer systems, it is unavoidable not to touch on topics connected with memory, especially RAM. And when talking about RAM and virtual memory, we can’t skip the part where we separate the memory into stack and heap. Is this heap, which is used for dynamic memory allocation, connected to the heap we are going to discuss? The answer can be guessed from the first sentence of this subchapter and it is no.

A heap is an abstract tree-based data structure. What does that mean? Well, it is structured as a tree and has some properties of trees, specifically a binary tree. Let us not dive deep into what types of trees are there, and just talk about what kind of binary trees do exist and to what category our heap belongs. There are full, complete, and other types of binary trees, but we are going to...

Advanced lists

We have already talked about lists in the previous chapter as node-based data structures. Knowing how lists work, it may be interesting for us to dive deeper into the data structure to understand what other variations it has. In this part of the chapter, we are going to find out whether lists can be any better or not and what other variations of lists there are.

Among the list variations are skip lists, XOR lists, unrolled lists, self-organizing lists, and so on. We are not going to talk about every advanced list type there is, but only one or two to make the idea of an advanced data structure, specifically an advanced list data structure, clearer.

Skip lists

The name “skip list” hints that the data structure should be connected with skipping something. The only thing it can skip is a node, as it is composed of nodes. This means that in some operations that are performed on lists, we can skip the accepted steps. Let us take for example the search...

Summary

In this chapter, we discussed advanced data structures such as B-trees, heaps, and advanced lists. We also talked about one of the best containers STL has to hand and also about the strategies that are used to implement the container.

First, we went for the B-tree by understanding the importance of using B-trees and also where they are mostly used. We dived deeper into one of many specializations of B-trees – a 2-3 tree. After giving the structural implementation of the 2-3 tree, we also implemented the search function for the tree and discussed insertion and deletion in the most detailed way possible.

The implementation strategies of std::unordered_map were also put into light within this chapter. The most important operations that make up std::unordered_map are hashing, collision handling, and storing strategies.

We also discussed the heap data structure by talking about its structure and the operations that it performs. The heap data structure has many applications...

Questions

List the properties that a B-tree of order m should satisfy.
What strategies can be used to handle collisions in implementing unordered_map by yourself?
What types of heaps are there and how do they differ?
Implement an insert function for the min heap.
Why is a heap used to implement a priority queue?
On what logic do the skip list and XOR list rely?