You're reading from Advanced C++

Product type: Book

Published in: Oct 2019

Reading level: Intermediate

ISBN-13: 9781838821135

Edition: 1st Edition

Languages: C++

Concepts: Programming Language

Authors (5):

Gazihan Alankus

Olena Lizina

Rakesh Mane

Vivek Nagarajan

Brian Price

Chapter 8 - Need for Speed – Performance and Optimization

Activity 1: Optimizing a Spell Check Algorithm

In this activity, we'll develop a simple spell check demonstration and make it faster in incremental steps. You can use the skeleton file, Speller.cpp, as a starting point. Perform the following steps to implement this activity:

For the first implementation of the spell check (the full code can be found in Speller1.cpp) – create a dictionary set in the getMisspelt() function:

set<string> setDict(vecDict.begin(), vecDict.end());

Loop over the text words and check for words not in the dictionary with the set::count() method. Add the misspelled words to the result vector:

vector<int> ret;
for(size_t i = 0; i < vecText.size(); ++i)
{
    const string &s = vecText[i];
    // count() returns 0 when the word is not in the dictionary
    if(!setDict.count(s))
    {
        ret.push_back(static_cast<int>(i));
    }
}
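For reference, the two fragments above can be combined into one self-contained function. This is only a minimal sketch: the signature of getMisspelt() shown here (two vectors in, a vector of indices out) is an assumption, and the skeleton in Speller.cpp may declare it differently.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the indices of the words in vecText that are not in vecDict.
// The parameter list is an assumption made for this sketch.
std::vector<int> getMisspelt(const std::vector<std::string> &vecDict,
                             const std::vector<std::string> &vecText)
{
    // An ordered set gives O(log N) lookups.
    const std::set<std::string> setDict(vecDict.begin(), vecDict.end());

    std::vector<int> ret;
    for (std::size_t i = 0; i < vecText.size(); ++i)
    {
        // count() is 0 when the word is missing from the dictionary.
        if (setDict.count(vecText[i]) == 0)
        {
            ret.push_back(static_cast<int>(i));
        }
    }
    return ret;
}
```

Called with the dictionary {"the", "quick", "brown", "fox"} and the text {"the", "quikc", "brown", "fxo", "fox"}, this sketch reports indices 1 and 3.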

Open the terminal. Compile the program and run it as follows:

$ g++ -O3 Speller1.cpp Timer.cpp

$ ./a.out

The following output will be generated:

Figure 8.60: Example output of the solution for Step 1

Open the Speller2.cpp file and add the unordered_set header file to the program:

#include <unordered_set>

Next, change the set type that's used for the dictionary to unordered_set:

unordered_set<string> setDict(vecDict.begin(), vecDict.end());

Open the terminal. Compile the program and run it as follows:

$ g++ -O3 Speller2.cpp Timer.cpp

$ ./a.out

The following output will be generated:

Figure 8.61: Example output of the solution for Step 2

For the third and final version, that is, Speller3.cpp, we will use a bloom filter. Start by defining a hash function based on the BKDR function. Add the following code to implement this:

const size_t SIZE = 16777215;

template<size_t SEED> size_t hasher(const string &s)
{
    size_t h = 0;
    size_t len = s.size();
    for(size_t i = 0; i < len; i++)
    {
        h = h * SEED + s[i];
    }
    return h & SIZE;
}

Here, we used an integer template parameter so that we can create any number of different hash functions with the same code. Notice the use of the 16777215 constant, which is equal to 2^24 - 1. Because all of its low 24 bits are set, we can use the fast bitwise-and operator instead of the modulus operator to keep the hashed integer in the range 0 to SIZE (inclusive). If you want to change the size, keep it as one less than a power of two.
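The equivalence between the mask and the modulus can be checked with a small sketch like the one below. The hasherMod() function is a hypothetical reference version added only for comparison, and the loop reads each character as unsigned char, which is an assumption that differs from the listing above for non-ASCII input.

```cpp
#include <cstddef>
#include <string>

// 2^24 - 1: the low 24 bits are all set, so the value works as a bit mask.
constexpr std::size_t SIZE = 16777215;

// BKDR-style hash; SEED selects one member of the family.
template <std::size_t SEED>
std::size_t hasher(const std::string &s)
{
    std::size_t h = 0;
    for (unsigned char c : s)
    {
        h = h * SEED + c;  // unsigned arithmetic wraps around on overflow
    }
    return h & SIZE;       // result lies in 0..SIZE inclusive
}

// Hypothetical reference version that uses the modulus operator instead.
template <std::size_t SEED>
std::size_t hasherMod(const std::string &s)
{
    std::size_t h = 0;
    for (unsigned char c : s)
    {
        h = h * SEED + c;
    }
    return h % (SIZE + 1); // SIZE + 1 is 2^24, so this equals h & SIZE
}
```

Both functions return the same value for any input. Note that the largest possible result is SIZE itself, so a table indexed by this hash needs SIZE + 1 slots.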

Next, let's declare a vector<bool> for a bloom filter in getMisspelt() and populate it with the words in the dictionary. Use three hash functions. The BKDR hash can be seeded with values such as 131, 3131, 31313, and so on. Add the following code to implement this:

// h & SIZE can be any value from 0 to SIZE, so the filter needs SIZE + 1 bits
vector<bool> m_Bloom;
m_Bloom.resize(SIZE + 1);
for(auto i = vecDict.begin(); i != vecDict.end(); ++i)
{
    m_Bloom[hasher<131>(*i)] = true;
    m_Bloom[hasher<3131>(*i)] = true;
    m_Bloom[hasher<31313>(*i)] = true;
}

Write the following code to create a loop that checks the words:

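for(int i = 0; i < vecText.size(); ++i)
{
    const string &s = vecText[i];
    // If any of the three bits is unset, the word is definitely not in the dictionary
    bool hasNoBloom =
        !m_Bloom[hasher<131>(s)]
        || !m_Bloom[hasher<3131>(s)]
        || !m_Bloom[hasher<31313>(s)];
    if(hasNoBloom)
    {
        ret.push_back(i);
    }
    // All three bits are set: this may be a false positive, so verify with the set
    else if(!setDict.count(s))
    {
        ret.push_back(i);
    }
}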

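The Bloom filter is checked first. If it reports that a word is absent, the word is definitely not in the dictionary. If it reports that a word is present, the result may be a false positive, so we have to verify it against the set, like we did previously.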
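The same idea can be wrapped in a small class. The following is a sketch only: the BloomFilter class and its add() and mightContain() members are hypothetical names that do not appear in Speller3.cpp, and the hash reads characters as unsigned char, which is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper class; not part of the book's reference code.
class BloomFilter
{
public:
    // MASK + 1 bits, because a masked hash can be any value in 0..MASK.
    BloomFilter() : m_bits(MASK + 1, false) {}

    // Sets the three bits that correspond to the word.
    void add(const std::string &s)
    {
        m_bits[hashWith<131>(s)] = true;
        m_bits[hashWith<3131>(s)] = true;
        m_bits[hashWith<31313>(s)] = true;
    }

    // false: the word was definitely never added.
    // true:  the word was probably added (false positives are possible).
    bool mightContain(const std::string &s) const
    {
        return m_bits[hashWith<131>(s)]
            && m_bits[hashWith<3131>(s)]
            && m_bits[hashWith<31313>(s)];
    }

private:
    static constexpr std::size_t MASK = 16777215; // 2^24 - 1

    // BKDR-style hash, masked to the size of the bit table.
    template <std::size_t SEED>
    static std::size_t hashWith(const std::string &s)
    {
        std::size_t h = 0;
        for (unsigned char c : s)
        {
            h = h * SEED + c;
        }
        return h & MASK;
    }

    std::vector<bool> m_bits;
};
```

A caller would treat a false result from mightContain() as a confirmed misspelling and fall back to the exact set lookup only when it returns true.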

Open the terminal. Compile the program and run it as follows:

$ g++ -O3 Speller3.cpp Timer.cpp

$ ./a.out

The following output will be generated:

Figure 8.62: Example output of the solution for Step 3

In the preceding activity, we attempted to solve a real-world problem and make it more efficient. Let's consider some points for each of the implementations in the three steps, as follows:

For the first version, the most obvious solution with a std::set is used. However, the performance is likely to be low because std::set is typically implemented as a balanced binary search tree, which has O(log N) complexity for finding an element.

For the second version, we can gain a large performance improvement simply by switching to std::unordered_set, which uses a hash table as the underlying data structure. If the hash function is good, a lookup takes O(1) time on average.

The third version, based on the Bloom filter data structure, requires some consideration. The primary performance benefit of a Bloom filter is that it is a compact data structure that does not store the actual elements, which gives it very good cache performance.
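To check the claims about std::set and std::unordered_set on your own machine, a small timing harness can help. The sketch below uses std::chrono rather than the Timer.cpp helper from the book, and the names countMissing() and elapsedMicros() are hypothetical. Note that the measured time includes building the container, not only the lookups.

```cpp
#include <chrono>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Counts the words of text that are absent from a dictionary of type Dict.
// Dict can be std::set<std::string> or std::unordered_set<std::string>.
template <typename Dict>
std::size_t countMissing(const std::vector<std::string> &words,
                         const std::vector<std::string> &text)
{
    const Dict dict(words.begin(), words.end());
    std::size_t missing = 0;
    for (const std::string &w : text)
    {
        if (dict.count(w) == 0)
        {
            ++missing;
        }
    }
    return missing;
}

// Runs fn once and returns the elapsed wall-clock time in microseconds.
template <typename Fn>
long long elapsedMicros(Fn fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Passing a lambda that calls countMissing<std::set<std::string>>(...) and then one that calls countMissing<std::unordered_set<std::string>>(...) gives two timings to compare. A single run is noisy, so repeating each measurement several times is advisable.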

From an implementation perspective, the following guidelines apply:

vector<bool> can be used as the backing store as it is an efficient way to store and retrieve bits.

The false positive percentage of the bloom filter should be minimal – anything more than 5% will not be efficient.

There are many string hashing algorithms; the BKDR hash algorithm is used in the reference implementation. A comprehensive list of string hash algorithms with implementations can be found here: http://www.partow.net/programming/hashfunctions/index.html.
The number of hash functions and the size for the bloom filter that's used are very critical to get the performance benefits.
The nature of the dataset should be taken into account when deciding what parameters the bloom filter should use – consider that, in this example, there are very few words that are misspelled, and the majority of them are in the dictionary.
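To put a number on the false positive guideline above, the standard approximation p = (1 - e^(-k*n/m))^k can be evaluated for a filter with m bits, k hash functions, and n stored words. The sketch below is an estimate only: the formula assumes independent, uniformly distributed hash functions, which the three BKDR variants may not be, and falsePositiveRate() is a hypothetical helper name.

```cpp
#include <cmath>
#include <cstddef>

// Approximate false positive probability of a Bloom filter with
// m bits, k hash functions and n inserted elements:
//     p = (1 - e^(-k * n / m))^k
// Assumes independent, uniformly distributed hash functions.
double falsePositiveRate(std::size_t m, std::size_t k, std::size_t n)
{
    const double exponent = -static_cast<double>(k) * static_cast<double>(n)
                            / static_cast<double>(m);
    return std::pow(1.0 - std::exp(exponent), static_cast<double>(k));
}
```

For the parameters in this activity (m = 2^24 bits, k = 3, n = 31,870 words), the estimate is far below the 5% guideline, on the order of one in several million. Shrinking the filter to 2^16 bits with the same k and n pushes the estimate to roughly 45%.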

There are some questions worth probing, given the results we received:

Why is the improvement in performance so meager with the Bloom filter?
What is the effect of using a larger or smaller capacity Bloom filter?
What happens when fewer or more hash functions are used?
Under what conditions would this version be much faster than the one in Speller2.cpp?

Here are the answers to these questions:

Why is the improvement in performance so meager with the Bloom filter?
std::unordered_set performs one hash operation and perhaps a couple of memory accesses before reaching the value that's stored. The Bloom filter we use performs three hash operations and three memory accesses, and every word that passes the filter still requires the hash table lookup. So, in essence, the Bloom filter does more work than the hash table alone. Since there are only 31,870 words in our dictionary, the cache benefits of the Bloom filter are lost. This is another case where the traditional analysis of data structures does not correspond to real-life results because of caching.
What is the effect of using a larger or smaller capacity Bloom filter?
When a larger capacity is used, the number of hash collisions is reduced, along with the false positives, but the caching behavior worsens. Conversely, when a smaller capacity is used, the hash collisions and the false positives increase, but the caching behavior improves.
What happens when fewer or more hash functions are used?
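Up to a point, using more hash functions reduces the false positives, because more bits have to match. However, each extra hash function also sets more bits per dictionary word and adds a hash computation and a memory access to every lookup. Beyond the optimum, which is approximately (m/n) ln 2 hash functions for a filter of m bits holding n words, the filter fills up and the false positives increase again.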
Under what conditions would this version be much faster than the one in Speller2.cpp?
Bloom filters work best when the cost of testing a few bits is less than the cost of accessing the value in the hash table. This only becomes true when the Bloom filter bits fit completely within the cache and the dictionary does not.
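When experimenting with these trade-offs, the usual sizing formulas give a starting point: for n words and a target false positive rate p, the minimum size is m = -n ln p / (ln 2)^2 bits and the matching number of hash functions is k = (m/n) ln 2. The sketch below applies them and rounds the size up to a power of two so that the bitwise-and trick still works. The BloomParams struct and chooseParams() function are hypothetical names, and the formulas assume ideal hash functions.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical result type: table size in bits and number of hash functions.
struct BloomParams
{
    std::size_t bits;
    std::size_t hashes;
};

// Suggests Bloom filter parameters for n elements and a target
// false positive rate p (0 < p < 1), assuming ideal hash functions.
BloomParams chooseParams(std::size_t n, double p)
{
    const double ln2 = std::log(2.0);
    // Minimum number of bits needed per stored element.
    const double bitsPerItem = -std::log(p) / (ln2 * ln2);
    const std::size_t minBits =
        static_cast<std::size_t>(std::ceil(bitsPerItem * static_cast<double>(n)));

    // Round up to a power of two so that (bits - 1) can be used as a mask.
    std::size_t bits = 1;
    while (bits < minBits)
    {
        bits <<= 1;
    }

    // Hash count that is optimal for the minimum size, at least 1.
    const std::size_t hashes =
        static_cast<std::size_t>(std::lround(bitsPerItem * ln2));
    return {bits, hashes < 1 ? 1 : hashes};
}
```

For the 31,870-word dictionary and a 1% target, this suggests 2^19 bits (64 KiB) and 7 hash functions. Because the size is rounded up, the actual rate should be lower than the target. Whether a filter of that size beats the hash table still has to be measured, since it depends on what fits in the cache.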

Authors (5)

Gazihan Alankus

Gazihan Alankus holds a PhD in computer science from Washington University in St. Louis. Currently, he is an assistant professor at the Izmir University of Economics in Turkey. He teaches and conducts research on game development, mobile application development, and human-computer interaction. He is a Google developer expert in Dart and develops Flutter applications with his students in his company Gbot, which he founded in 2019.

Olena Lizina

Olena Lizina is a software developer with 5 years of experience in C++. She has practical knowledge of developing systems for monitoring and managing remote computers with a lot of users for an international product company. For the last 4 years, she has been working for international outsourcing companies on automotive projects for well-known automotive concerns. She has been participating in the development of complex and high-performance applications on different projects, such as HMI (Human Machine Interface), navigation, and applications for work with sensors.

Rakesh Mane

Rakesh Mane has over 18 years of experience in the software industry. He has worked with proficient programmers from a variety of regions such as India, the US, and Singapore. He has mostly worked in C++, Python, shell scripting, and database. In his spare time, he likes to listen to music and travel. Also, he likes to play with, experiment with, and break things using software tools and code.

Vivek Nagarajan

Vivek Nagarajan is a self-taught programmer who started out in the 1980s on 8-bit systems. He has worked on a large number of software projects and has 14 years of professional experience with C++. Aside from this, he has worked on a wide variety of languages and frameworks across the years. He is an amateur powerlifter, DIY enthusiast, and motorcycle racer. He currently works as an independent software consultant.

Brian Price

Brian Price has over 30 years' experience working in a variety of languages, projects, and industries, including over 20 years' experience in C++. He has worked on power station simulators, SCADA systems, and medical devices. He is currently crafting software in C++, CMake, and Python for a next-generation medical device. He enjoys solving puzzles and the Euler project in a variety of languages.
