You're reading from C++ Game Animation Programming - Second Edition

Product type: Book

Published in: Dec 2023

Reading Level: N/a

Publisher: Packt

ISBN-13: 9781803246529

Edition: 2nd Edition

Languages

C++

Tools

OpenGL

Concepts

Animation

Authors (2):

Michael Dunsky

Gabor Szauer


15

Measuring Performance and Optimizing the Code

Welcome to Chapter 15! In the previous chapter, we extended the glTF application to render a large crowd of model instances at the same time on the screen.

In this chapter, we will search for performance problems by measuring the time the application needs for some function calls, such as the calculation of the joint matrices for the vertex skinning or the upload of the matrix data into the buffers of the graphics card. This measurement allows us to find so-called hotspots, which are the parts of the code where the program spends most of its execution time, often because they are called many times.

First, we discuss some basic dos and don’ts of code optimization. Then, we explore a couple of different methods to make the code – at least theoretically – faster. There is no guarantee that an optimization will have a positive effect on the speed of a program, as using the wrong data type or algorithm can even slow down the code. Therefore, we need...

Technical requirements

For this chapter, you will need the OpenGL and Vulkan renderer code from Chapter 14.

Before we go into details about code optimization, let us discuss some “rules of thumb” regarding optimization in the software development process.

Measure twice, cut once!

The saying, “Measure twice, cut once,” is popular among carpenters. Cutting a wooden plank is irreversible, and if the resulting plank is too short due to inaccurate measurements, the carpenter must start over with a new plank.

Thanks to Source Code Management (SCM) software such as Git, code changes are not irreversible in the way that cutting wood is. But you will waste precious time if you start optimizing without a plan.

Always measure before you take action

If you find a performance problem in your application, you may feel the urge to optimize it somehow. However, making code changes by following gut feelings is a bad idea, as you will most likely not end up optimizing the actual code responsible for the slow performance, instead just making assumptions about which part of the code may be slow.

So, before you dive into the code and try your best to make it faster, you should at least start measuring the times taken by different...
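As a minimal sketch of such a measurement, a monotonic clock from the C++ standard library can be wrapped around the call of interest. The helper names `measureMilliseconds` and `averageMilliseconds` are hypothetical and chosen for illustration only; they are not taken from the book's code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (an assumption, not the book's timer): returns the
// wall-clock time in milliseconds that a single call of 'work' takes.
double measureMilliseconds(const std::function<void()>& work) {
  assert(work && "work must be a valid callable");
  // steady_clock is monotonic, so system clock adjustments cannot skew the result
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}

// Averages several runs to smooth out scheduling and cache noise.
double averageMilliseconds(const std::function<void()>& work, int runs) {
  assert(runs > 0);
  double total = 0.0;
  for (int i = 0; i < runs; ++i) {
    total += measureMilliseconds(work);
  }
  return total / runs;
}
```

A call such as `averageMilliseconds([&]() { updateJointMatrices(); }, 100)` (with `updateJointMatrices` standing in for whatever function is under suspicion) gives a first number to compare against after every change. Averaging matters because a single run can easily be distorted by other processes.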

Moving computations to different places

Even with the current multi-core processors and several GHz of core frequencies, CPU power is still a scarce and precious resource. Every CPU cycle you waste by doing unnecessary calculations, using the wrong algorithms, or repeating operations is lost for the remaining parts of the program. Therefore, it is important to identify how to save CPU cycles while still doing the intended computations.

Recalculate only when necessary

There are essentially two opposite paths available to optimize code. You can try to optimize the code in a way that computes the results on every call with low overhead – or you can be lazy, cache the results, and recalculate new results only when some of the parameters have changed.

Both paths have their pros and cons. While continuously computed results will produce smooth and uniform calculation times in the functions, you do a lot of unnecessary operations if the input values never change. With the...
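The lazy path can be sketched with a so-called dirty flag: the result is cached, and a flag records whether an input has changed since the last computation. The class `CachedTransform` and its members are hypothetical names for illustration, and a plain array stands in for a real matrix type such as the one from GLM.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<float, 16>; // column-major 4x4 matrix (stand-in type)

// Hypothetical example class: rebuilds its matrix only after an input changed.
class CachedTransform {
public:
  void setTranslation(float x, float y, float z) {
    if (x == mX && y == mY && z == mZ) {
      return; // same input as before: the cached matrix stays valid
    }
    mX = x; mY = y; mZ = z;
    mDirty = true;
  }

  // Returns the cached matrix, recalculating it first if it is out of date.
  const Mat4& matrix() {
    if (mDirty) {
      mMatrix = {1.0f, 0.0f, 0.0f, 0.0f,
                 0.0f, 1.0f, 0.0f, 0.0f,
                 0.0f, 0.0f, 1.0f, 0.0f,
                 mX,   mY,   mZ,   1.0f};
      mDirty = false;
      ++mRecalculations;
    }
    assert(!mDirty); // the cache is valid from here on
    return mMatrix;
  }

  // Counts the real computations, to make the saved work visible.
  int recalculations() const { return mRecalculations; }

private:
  float mX = 0.0f, mY = 0.0f, mZ = 0.0f;
  Mat4 mMatrix{};
  bool mDirty = true; // forces the first computation
  int mRecalculations = 0;
};
```

Calling `matrix()` every frame costs only a branch as long as the translation stays the same. The price is the uneven timing mentioned above: a frame in which many inputs change pays for all recalculations at once.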

Profiling the code to find hotspots

For code performance profiling, the executable is instrumented by a profiling tool, which counts every function call and records its execution time. Depending on the OS and the compiler, different settings are required to enable proper application profiling.

We will now begin with a practical profiling session and search for hotspots in the code used in Chapter 14: 03_opengl_instanced_drawing. The optimized code can be found in the folder for chapter15 in the 01_opengl_optimize subfolder.

Profiling code using Visual Studio

Visual Studio comes with an internal performance profiler. The profiler can be started in the Debug menu of Visual Studio, as shown in Figure 15.1:

Figure 15.1: Starting the profiler from Visual Studio 2022

In the new Visual Studio tab, select Executable as the desired Analysis Target. Navigate to the chapter14\03_opengl_instanced_drawing folder and select the executable file in the following...

Using RenderDoc to analyze a GPU frame

RenderDoc is a free tool to capture and analyze the frames our application draws. The program supports OpenGL and Vulkan, as well as Direct3D on Windows and OpenGL ES on mobile devices.

In Figure 15.9, a single frame of the 04_opengl_tbo example of Chapter 14 has been captured:

Figure 15.9: RenderDoc analyzing an OpenGL version of the model viewer

In Figure 15.9, at the top of the RenderDoc window labeled with number 1, the overall timing of the frame is shown. On the left side, at number 2, the recorded OpenGL calls are presented. Selecting one block or command advances the frame and the timing bar in the window with the number 1 to the frame state at that specific time.

The colored bars at number 3 are the joint matrices that were uploaded to the texture buffer. We use a texture buffer to upload the matrix data to the GPU, and the uploaded data is visible as a one-dimensional texture in RenderDoc. On the lower...

Scale it up and do A/B tests

At this point, the optimization journey has just begun. After the first rounds of digging into possible performance issues for the processor and the graphics card, you need to re-evaluate the status of the application. Two pieces of advice will help you to squeeze even more frames per second out of your application.

Scale up to get better results

If we profile the first version of the glTF model viewer application from Chapter 8, which loads and renders only a single model, the results may lead to the wrong conclusions. The differences between the calls are too small to allow us to discern the cause of any slowdowns, and many generic calls to STL or GLM functions are shown, as you can see in Figure 15.16:

Figure 15.16: Profiling the code from Chapter 8, example 01_opengl_gltf_load

If you start optimizing on the basis of these results, you will waste your time working on completely the wrong parts...

Summary

In this chapter, we explored performance measurements and optimization of the code we created throughout all the chapters of this book.

First, we looked at the basic dos and don’ts of optimization. You should do any optimizations as late as possible and avoid premature optimization at all costs, as it will slow down the development and eventually delay your product. Also, we talked about some basic ideas on how to make code run faster.

Next, we checked our code examples from Chapter 14 for hotspots and bottlenecks on both the CPU and GPU sides. By using a profiling tool, we detected the code parts where the processor spent more time than necessary. RenderDoc helped us to analyze the frames that are sent from the application to the graphics card, and to compare the effects of different variants of the rendering code sent to the GPU.

Finally, two pieces of advice for the optimization process were given. Scaling up the application helps you to find the real bottlenecks...

Practical sessions

You can try out these ideas to get deeper insights into the process of code optimization:

Search for more hotspots using a profiler and try to reduce the calculation time for every instance even more.
The optimized code from Chapter 15 needs about 0.02 milliseconds for the creation of the joint matrices or dual quaternions of every model on a recent CPU. For 1,000 models drawn using GPU instancing, the matrix data update therefore takes about 20 milliseconds per frame (1,000 × 0.02 ms). Maybe you will find more places where a couple of CPU cycles can be saved.
Advanced difficulty: Use multithreading for the update of the matrix data.
You could try to update more than one model at once by parallelizing the joint matrix update process. This may be done by a simple worker or consumer/producer model, where you add the update tasks to a list or vector and let the threads take the topmost entry to work on the matrices. But beware, synchronization between threads can be difficult, and...
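One possible starting point for the multithreading exercise is sketched below. It is a simplified variant of the worker model: instead of a task list protected by a lock, the workers share an atomic counter and each one claims the next unprocessed index. The function `parallelUpdate` and its parameters are hypothetical names, and the sketch assumes that updating one model never touches the data of another model.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls updateOne(i) exactly once for every i in
// [0, count), spreading the calls across workerCount threads.
void parallelUpdate(std::size_t count, unsigned workerCount,
                    const std::function<void(std::size_t)>& updateOne) {
  assert(workerCount > 0);
  std::atomic<std::size_t> next{0}; // index of the next unclaimed task

  auto worker = [&]() {
    for (;;) {
      // fetch_add hands out every index to exactly one thread
      const std::size_t i = next.fetch_add(1);
      if (i >= count) {
        break; // no work left
      }
      updateOne(i);
    }
  };

  std::vector<std::thread> threads;
  threads.reserve(workerCount);
  for (unsigned t = 0; t < workerCount; ++t) {
    threads.emplace_back(worker);
  }
  // join() also makes all writes of the workers visible to the caller
  for (auto& thread : threads) {
    thread.join();
  }
}
```

A call could look like `parallelUpdate(instances.size(), std::thread::hardware_concurrency(), ...)`, with a fallback to one worker if `hardware_concurrency()` returns 0. Creating threads on every frame has a cost of its own, so a long-lived thread pool is likely the better design in a real renderer, and the result should be measured before it is kept.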

Additional resources

For further reading, please check these links:

Linux profiling: http://euccas.github.io/blog/20170827/cpu-profiling-tools-on-linux.html
Windows profiling: https://learn.microsoft.com/en-us/visualstudio/profiling/cpu-usage?view=vs-2022
Multithreading in C++: https://db.in.tum.de/teaching/ss21/c++praktikum/slides/lecture-10.2.pdf
Mastering multithreading: https://www.packtpub.com/product/mastering-c-multithreading/9781787121706
Concurrency with Modern C++: https://www.grimm-jaud.de/index.php/concurrency-with-modern-c
OpenGL compute shaders: https://antongerdelan.net/opengl/compute.html
Vulkan compute shaders: https://saschawillems.de/vulkantutorial/en/Compute_Shader.html
RenderDoc documentation: https://renderdoc.org/docs/index.html
C++ constexpr and consteval: https://lemire.me/blog/2023/03/27/c20-consteval-and-constexpr-functions/
glTF Sample Models: https://github.com/KhronosGroup/glTF-Sample-Models
Asset Importer...


Authors (2)

Michael Dunsky

Michael Dunsky is an educated electronics technician, game developer, and console porting programmer with more than 20 years of programming experience. He started at the age of 14 with BASIC, adding on his way Assembly language, C, C++, Java, Python, VHDL, OpenGL, GLSL, and Vulkan to his portfolio. During his career, he also gained extensive knowledge in virtual machines, server operation, infrastructure automation, and other DevOps topics. Michael holds a Master of Science degree in Computer Science from the FernUniversität in Hagen, focused on computer graphics, parallel programming and software systems.

Gabor Szauer

Gabor Szauer has been making games since 2010. He graduated from Full Sail University in 2010 with a bachelor's degree in game development. Gabor maintains an active Twitter presence, and maintains a programming-oriented game development blog. Gabor's previously published books are Game Physics Programming Cookbook and Lua Quick Start Guide, both published by Packt.
