
You're reading from C++ Game Animation Programming - Second Edition

Product type: Book
Published in: Dec 2023
Reading level: N/A
Publisher: Packt
ISBN-13: 9781803246529
Edition: 2nd Edition
Authors (2):

Michael Dunsky

Michael Dunsky is an educated electronics technician, game developer, and console porting programmer with more than 20 years of programming experience. He started at the age of 14 with BASIC, adding on his way Assembly language, C, C++, Java, Python, VHDL, OpenGL, GLSL, and Vulkan to his portfolio. During his career, he also gained extensive knowledge in virtual machines, server operation, infrastructure automation, and other DevOps topics. Michael holds a Master of Science degree in Computer Science from the FernUniversität in Hagen, focused on computer graphics, parallel programming and software systems.

Gabor Szauer

Gabor Szauer has been making games since 2010. He graduated from Full Sail University in 2010 with a bachelor's degree in game development. Gabor maintains an active Twitter presence and a programming-oriented game development blog. His previously published books are Game Physics Programming Cookbook and Lua Quick Start Guide, both published by Packt.

15

Measuring Performance and Optimizing the Code

Welcome to Chapter 15! In the previous chapter, we extended the glTF application to render a large crowd of model instances at the same time on the screen.

In this chapter, we will search for performance problems by measuring the time the application needs for some function calls, such as the calculation of the joint matrices for the vertex skinning or the upload of the matrix data into the buffers of the graphics card. This measurement allows us to find so-called hotspots, which are parts of the code where the program spends a large share of its execution time, often because they are called many times.

First, we discuss some basic dos and don’ts of code optimization. Then, we explore a couple of different methods to make the code – at least theoretically – faster. There is no guarantee that an optimization will have a positive effect on the speed of a program, as using the wrong data type or algorithm can even slow down the code. Therefore, we need...

Technical requirements

For this chapter, you will need the OpenGL and Vulkan renderer code from Chapter 14.

Before we go into details about code optimization, let us discuss some “rules of thumb” regarding optimization in the software development process.

Measure twice, cut once!

The saying, “Measure twice, cut once,” is popular among carpenters. Cutting a wooden plank is irreversible, and if the resulting plank is too short due to inaccurate measurements, the carpenter must start over with a new plank.

Thanks to Source Code Management (SCM) software such as Git, code changes are not irreversible in the way that cutting wood is. But you will waste precious time if you start optimizing without a plan.

Always measure before you take action

If you find a performance problem in your application, you may feel the urge to optimize it somehow. However, changing code on a gut feeling is a bad idea: you are only guessing which part of the code may be slow, and you will most likely end up optimizing code that is not responsible for the slow performance.

So, before you dive into the code and try your best to make it faster, you should at least start measuring the times taken by different...
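As a minimal sketch of such a measurement, the following stopwatch wraps std::chrono::steady_clock. The names Stopwatch and dummyWork are hypothetical and not taken from the book's code; dummyWork only stands in for a real function, such as the joint matrix calculation.

```cpp
#include <chrono>

// Hypothetical helper (not from the book): a minimal stopwatch based on
// std::chrono::steady_clock, which is monotonic and therefore suited to
// measuring intervals.
class Stopwatch {
public:
  // Remembers the current time as the start of the measurement.
  void start() { mStart = std::chrono::steady_clock::now(); }

  // Returns the time elapsed since the last start() call in milliseconds.
  float stop() const {
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<float, std::milli>(end - mStart).count();
  }

private:
  std::chrono::steady_clock::time_point mStart{};
};

// Placeholder workload standing in for the code under measurement.
inline double dummyWork(int iterations) {
  double sum = 0.0;
  for (int i = 0; i < iterations; ++i) {
    sum += static_cast<double>(i) * 0.5;
  }
  return sum;
}
```

Calling start() right before the function of interest and stop() right after it yields the elapsed milliseconds for that call. A single reading can be noisy, so averaging over many frames usually gives a more trustworthy number.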

Moving computations to different places

Even with current multi-core processors and clock frequencies of several GHz, CPU power is still a scarce and precious resource. Every CPU cycle you waste by doing unnecessary calculations, using the wrong algorithms, or repeating operations is lost for the remaining parts of the program. Therefore, it is important to identify how to save CPU cycles while still doing the intended computations.

Recalculate only when necessary

There are essentially two opposite paths available to optimize code. You can try to optimize the code in a way that computes the results on every call with low overhead – or you can be lazy, cache the results, and recalculate new results only when some of the parameters have changed.

Both paths have their pros and cons. While continuously computed results will produce smooth and uniform calculation times in the functions, you do a lot of unnecessary operations if the input values never change. With the...
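The lazy path can be sketched with a so-called dirty flag. The class CachedRotation below is a hypothetical example, not code from the book: it caches the sine and cosine of an angle and recalculates them only after the angle has actually changed.

```cpp
#include <cmath>

// Hypothetical example type (not from the book): caches derived values and
// recomputes them only after the input has changed.
class CachedRotation {
public:
  void setAngle(float radians) {
    if (radians != mAngle) {
      mAngle = radians;
      mDirty = true;  // invalidate the cached results
    }
  }

  float sine() { update(); return mSin; }
  float cosine() { update(); return mCos; }

  // Number of real recalculations, exposed only to make the effect visible.
  int recalcCount() const { return mRecalcCount; }

private:
  // Recomputes the cached values if, and only if, the input has changed.
  void update() {
    if (!mDirty) {
      return;
    }
    mSin = std::sin(mAngle);
    mCos = std::cos(mAngle);
    mDirty = false;
    ++mRecalcCount;
  }

  float mAngle = 0.0f;
  float mSin = 0.0f;
  float mCos = 1.0f;
  bool mDirty = true;
  int mRecalcCount = 0;
};
```

Repeated calls to sine() and cosine() with an unchanged angle cost little more than a flag check, while the full cost is paid on the first call after setAngle() receives a new value.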

Profiling the code to find hotspots

For code performance profiling, the executable is instrumented by a profiling tool, which counts every function call and records its execution time. Depending on the OS and the compiler, different settings are required to enable proper application profiling.

We will now begin with a practical profiling session and search for hotspots in the code used in Chapter 14: 03_opengl_instanced_drawing. The optimized code can be found in the folder for chapter15 in the 01_opengl_optimize subfolder.

Profiling code using Visual Studio

Visual Studio comes with an internal performance profiler. The profiler can be started in the Debug menu of Visual Studio, as shown in Figure 15.1:

Figure 15.1: Starting the profiler from Visual Studio 2022


In the new Visual Studio tab, select Executable as the desired Analysis Target. Navigate to the chapter14\03_opengl_instanced_drawing folder and select the executable file in the following...

Using RenderDoc to analyze a GPU frame

RenderDoc is a free tool to capture and analyze the frames our application draws. The program supports OpenGL and Vulkan, as well as Direct3D on Windows and OpenGL ES on mobile devices.

In Figure 15.9, a single frame of the 04_opengl_tbo example of Chapter 14 has been captured:

Figure 15.9: RenderDoc analyzing an OpenGL version of the model viewer


In Figure 15.9, at the top of the RenderDoc window labeled with number 1, the overall timing of the frame is shown. On the left side, at number 2, the recorded OpenGL calls are presented. Selecting one block or command advances the frame and the timing bar in the window with the number 1 to the frame state at that specific time.

The colored bars at number 3 are the joint matrices that were uploaded to the texture buffer. We use a texture buffer to upload the matrix data to the GPU, and the uploaded data is visible as a one-dimensional texture in RenderDoc. On the lower...

Scale it up and do A/B tests

At this point, the optimization journey has just begun. After the first rounds of digging into possible performance issues for the processor and the graphics card, you need to re-evaluate the status of the application. Two pieces of advice will help you to squeeze even more frames per second out of your application.

Scale up to get better results

If we profile the first version of our shiny new glTF model viewer application from Chapter 8, where we load and render only a single model, the results may lead to the wrong conclusions. The differences between the calls are too small to allow us to discern the cause of any slowdowns, and many generic calls to STL or GLM functions are shown, as you can see in Figure 15.16:

Figure 15.16: Profiling the code from Chapter 8, example 01_opengl_gltf_load


If you start optimizing on the basis of these results, you will waste your time working on completely the wrong parts...
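A small harness can illustrate both pieces of advice from this section: run the same workload at a larger scale and compare two variants side by side. The sketch below is an assumption for illustration, not code from the book. averageMilliseconds, fillGrowing, and fillReserved are hypothetical names, and the two vector-filling functions are placeholder workloads for an "A" and a "B" variant.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from the book): runs `work` `runs` times and
// returns the average wall-clock time per run in milliseconds.
inline double averageMilliseconds(const std::function<void()>& work,
                                  std::size_t runs) {
  if (runs == 0) {
    return 0.0;
  }
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < runs; ++i) {
    work();
  }
  const std::chrono::duration<double, std::milli> total =
      std::chrono::steady_clock::now() - start;
  return total.count() / static_cast<double>(runs);
}

// Variant A (placeholder workload): lets the vector grow on demand.
inline std::size_t fillGrowing(std::size_t count) {
  std::vector<float> values;
  for (std::size_t i = 0; i < count; ++i) {
    values.push_back(static_cast<float>(i));
  }
  return values.size();
}

// Variant B (placeholder workload): reserves the memory up front.
inline std::size_t fillReserved(std::size_t count) {
  std::vector<float> values;
  values.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    values.push_back(static_cast<float>(i));
  }
  return values.size();
}
```

Passing each variant to averageMilliseconds, for example as a lambda that calls fillGrowing(count), first with a small count and then with a much larger one, shows whether a difference between A and B rises above the measurement noise. An optimizing compiler may remove work whose result is never used, so the returned sizes should be consumed somewhere in a real comparison.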

Summary

In this chapter, we explored performance measurements and optimization of the code we created throughout all the chapters of this book.

First, we looked at the basic dos and don’ts of optimization. You should do any optimizations as late as possible and avoid premature optimization at all costs, as it will slow down the development and eventually delay your product. Also, we talked about some basic ideas on how to make code run faster.

Next, we checked our code examples from Chapter 14 for hotspots and bottlenecks on both the CPU and GPU sides. By using a profiling tool, we detected the code parts where the processor spent more time than necessary. RenderDoc helped us to analyze the frames that are sent from the application to the graphics card, and to compare the effects of different variants of the rendering code sent to the GPU.

Finally, two pieces of advice for the optimization process were given. Scaling up the application helps you to find the real bottlenecks...

Practical sessions

You can try out these ideas to get deeper insights into the process of code optimization:

  • Search for more hotspots using a profiler and try to reduce the calculation time for every instance even more.

    The optimized code from Chapter 15 needs about 0.02 milliseconds for the creation of the joint matrices or dual quaternions of every model on a recent CPU. For 1,000 models drawn using GPU instancing, the matrix data update therefore takes about 20 milliseconds per frame (1,000 × 0.02 ms). Maybe you will find more places where a couple of CPU cycles can be saved.

  • Advanced difficulty: Use multithreading for the update of the matrix data.

    You could try to update more than one model at once by parallelizing the joint matrix update process. This may be done by a simple worker or consumer/producer model, where you add the update tasks to a list or vector and let the threads take the topmost entry to work on the matrices. But beware, synchronization between threads can be difficult, and...
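For the multithreading exercise, one possible starting point is a pool of short-lived worker threads that take the next open task from a shared list. The sketch below is an assumption for illustration and not the book's solution. DummyInstance, updateInstance, and updateAllInstances are hypothetical names, and the instance data is a placeholder for the real joint matrix update. An atomic counter hands out every index exactly once, so no two threads write to the same instance and no mutex is needed for this simple case.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a model instance (not the book's class).
struct DummyInstance {
  float animationTime = 0.0f;
  float jointValue = 0.0f;  // placeholder for the joint matrix data
};

// Placeholder for the per-instance joint matrix update.
inline void updateInstance(DummyInstance& instance, float deltaTime) {
  instance.animationTime += deltaTime;
  instance.jointValue = instance.animationTime * 2.0f;
}

// Updates all instances using `threadCount` worker threads.
inline void updateAllInstances(std::vector<DummyInstance>& instances,
                               float deltaTime, unsigned int threadCount) {
  if (threadCount == 0) {
    threadCount = 1;
  }
  std::atomic<std::size_t> nextTask{0};

  // Every worker fetches the next unprocessed index until none are left.
  auto worker = [&]() {
    for (;;) {
      const std::size_t index = nextTask.fetch_add(1);
      if (index >= instances.size()) {
        break;
      }
      updateInstance(instances[index], deltaTime);
    }
  };

  std::vector<std::thread> workers;
  workers.reserve(threadCount);
  for (unsigned int i = 0; i < threadCount; ++i) {
    workers.emplace_back(worker);
  }
  // Wait for all workers, so the data is complete before it is uploaded.
  for (auto& thread : workers) {
    thread.join();
  }
}
```

Creating and joining threads every frame has its own cost, so a persistent thread pool may be the better choice in a real renderer. On some platforms, the program must also be linked against the threading library (for example, with -pthread).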
