Mastering Graphics Programming with Vulkan

Product type Book
Published in Feb 2023
Publisher Packt
ISBN-13 9781803244792
Pages 382 pages
Edition 1st Edition
Authors (2): Marco Castorina, Gabriel Sassone

Table of Contents (21 chapters)

Preface
Part 1: Foundations of a Modern Rendering Engine
Chapter 1: Introducing the Raptor Engine and Hydra
Chapter 2: Improving Resources Management
Chapter 3: Unlocking Multi-Threading
Chapter 4: Implementing a Frame Graph
Chapter 5: Unlocking Async Compute
Part 2: GPU-Driven Rendering
Chapter 6: GPU-Driven Rendering
Chapter 7: Rendering Many Lights with Clustered Deferred Rendering
Chapter 8: Adding Shadows Using Mesh Shaders
Chapter 9: Implementing Variable Rate Shading
Chapter 10: Adding Volumetric Fog
Part 3: Advanced Rendering Techniques
Chapter 11: Temporal Anti-Aliasing
Chapter 12: Getting Started with Ray Tracing
Chapter 13: Revisiting Shadows with Ray Tracing
Chapter 14: Adding Dynamic Diffuse Global Illumination with Ray Tracing
Chapter 15: Adding Reflections with Ray Tracing
Index
Other Books You May Enjoy

Unlocking Multi-Threading

In this chapter, we will talk about adding multi-threading to the Raptor Engine.

This requires both a significant change to the underlying architecture and some Vulkan-specific changes and synchronization work, so that the CPU cores and the GPU can cooperate correctly and as fast as possible.

Multi-threaded rendering is a topic that has been covered many times over the years, and it is a feature that most game engines have needed since multi-core architectures became widespread. Consoles such as the PlayStation 2 and the Sega Saturn already offered multiple processors that could work in parallel, and later generations continued the trend by providing an increasing number of cores that developers could take advantage of.

The first trace of multi-threaded rendering in a game engine dates back to 2008, when Christer Ericson wrote a blog post (https://realtimecollisiondetection.net/blog/?p=86) and showed that it was possible to parallelize and optimize the generation of commands...

Technical requirements

Task-based multi-threading using enkiTS

To achieve parallelism, we need to understand some basic concepts and choices that led to the architecture developed in this chapter. First, we should note that when we talk about parallelism in software engineering, we mean the act of executing chunks of code at the same time.

This is possible because modern hardware has multiple units that can operate independently, and operating systems expose units of execution called threads.

A common way to achieve parallelism is to reason with tasks – small independent execution units that can run on any thread.
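
To make the idea of tasks concrete, here is a minimal sketch that uses only the C++ standard library. It is not enkiTS and not the Raptor Engine's code: `run_tasks` and its parameters are hypothetical names chosen for illustration. A fixed set of worker threads pull task indices from a shared atomic counter, so any task can end up on any thread.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs every task exactly once on a set of worker threads.
// Tasks must be independent of each other, since no ordering is guaranteed.
inline void run_tasks(const std::vector<std::function<void()>>& tasks,
                      unsigned worker_count) {
    if (worker_count == 0) worker_count = 1;

    // Shared counter: each fetch_add hands out the next unclaimed task index.
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;) {
            const std::size_t index = next.fetch_add(1);
            if (index >= tasks.size()) return;  // No work left for this thread.
            tasks[index]();
        }
    };

    std::vector<std::thread> workers;
    workers.reserve(worker_count);
    for (unsigned i = 0; i < worker_count; ++i) workers.emplace_back(worker);

    // Joining also makes the tasks' writes visible to the caller.
    for (auto& t : workers) t.join();
}
```

A real task scheduler keeps its threads alive between batches and supports dependencies between tasks; this sketch only shows the core idea that work is described as small units rather than being tied to a specific thread.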

Why task-based parallelism?

Multi-threading is not a new subject, and since it was first added to game engines, there have been different ways of implementing it. Game engines are pieces of software that aim to use all of the available hardware as efficiently as possible, and in doing so they have paved the way for more optimized software architectures.

Therefore, we...

Asynchronous loading

The loading of resources is one of the slowest operations, if not the slowest, that can be done in any framework. This is because the files to be loaded are big, and they can come from different sources, such as optical drives (DVD and Blu-ray), hard drives, and even the network.

Storage is a large topic in its own right, but the most important concept to understand is the inherent access speed of each kind of memory:

Figure 3.1 – A memory hierarchy

As shown in the preceding diagram, the fastest memory is register memory. Next come the caches, with different levels and access speeds. Both registers and caches sit directly in the processing unit (both the CPU and GPU have registers and caches, albeit with different underlying architectures).

Main memory refers to the RAM, which is the area that is normally populated with the data used by the application. It is slower than the cache, but it is the target of the loading operations...
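
Because loading is this slow, it should not stall the thread that asks for the data. One possible shape for that, sketched below with only the C++ standard library, is a background thread that drains a queue of load jobs. `AsyncLoader` and `request` are hypothetical names, and the jobs are plain callables standing in for "read this file and upload it"; this is an illustration under those assumptions, not the loader built in the chapter.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical loader: a single background thread drains a queue of load jobs,
// so slow reads never block the thread that requested them.
class AsyncLoader {
public:
    AsyncLoader() : worker_([this] { run(); }) {}

    ~AsyncLoader() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        worker_.join();  // Jobs still in the queue are finished before exit.
    }

    // Queue a job and return immediately.
    void request(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // Only reached when stopping.
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // Run the slow work outside the lock.
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // Declared last so the other members exist before it starts.
};
```

The requesting thread only pays for a lock and a queue push; the cost of reading from disk or the network is paid on the loader's thread.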

Recording commands on multiple threads

To record commands using multiple threads, it is necessary to use different command buffers, at least one per thread, to record the commands and then submit them to the main queue. More precisely, in Vulkan, any kind of pool needs to be externally synchronized by the user; thus, the best option is to associate each pool with a single thread.

Command buffers are allocated from the associated pool, and commands are recorded into them. Pools can be command pools (VkCommandPool), descriptor pools (VkDescriptorPool), and query pools (VkQueryPool, for timestamp and occlusion queries), and once associated with a thread, they can be used freely inside that thread of execution.
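
The thread-to-pool association can be sketched as a simple lookup table. The code below uses only the C++ standard library: `Pool` is a stand-in for a real Vulkan handle, and `PerThreadPools` is a hypothetical name. Keeping a separate set of pools for each frame in flight is an assumption of this sketch, made so that a pool is never touched while the GPU may still be using buffers allocated from it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a Vulkan pool handle such as VkCommandPool (hypothetical).
struct Pool {
    std::uint32_t frame;
    std::uint32_t thread;
};

// Hypothetical table holding one pool per (frame in flight, thread) pair.
// Each thread only ever touches its own pools, so no locking is needed.
class PerThreadPools {
public:
    PerThreadPools(std::uint32_t frames_in_flight, std::uint32_t thread_count)
        : thread_count_(thread_count) {
        pools_.reserve(static_cast<std::size_t>(frames_in_flight) * thread_count);
        for (std::uint32_t frame = 0; frame < frames_in_flight; ++frame) {
            for (std::uint32_t thread = 0; thread < thread_count; ++thread) {
                pools_.push_back(Pool{frame, thread});
            }
        }
    }

    // Pools are stored frame by frame, so the index is frame * threads + thread.
    Pool& get(std::uint32_t frame, std::uint32_t thread) {
        return pools_[static_cast<std::size_t>(frame) * thread_count_ + thread];
    }

    std::size_t size() const { return pools_.size(); }

private:
    std::uint32_t thread_count_;
    std::vector<Pool> pools_;
};
```

With two frames in flight and four threads, this yields eight pools, and a worker with index 3 recording frame 1 would use `get(1, 3)`.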

The execution order of the command buffers is based on the order of the array submitted to the main queue – thus, from a Vulkan perspective, sorting can be performed at the command buffer level.
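
Since threads finish recording in an unpredictable order, the buffers usually need to be ordered before they are placed in the array passed to the queue (in Vulkan, the `pCommandBuffers` member of `VkSubmitInfo`). The sketch below models that step with standard C++ only. `RecordedBuffer`, `sort_key`, and `build_submit_order` are hypothetical names, and using a per-pass index as the key is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Stand-in for a recorded VkCommandBuffer plus the key used to order it (hypothetical).
struct RecordedBuffer {
    std::uint32_t sort_key;  // For example, the index of the pass it belongs to.
    std::uint32_t thread;    // Which thread recorded it (informational only).
};

// Returns the buffers in the order they should appear in the submission array.
// stable_sort keeps the incoming order for buffers that share the same key.
inline std::vector<RecordedBuffer> build_submit_order(std::vector<RecordedBuffer> buffers) {
    std::stable_sort(buffers.begin(), buffers.end(),
                     [](const RecordedBuffer& a, const RecordedBuffer& b) {
                         return a.sort_key < b.sort_key;
                     });
    return buffers;
}
```

Sorting whole command buffers is cheap because there are few of them, which is one reason to decide the ordering at this level rather than per draw call.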

We will see how important the allocation strategy for command buffers is and how easy it is to...

Summary

In this chapter, we learned about the concept of task-based parallelism and saw how using a library such as enkiTS can quickly add multi-threading capabilities to the Raptor Engine.

We then learned how to add support for loading data from files to the GPU using an asynchronous loader. We also focused on Vulkan-related code to have a second queue of execution that can run in parallel to the one responsible for drawing. We saw the difference between primary and secondary command buffers.

We talked about the importance of the command buffer allocation strategy to ensure safety when recording commands in parallel, especially taking into consideration the reuse of command buffers between frames.

Finally, we showed step by step how to use both types of command buffers, and this should be enough to add the desired level of parallelism to any application that decides to use Vulkan as its graphics API.

In the next chapter, we will work on a data structure called Frame Graph, which will...

Further reading

Task-based systems have been in use for many years. https://www.gdcvault.com/play/1012321/Task-based-Multithreading-How-to provides a good overview.

Work-stealing queues are covered in many articles; https://blog.molecular-matters.com/2015/09/08/job-system-2-0-lock-free-work-stealing-part-2-a-specialized-allocator/ is a good starting point on the subject.

The PlayStation 3 uses the Cell processor, co-developed by Sony, Toshiba, and IBM, and the Xbox 360 uses the IBM-designed Xenon processor; both provide more performance to developers through multiple cores. In particular, the PlayStation 3 has several Synergistic Processing Units (SPUs) that developers can use to offload work from the main processor.

There are many presentations and articles that detail many clever ways developers have used these processors, for example, https://www.gdcvault.com/play/1331/The-PlayStation-3-s-SPU and https://gdcvault.com/play/1014356/Practical-Occlusion-Culling-on.
