The Modern Vulkan Cookbook

Working with Modern Vulkan

The goal of this chapter is to show you how to render a scene that accepts input information, such as textures and uniform data, from the application side. This chapter will cover advanced topics in the Vulkan API that build upon the core concepts discussed in the previous chapter and present all the information you need to render complex scenes, along with newer features of the API. Additionally, the chapter will demonstrate techniques to enhance the rendering speed.

In this chapter, we’re going to cover the following recipes:

Understanding Vulkan’s memory model
Instantiating the VMA library
Creating buffers
Uploading data to buffers
Creating a staging buffer
How to avoid data races using ring buffers
Setting up pipeline barriers
Creating images (textures)
Creating an image view
Creating a sampler
Providing shader data
Customizing shader behavior with specialization constants
Implementing MDI and PVP
Adding flexibility to the rendering pipeline using dynamic rendering
Transferring resources between queue families

Understanding Vulkan’s memory model

Memory allocation and management are crucial in Vulkan, as almost none of the details of memory usage are managed by Vulkan. Except for deciding the exact memory address where memory should be allocated, all other details are the responsibility of the application. This means the programmer must manage memory types, their sizes, and alignments, as well as any sub-allocations. This approach gives applications more control over memory management and allows developers to optimize their programs for specific uses. This recipe will provide some fundamental information about the types of memory provided by the API as well as a summary of how to allocate and bind that memory to resources.

Getting ready

Graphics cards come in two variants, integrated and discrete. Integrated graphics cards share the same memory as the CPU, as shown in Figure 2.1:

Figure 2.1 – Typical memory architecture for integrated graphics cards

Discrete graphics cards have their own memory (device memory) separate from the main memory (host memory), as shown in Figure 2.2:

Figure 2.2 – Typical memory architecture for discrete graphics cards

Vulkan provides different types of memory:

Device-local memory: This type of memory is optimized for use by the GPU and is local to the device. It is typically faster than host-visible memory but is not accessible from the CPU. Usually, resources such as render targets, storage images, and buffers are stored in this memory.
Host-visible memory: This type of memory is accessible from both the GPU and the CPU. It is typically slower than device-local memory but allows for efficient data transfer between the GPU and CPU. Reads from GPU to CPU happen across Peripheral Component Interconnect Express (PCI-E) lanes in the case of non-integrated GPU. It’s typically used to set up staging buffers, where data is stored before being transferred to device-local memory, and uniform buffers, which are constantly updated from the application.
Host-coherent memory: This type of memory is like host-visible memory but provides guaranteed memory consistency between the GPU and CPU, so writes do not need to be explicitly flushed or invalidated. This type of memory is typically slower than both device-local and host-visible memory but is useful for storing data that needs to be frequently updated by both the GPU and CPU.

Figure 2.3 summarizes the three aforementioned types of memory. Device-local memory is not visible from the host, while host-coherent and host-visible are. Copying data from the CPU to the GPU can be done using mapped memory for those two types of memory allocations. For device-local memory, it’s necessary to copy the data from the CPU to host-visible memory first using mapped memory (the staging buffer), and then perform a copy of the data from the staging buffer to the destination, the device-local memory, using a Vulkan function:

Figure 2.3 – Types of memory and their visibility from the application in Vulkan

Images are usually device-local memory, as they have their own layout that isn’t readily interpretable by the application. Buffers can be of any one of the aforementioned types.

How to do it…

A typical workflow for creating and uploading data to a buffer includes the following steps:

Create a buffer object of type VkBuffer by using the VkBufferCreateInfo structure and calling vkCreateBuffer.
Retrieve the memory requirements based on the buffer’s properties by calling vkGetBufferMemoryRequirements. The device may require a certain alignment, which could affect the necessary size of the allocation to accommodate the buffer’s contents.
Create a structure of type VkMemoryAllocateInfo, specify the size of the allocation and the type of memory, and call vkAllocateMemory.
Call vkBindBufferMemory to bind the allocation with the buffer object.
If the buffer is visible from the host, map a pointer to the destination with vkMapMemory, copy the data, and unmap the memory with vkUnmapMemory.
If the buffer is a device-local buffer, copy the data to a staging buffer first, then perform the final copy from the staging buffer to the device-local memory using the vkCmdCopyBuffer function.
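Steps 2 and 3 hinge on two small computations: rounding a size or offset up to the alignment reported by the device, and picking a memory type that is both allowed for the resource and has the desired properties. The following is a minimal sketch of that logic, not code from the repository. To stay self-contained, it uses plain integers instead of the Vulkan headers: the k-prefixed constants mirror the numeric values of the VK_MEMORY_PROPERTY_* bits, and alignUp and findMemoryType are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Stand-ins for VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, _HOST_VISIBLE_BIT, and
// _HOST_COHERENT_BIT (same numeric values as in the Vulkan headers).
constexpr uint32_t kDeviceLocal = 0x1;
constexpr uint32_t kHostVisible = 0x2;
constexpr uint32_t kHostCoherent = 0x4;

// Rounds value up to the next multiple of alignment. The alignment must be a
// power of two, as is the case for VkMemoryRequirements::alignment.
uint64_t alignUp(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Returns the index of the first memory type that is permitted by
// memoryTypeBits and has all of the required property flags.
// typeFlags[i] stands in for memoryTypes[i].propertyFlags.
std::optional<uint32_t> findMemoryType(const std::vector<uint32_t>& typeFlags,
                                       uint32_t memoryTypeBits,
                                       uint32_t required) {
  assert(typeFlags.size() <= 32);  // Vulkan exposes at most 32 memory types
  for (uint32_t i = 0; i < typeFlags.size(); ++i) {
    const bool allowed = (memoryTypeBits & (1u << i)) != 0;
    const bool hasProperties = (typeFlags[i] & required) == required;
    if (allowed && hasProperties) {
      return i;
    }
  }
  return std::nullopt;
}
```

In a real application, the flags would come from VkPhysicalDeviceMemoryProperties and memoryTypeBits from VkMemoryRequirements; the index returned is what goes into VkMemoryAllocateInfo::memoryTypeIndex.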

As you can see, that’s a complex procedure that can be simplified by using the VMA library, an open source library that provides a convenient and efficient way to manage memory in Vulkan. It offers a high-level interface that abstracts the complex details of memory allocation, freeing you from the burden of manual memory management.

Creating buffers

A buffer in Vulkan is simply a contiguous block of memory that holds some data. The data can be vertex, index, uniform, and more. A buffer object is just metadata and does not directly contain data. The memory associated with a buffer is allocated after a buffer has been created.

Table 2.1 summarizes the most important usage types of buffers and their access type:

Buffer Type	Access Type	Uses
Vertex or Index	Read-only	Vertex and index data storage
Uniform	Read-only	Uniform data storage
Storage	Read/write	Generic data storage
Uniform texel	Read-only	Data is interpreted as texels
Storage texel	Read/write	Data is interpreted as texels

Table 2.1 – Buffer types

Creating buffers is easy, but it helps to know what types of buffers exist and what their requirements are before setting out to create them. In this recipe, we will provide a template for creating buffers.

Getting ready

In the repository, Vulkan buffers are managed by the VulkanCore::Buffer class, which provides functions to create and upload data to the device, as well as a utility function to use a staging buffer to upload data to device-only heaps.

How to do it…

Creating a buffer using VMA is simple:

All you need are the buffer creation flags (a value of 0 for the flags is correct for most cases), the size of the buffer in bytes, and its usage (this is how you define how the buffer will be used). Assign those values to an instance of the VkBufferCreateInfo structure:

VkDeviceSize size;  // The requested size of the buffer
VmaAllocator allocator;  // valid VMA Allocator
VkBufferUsageFlags use;  // Transfer src/dst/uniform/SSBO
VkBuffer buffer;        // The created buffer
VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = nullptr,
    .flags = {},
    .size = size,
    .usage = use,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .queueFamilyIndexCount = {},
    .pQueueFamilyIndices = {},
};

You will also need an instance of the VmaAllocationCreateInfo structure, which carries the allocation flags and the intended memory usage:

const VmaAllocationCreateInfo allocCreateInfo = {
    .flags = VMA_ALLOCATION_CREATE_MAPPED_BIT,
    .usage = VMA_MEMORY_USAGE_CPU_ONLY,
};

Then, call vmaCreateBuffer to obtain the buffer handle and its allocation:

VmaAllocation allocation;  // Needs to live until the
                           // buffer is destroyed
VK_CHECK(vmaCreateBuffer(allocator, &createInfo,
                         &allocCreateInfo, &buffer,
                         &allocation, nullptr));

The next step is optional but useful for debugging and optimization:

VmaAllocationInfo allocationInfo;
vmaGetAllocationInfo(allocator, allocation,
                     &allocationInfo);

Some creation flags affect how the buffer can be used, so you might need to make adjustments to the preceding code depending on how you intend to use the buffers you create in your application.

Uploading data to buffers

Uploading data from the application to the GPU depends on the type of buffer. For host-visible buffers, it’s a direct copy using memcpy. For device-local buffers, we need a staging buffer, which is a buffer that is visible both by the CPU and the GPU. In this recipe, we will demonstrate how to upload data from your application to the device-visible memory (into a buffer’s memory region on the device).

Getting ready

If you haven’t already, please refer to the Understanding Vulkan’s memory model recipe.

How to do it…

The upload process depends on the type of buffer:

For host-visible memory, it’s enough to retrieve a pointer to the destination using vmaMapMemory and copy the data using memcpy. The operation is synchronous, so the mapped pointer can be unmapped as soon as memcpy returns.

It’s fine to map a host-visible buffer as soon as it is created and leave it mapped until its destruction. That is the recommended approach, as you don’t incur the overhead of mapping the memory every time it needs to be updated:

VmaAllocator allocator;   // Valid VMA allocator
VmaAllocation allocation; // Valid VMA allocation
void *data;               // Data to be uploaded
size_t size;              // Size of data in bytes
VkDeviceSize offset = 0;  // Start of the range to flush
void *map = nullptr;
VK_CHECK(vmaMapMemory(allocator, allocation,
                      &map));
memcpy(map, data, size);
// Flush while the memory is still mapped; this is
// needed for memory that is not host-coherent
VK_CHECK(vmaFlushAllocation(allocator, allocation,
                            offset, size));
vmaUnmapMemory(allocator, allocation);

Data uploaded to device-local memory needs to be (1) copied to a buffer that is visible from the host first (called a staging buffer) and then (2) copied from the staging buffer to the device-local memory using vkCmdCopyBuffer, as depicted in Figure 2.4. Note that this requires a command buffer:

Figure 2.4 – Staging buffers

Once the data resides in the host-visible staging buffer, copying it to the device-only buffer is simple:

VkDeviceSize srcOffset;
VkDeviceSize dstOffset;
VkDeviceSize size;
VkCommandBuffer commandBuffer; // Valid Command Buffer
VkBuffer stagingBuffer; // Valid host-visible buffer
VkBuffer buffer; // Valid device-local buffer
const VkBufferCopy region = {
    .srcOffset = srcOffset,
    .dstOffset = dstOffset,
    .size = size,
};
vkCmdCopyBuffer(commandBuffer, stagingBuffer, buffer, 1, &region);

Uploading data from your application to a buffer is accomplished either by a direct memcpy operation or by means of a staging buffer. We showed how to perform both uploads in this recipe.

How to avoid data races using ring buffers

When a buffer needs to be updated every frame, we run the risk of creating a data race, as shown in Figure 2.5. A data race is a situation where multiple threads within a program concurrently access a shared data point, with at least one thread performing a write operation. This concurrent access can result in unforeseen behavior due to the unpredictable order of operations. Take the example of a uniform buffer that stores the view, model, and viewport matrices and needs to be updated every frame. The buffer is updated while the first command buffer is being recorded, initializing it (version 1). Once the command buffer starts processing on the GPU, the buffer contains the correct data:

Figure 2.5 – Data race when using one buffer

After the first command buffer starts processing in the GPU, the application may try to update the buffer’s contents to version 2 while the GPU is accessing that data for rendering!

Getting ready

Synchronization is by far the hardest aspect of Vulkan. If synchronization elements such as semaphores, fences, and barriers are used too greedily, then your application becomes serialized and won’t use the full power of the parallelism between the CPU and the GPU.

Make sure you also read the Understanding synchronization in the swapchain – fences and semaphores recipe in Chapter 1, Vulkan Core Concepts. That recipe and this one only scratch the surface of how to tackle synchronization, but are very good starting points.

A ring-buffer implementation is provided by the EngineCore::RingBuffer class in the repository, which has a configurable number of sub-buffers. Its sub-buffers are all host-visible, persistent buffers; that is, they are persistently mapped after creation for ease of access.

How to do it…

There are a few ways to avoid this problem, but the easiest one is to create a ring buffer that contains several buffers (or any other resource) equal to the number of frames in flight. Figure 2.6 shows events when there are two buffers available. Once the first command buffer is submitted and is being processed in the GPU, the application is free to process copy 1 of the buffer, as it’s not being accessed by the device:

Figure 2.6 – A data race is avoided with multiple copies of a resource
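The rotation shown in Figure 2.6 boils down to a small amount of index bookkeeping. The sketch below illustrates the idea with a generic container and is not the EngineCore::RingBuffer implementation; the FrameRing name and its member functions are hypothetical, and the slots hold plain values rather than Vulkan buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Holds one copy of a resource per frame in flight and hands them out
// in round-robin order.
template <typename T>
class FrameRing {
 public:
  explicit FrameRing(std::size_t framesInFlight) : slots_(framesInFlight) {
    assert(framesInFlight > 0);
  }

  // The copy the CPU is allowed to write during the current frame.
  T& current() { return slots_[index_]; }

  std::size_t currentIndex() const { return index_; }

  // Called once per frame, after the current frame has been submitted.
  void moveToNext() { index_ = (index_ + 1) % slots_.size(); }

 private:
  std::vector<T> slots_;
  std::size_t index_ = 0;
};
```

With two frames in flight, the CPU writes slot 1 while the GPU reads slot 0. The application must still wait (for example, on a fence) before it wraps around and reuses a slot that the GPU may be reading.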

Even though this is a simple solution, it has a caveat: if partial updates are allowed, care must be taken when the buffer is updated. Consider Figure 2.7, in which a ring buffer that contains three sub-allocations is partially updated. The buffer stores the view, model, and viewport matrices. During initialization, all three sub-allocations are initialized to three identity matrices. On Frame 0, while Buffer 0 is active, the model matrix is updated and now contains a translation of (10, 10, 0). On the next frame, Frame 1, Buffer 1 becomes active, and the viewport matrix is updated. Because Buffer 1 was initialized to three identity matrices, updating only the viewport matrix makes Buffers 0 and 1 out of sync (as well as Buffer 2). To guarantee that partial updates work, we need to copy the last active buffer, Buffer 0, into Buffer 1 first, and then update the viewport matrix:

Figure 2.7 – Partial update of a ring buffer makes all sub-allocations out of sync if they are not replicated
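The replication step can be folded into the function that advances the ring. The following sketch reproduces the scenario from Figure 2.7 with CPU-side data only; ReplicatingRing, FrameUniforms, and the use of element 12 as the x translation (column-major storage) are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>;

constexpr Mat4 kIdentity = {1.0f, 0.0f, 0.0f, 0.0f,
                            0.0f, 1.0f, 0.0f, 0.0f,
                            0.0f, 0.0f, 1.0f, 0.0f,
                            0.0f, 0.0f, 0.0f, 1.0f};

// Contents of one sub-allocation; every slot starts as three identities.
struct FrameUniforms {
  Mat4 view = kIdentity;
  Mat4 model = kIdentity;
  Mat4 viewport = kIdentity;
};

class ReplicatingRing {
 public:
  explicit ReplicatingRing(std::size_t count) : slots_(count) {
    assert(count > 0);
  }

  FrameUniforms& current() { return slots_[index_]; }
  const FrameUniforms& slot(std::size_t i) const { return slots_.at(i); }
  std::size_t currentIndex() const { return index_; }

  // Advances to the next slot and seeds it with the last active slot, so a
  // partial update that follows starts from up-to-date data.
  void moveToNext() {
    const std::size_t previous = index_;
    index_ = (index_ + 1) % slots_.size();
    slots_[index_] = slots_[previous];
  }

 private:
  std::vector<FrameUniforms> slots_;
  std::size_t index_ = 0;
};
```

The copy overwrites the destination slot, so it is only safe once the GPU has finished reading that slot from an earlier frame.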

Synchronization is a delicate topic, and guaranteeing your application behaves correctly with so many moving parts is tricky. A simple ring-buffer implementation removes one source of data races and lets you focus on other areas of the code.

Setting up pipeline barriers

In Vulkan, commands may be reordered when a command buffer is being processed, subject to certain restrictions. This is known as command buffer reordering, and it can help to improve performance by allowing the driver to optimize the order in which commands are executed.

The good news is that Vulkan provides a mechanism called pipeline barriers to ensure that dependent commands are executed in the correct order. They are used to explicitly specify dependencies between commands, preventing them from being reordered, and at what stages they might overlap. This recipe will explain what pipeline barriers are and what their properties mean. It will also show you how to create and install pipeline barriers.

Getting ready

Consider two draw calls issued in sequence. The first one writes to a color attachment, while the second draw call samples from that attachment in the fragment shader:

vkCmdDraw(...); // draws into color attachment 0
vkCmdDraw(...); // reads from color attachment 0

Figure 2.8 helps visualize how those two commands may be processed by the device. In the diagram, commands are processed from top to bottom and progress on the pipeline from left to right. Clock cycles are a loose term, because processing may take multiple clock cycles, but are used to indicate that – in general – some tasks must happen after others.

In the example, the second vkCmdDraw call starts executing at C2, after the first draw call. This offset is not enough, as the second draw call needs to read the color attachment at the Fragment Shader stage, which is not produced by the first draw call until it reaches the Color Attach Output stage. Without synchronization, this setup may cause data races:

Figure 2.8 – Two consecutive commands recorded on the same command buffer being processed without synchronization

A pipeline barrier is a command recorded into the command buffer that specifies which pipeline stages must have completed, for all commands that appear before the barrier, before the command buffer continues processing. Commands recorded before the barrier are said to be in the first synchronization scope or first scope. Commands recorded after the barrier are said to be part of the second synchronization scope or second scope.

The barrier also allows fine-grained control to specify at which stage commands after the barrier must wait until commands in the first scope finish processing. That’s because commands in the second scope don’t need to wait until commands in the first scope are done. They can start processing as soon as possible, as long as the conditions specified in the barrier are met.

In the example in Figure 2.8, the first draw call, in the first scope, needs to write to the attachment before the second draw call can access it. The second draw call does not need to wait for the first draw call to finish its Color Attach Output stage before it starts executing. It can start right away, as long as its fragment stage happens after the first draw call is done with its Color Attach Output stage, as shown in Figure 2.9:

Figure 2.9 – Two consecutive commands recorded on the same command buffer being processed with synchronization
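The effect of the barrier on timing can be reasoned about with a toy model. The sketch below assumes a simplified four-stage pipeline in which each command advances one stage per clock cycle; the stage list, the cycleOf and cycleWithBarrier helpers, and the one-stage-per-cycle rule are all assumptions for illustration and do not describe how a real device schedules work.

```cpp
#include <algorithm>
#include <cassert>

// Simplified pipeline stages in execution order (hypothetical subset).
enum Stage : int {
  VertexInput = 0,
  VertexShader = 1,
  FragmentShader = 2,
  ColorAttachOutput = 3,
};

// Toy rule: a command that starts at cycle `start` runs stage s during
// cycle start + s.
constexpr int cycleOf(int start, Stage stage) {
  return start + static_cast<int>(stage);
}

// Earliest cycle at which the second command may run dstStage when a barrier
// requires the first command to have completed srcStage beforehand.
int cycleWithBarrier(int firstStart, int secondStart, Stage srcStage,
                     Stage dstStage) {
  assert(secondStart >= firstStart);
  const int unblocked = cycleOf(secondStart, dstStage);
  const int afterSource = cycleOf(firstStart, srcStage) + 1;
  return std::max(unblocked, afterSource);
}
```

With the first draw starting at cycle 1 and the second at cycle 2, the unsynchronized fragment stage of the second draw lands on the same cycle as the first draw's color output. The barrier delays only that stage by one cycle, while the earlier stages of the second draw still overlap with the first draw.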

There are three types of barriers:

Memory barriers are global barriers and apply to all commands in the first and second scopes.
Buffer memory barriers are barriers that apply only to commands that access a portion of the buffer, as it’s possible to specify to which portion of the buffer the barrier applies (offset + range).
Image memory barriers are barriers that apply only to commands that access a subresource of an image. It’s possible to add barriers based on mip level, sections of the image, or array layers. This is an especially important barrier as it is also used to transition an image from one layout to another. For instance, while generating mipmaps and blitting from one mip level to the next, the levels need to be in the correct layout. The previous level needs to be in the VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL layout, as it will be read from, while the next mip level needs to be in the VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL layout, as it will be written to.
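For the mipmap case mentioned in the last item, it helps to lay out the sequence of blits before recording any barriers. The sketch below only plans the steps on the CPU and does not call Vulkan; MipBlit, fullMipCount, and planMipBlits are hypothetical names, and the layout requirements appear as comments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One blit from a mip level to the next smaller one.
struct MipBlit {
  uint32_t srcLevel;  // must be in VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL
  uint32_t dstLevel;  // must be in VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
  uint32_t srcWidth, srcHeight;
  uint32_t dstWidth, dstHeight;
};

// Number of levels in a full mip chain that ends at 1x1.
uint32_t fullMipCount(uint32_t width, uint32_t height) {
  assert(width > 0 && height > 0);
  uint32_t levels = 1;
  for (uint32_t size = std::max(width, height); size > 1; size >>= 1) {
    ++levels;
  }
  return levels;
}

// Lists the blits needed to fill levels 1..mipLevels-1 from level 0.
std::vector<MipBlit> planMipBlits(uint32_t width, uint32_t height,
                                  uint32_t mipLevels) {
  assert(width > 0 && height > 0 && mipLevels > 0);
  std::vector<MipBlit> blits;
  for (uint32_t level = 1; level < mipLevels; ++level) {
    // Each level halves the previous one, but never drops below one texel.
    const uint32_t nextWidth = std::max<uint32_t>(1, width / 2);
    const uint32_t nextHeight = std::max<uint32_t>(1, height / 2);
    blits.push_back(
        {level - 1, level, width, height, nextWidth, nextHeight});
    width = nextWidth;
    height = nextHeight;
  }
  return blits;
}
```

Between consecutive steps, the level that was just written has to be transitioned from the transfer-destination layout to the transfer-source layout with an image memory barrier restricted to that single mip level.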

How to do it…

Pipeline barriers are recorded with the vkCmdPipelineBarrier command, in which you can provide several barriers of multiple types at the same time. The following code snippet shows how to create a barrier used to create a dependency between the two draw calls in Figure 2.9:

VkCommandBuffer commandBuffer;  // Valid Command Buffer
VkImage image;                  // Valid image
const VkImageSubresourceRange subresource = {
    // Color aspect, since the image is a color attachment
    .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
    .baseMipLevel = 0,
    .levelCount = VK_REMAINING_MIP_LEVELS,
    .baseArrayLayer = 0,
    .layerCount = 1,
};
const VkImageMemoryBarrier imageBarrier = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
    .oldLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL,
    .newLayout = VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image = image,
    .subresourceRange = subresource,
};
vkCmdPipelineBarrier(
    commandBuffer,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 0,
    nullptr, 0, nullptr, 1, &imageBarrier);

The barrier needs to be recorded between the two draw calls:

vkCmdDraw(...); // draws into color attachment 0
vkCmdPipelineBarrier(...);
vkCmdDraw(...); // reads from color attachment 0

Pipeline barriers are tricky but absolutely fundamental in Vulkan. Make sure you understand what they offer and how they operate before continuing to read the other recipes.

Creating images (textures)

Images are used for storing 1D, 2D, or 3D data, although they are mostly used for 2D data. Unlike buffers, images have the advantage of being optimized for locality in memory layout. This is because most GPUs have a fixed-function texture unit or sampler that reads texel data from an image and applies filtering and other operations to produce a final color value. Images can have different formats, such as RGB, RGBA, BGRA, and so on.

An image object is only metadata in Vulkan. Its data is stored separately and is created in a similar manner to buffers (Figure 2.10):

Figure 2.10 – Images

Images in Vulkan cannot be accessed directly and need to be accessed only by means of an image view. An image view is a way to access a subset of the image data by specifying the subresource range, which includes the aspect (such as color or depth), the mip level, and the array layer range.

Another very important aspect of images is their layout. It is used to specify the intended usage of an image resource in Vulkan, such as whether it should be used as a source or destination for a transfer operation, a color or depth attachment for rendering, or as a shader read or write resource. The correct image layout is important because it ensures that the GPU can efficiently access and manipulate the image data in accordance with the intended usage. Using the wrong image layout can lead to performance issues or rendering artifacts and can result in undefined behavior. Therefore, it’s essential to correctly specify the image layout for each usage of an image in a Vulkan application. Common image layouts are undefined (VK_IMAGE_LAYOUT_UNDEFINED), color attachment (VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL), depth/stencil attachment (VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL), and shader read (VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL). Image layout transitions are done as part of the vkCmdPipelineBarrier command.

In this recipe, you will learn how to create images on a device.

Getting ready

In the repository, the VulkanCore::Texture class encapsulates the management of images and image views. It handles data uploads, transitions between image layouts, and mipmap generation, which makes it straightforward to use textures in the Vulkan examples.

How to do it…

Creating an image requires some basic information about it, such as type (1D, 2D, 3D), size, format (RGBA, BGRA, and so on), number of mip levels, number of layers (faces for cubemaps), and a few others:

VkFormat format;     // Image format
VkExtent3D extents;  // Image size
uint32_t mipLevels;  // Number of mip levels
uint32_t layerCount; // Number of layers (sides of cubemap)
const VkImageCreateInfo imageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .flags = 0, // optional
    .imageType = VK_IMAGE_TYPE_2D,  // 1D, 2D, 3D
    .format = format,
    .extent = extents,
    .mipLevels = mipLevels,
    .arrayLayers = layerCount,
    .samples = VK_SAMPLE_COUNT_1_BIT,
    .tiling = VK_IMAGE_TILING_OPTIMAL,
    .usage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
};

The following structure tells VMA that the image will be a device-only image:

const VmaAllocationCreateInfo allocCreateInfo = {
    .flags = VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT,
    .usage = VMA_MEMORY_USAGE_AUTO_PREFER_DEVICE,
    .priority = 1.0f,
};

The resulting image’s handle will be stored in image:

VkImage image = VK_NULL_HANDLE;
VK_CHECK(vmaCreateImage(vmaAllocator_, &imageInfo,
                        &allocCreateInfo, &image,
                        &vmaAllocation_, nullptr));

The next step is optional but useful for debugging or optimizing the code:

VmaAllocationInfo allocationInfo;
vmaGetAllocationInfo(vmaAllocator_, vmaAllocation_,
                     &allocationInfo);

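This recipe only showed you how to create an image in Vulkan, not how to upload data to it. Uploading data to an image is similar to uploading data to a device-local buffer: the data is first copied to a staging buffer and then transferred to the image with a command such as vkCmdCopyBufferToImage.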

Creating a sampler

A sampler in Vulkan is the link between shader execution and image data. It governs filtering, addressing modes, and mipmapping. Filters dictate how texels are interpolated, while addressing modes control how coordinates map to image extents. Anisotropic filtering further enhances sampling fidelity. Mipmapping, which relies on a pyramid of downsampled image levels, is also configured through the sampler. In this recipe, you will learn how to create a sampler object in Vulkan.

Getting ready

Samplers are implemented by the VulkanCore::Sampler class in the repository.

How to do it…

The properties of a sampler define how an image is interpreted in the pipeline, usually in a shader. The process is simple – instantiate a VkSamplerCreateInfo structure and call vkCreateSampler:

VkDevice device;  // Valid Vulkan Device
VkFilter minFilter;
VkFilter magFilter;
float maxLod;  // Max mip level
const VkSamplerCreateInfo samplerInfo = {
    .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
    .magFilter = magFilter,
    .minFilter = minFilter,
    .mipmapMode = maxLod > 0
                      ? VK_SAMPLER_MIPMAP_MODE_LINEAR
                      : VK_SAMPLER_MIPMAP_MODE_NEAREST,
    .addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .mipLodBias = 0,
    .anisotropyEnable = VK_FALSE,
    .minLod = 0,
    .maxLod = maxLod,
};
VkSampler sampler{VK_NULL_HANDLE};
VK_CHECK(vkCreateSampler(device, &samplerInfo, nullptr,
                         &sampler));

A sampler is one of the simplest objects to create in Vulkan and one of the easiest to understand, as it describes very common computer graphics concepts.
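To make the addressing modes concrete, the sketch below shows how an integer texel coordinate that falls outside the image is mapped back into range. It covers the VK_SAMPLER_ADDRESS_MODE_REPEAT mode used in this recipe and, for contrast, VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE; the function names are hypothetical, and the code models the behavior on the CPU rather than calling Vulkan.

```cpp
#include <algorithm>
#include <cassert>

// VK_SAMPLER_ADDRESS_MODE_REPEAT: the image tiles infinitely, so the
// coordinate wraps around modulo the image size.
int wrapRepeat(int coord, int size) {
  assert(size > 0);
  const int remainder = coord % size;
  return remainder < 0 ? remainder + size : remainder;
}

// VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE: coordinates outside the image
// resolve to the nearest edge texel.
int wrapClampToEdge(int coord, int size) {
  assert(size > 0);
  return std::clamp(coord, 0, size - 1);
}
```

For a texture that is 4 texels wide, coordinate 5 resolves to texel 1 with repeat and to texel 3 with clamp-to-edge.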

Providing shader data

Providing data from your application that will be used in shaders is one of the most convoluted aspects of Vulkan and requires several steps that need to be accomplished in the right order (and with the right parameters). In this recipe, with many smaller recipes, you will learn how to provide data used in shaders, such as textures, buffers, and samplers.

Getting ready

Resources consumed by shaders are specified using the layout keyword, along with set and binding qualifiers:

layout(set = 0, binding=0) uniform Transforms
{
    mat4 model;
    mat4 view;
    mat4 projection;
} MVP;

Each resource is represented by a binding. A set is a collection of bindings. One binding doesn’t necessarily represent just one resource; it can also represent an array of resources of the same type.

How to do it…

Providing a resource as input to shaders is a multi-step process that involves the following:

Specifying sets and their bindings using descriptor set layouts. This step doesn’t associate real resources with sets/bindings. It just specifies the number and types of bindings in a set.
Building a pipeline layout, which describes which sets will be used in a pipeline.
Creating a descriptor pool that will provide instances of descriptor sets. A descriptor pool contains a list of how many bindings it can provide grouped by binding type (texture, sampler, shader storage buffer (SSBO), uniform buffers).
Allocating descriptor sets from the pool with vkAllocateDescriptorSets.
Binding resources to bindings using vkUpdateDescriptorSets. In this step, we associate a real resource (a buffer, a texture, and so on) with a binding.
Binding descriptor sets and their bindings to a pipeline during rendering using vkCmdBindDescriptorSets. This step makes resources bound to their set/bindings in the previous step available to shaders in the current pipeline.
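Sizing the descriptor pool follows directly from the layouts: the pool needs, per descriptor type, at least the sum of the descriptor counts of every set that will be allocated from it. The sketch below performs that tally with stand-in types instead of the Vulkan structures; DescriptorKind, BindingDesc, SetDesc, and poolSizesFor are hypothetical names, and copiesPerSet is an assumed parameter for applications that allocate several instances of each set (for example, one per frame in flight).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Stand-in for the VkDescriptorType values used in this chapter.
enum class DescriptorKind { UniformBuffer, StorageBuffer, SampledImage, Sampler };

// Stand-in for the relevant fields of VkDescriptorSetLayoutBinding.
struct BindingDesc {
  uint32_t binding;
  DescriptorKind kind;
  uint32_t count;  // array size of the binding
};

using SetDesc = std::vector<BindingDesc>;

// Totals the descriptors required per type; each entry of the result
// corresponds to one VkDescriptorPoolSize element.
std::map<DescriptorKind, uint32_t> poolSizesFor(
    const std::vector<SetDesc>& sets, uint32_t copiesPerSet) {
  assert(copiesPerSet > 0);
  std::map<DescriptorKind, uint32_t> totals;
  for (const SetDesc& set : sets) {
    for (const BindingDesc& binding : set) {
      totals[binding.kind] += binding.count * copiesPerSet;
    }
  }
  return totals;
}
```

The number of sets multiplied by copiesPerSet would likewise give the value for VkDescriptorPoolCreateInfo::maxSets.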

The next recipes will show you how to perform each one of those steps.

Specifying descriptor sets with descriptor set layouts

Consider the following GLSL code, which specifies several resources:

struct Vertex {
    vec3 pos;
    vec2 uv;
    vec3 normal;
};
layout(set = 0, binding=0) uniform Transforms
{
    mat4 model;
    mat4 view;
    mat4 projection;
} MVP;
layout(set = 1, binding = 0) uniform texture2D textures[];
layout(set = 1, binding = 1) uniform sampler   samplers[];
layout(set = 2, binding = 0) readonly buffer VertexBuffer
{
    Vertex vertices[];
} vertexBuffer;

The code requires three sets (0, 1, and 2), so we need to create three descriptor set layouts. In this recipe, you will learn how to create a descriptor set layout for the preceding code.

Getting ready

Descriptor sets and bindings are created, stored, and managed by the VulkanCore::Pipeline class in the repository. A descriptor set in Vulkan acts as a container that holds resources, such as buffers, textures, and samplers, for use by shaders. Binding refers to the process of associating these descriptor sets with specific shader stages so that shaders can access the resources during rendering. The class simplifies descriptor set creation and management and provides methods for binding resources within the Vulkan rendering pipeline.

How to do it…

A descriptor set layout states its bindings (number and types) with the VkDescriptorSetLayoutCreateInfo structure. Each binding is described using an instance of the VkDescriptorSetLayoutBinding structure. The relationship between the Vulkan structures needed to create a descriptor set layout for the preceding code is shown in Figure 2.11:

Figure 2.11 – Illustrating the configuration of descriptor set layouts for GLSL shaders

The following code shows how to specify two bindings for set 1, which are stored in a vector of bindings:

constexpr uint32_t kMaxBindings = 1000;
const VkDescriptorSetLayoutBinding texBinding = {
    .binding = 0,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    .descriptorCount = kMaxBindings,
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
};
const VkDescriptorSetLayoutBinding samplerBinding = {
    .binding = 1,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER,
    .descriptorCount = kMaxBindings,
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
};
struct SetDescriptor {
  uint32_t set_;
  std::vector<VkDescriptorSetLayoutBinding> bindings_;
};
std::vector<SetDescriptor> sets(1);
sets[0].set_ = 1;
sets[0].bindings_.push_back(texBinding);
sets[0].bindings_.push_back(samplerBinding);

Since each binding describes an array, and the VkDescriptorSetLayoutBinding structure requires the number of descriptors, we are using a large number that hopefully will accommodate all elements we need in the array. The vector of bindings is stored in a structure that describes a set with its number and all its bindings. This vector will be used to create a descriptor set layout:

constexpr VkDescriptorBindingFlags flagsToEnable =
    VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT |
    VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT;
for (size_t setIndex = 0;
     const auto& set : sets) {
  std::vector<VkDescriptorBindingFlags> bindFlags(
      set.bindings_.size(), flagsToEnable);
  const VkDescriptorSetLayoutBindingFlagsCreateInfo
      extendedInfo{
          .sType =
              VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO,
          .pNext = nullptr,
          .bindingCount = static_cast<uint32_t>(
              set.bindings_.size()),
          .pBindingFlags = bindFlags.data(),
      };
  const VkDescriptorSetLayoutCreateInfo dslci = {
      .sType =
          VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
      .pNext = &extendedInfo,
      .flags =
          VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT,
      .bindingCount =
          static_cast<uint32_t>(set.bindings_.size()),
      .pBindings = set.bindings_.data(),
  };
  VkDescriptorSetLayout descSetLayout{VK_NULL_HANDLE};
  VK_CHECK(vkCreateDescriptorSetLayout(
      context_->device(), &dslci, nullptr,
      &descSetLayout));
}

Each set requires its own descriptor set layout, and the preceding process needs to be repeated for each one. The descriptor set layout needs to be stored so that it can be referred to in the future.

Passing data to shaders using push constants

Push constants are another way to pass data to shaders. Although a very performant and easy way to do so, push constants are very limited in size, 128 bytes being the only guaranteed amount by the Vulkan specification.

This recipe will show you how to pass a small amount of data from your application to shaders, using push constants for a simple shader.

Getting ready

Push constants are stored and managed by the VulkanCore::Pipeline class.

How to do it…

Push constants are recorded directly onto the command buffer and aren’t prone to the same synchronization issues that exist with other resources. They are declared in the shader as follows, with at most one block per shader:

layout (push_constant) uniform Transforms {
    mat4 model;
} PushConstants;

The pushed data can be split across shader stages: parts of it can be assigned to different stages, or all of it can be assigned to a single stage. The important part is that the total size cannot exceed the amount available for push constants. The limit is provided in VkPhysicalDeviceLimits::maxPushConstantsSize.

Before using push constants, we need to specify how many bytes we are using in each shader stage:

const VkPushConstantRange range = {
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
    .offset = 0,
    .size = 64,
};
std::vector<VkPushConstantRange> pushConsts;
pushConsts.push_back(range);

The code states that the first (offset == 0) 64 bytes of the push constant data recorded in the command buffer (the size of a 4x4 matrix of floats) will be used by the vertex shader. This structure will be used in the next recipe to create a pipeline layout object.
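When more than one stage reads push constants, the ranges are easier to keep consistent if they are derived from a CPU-side structure. The following is a minimal sketch rather than the repository's code: PushRange is a hypothetical stand-in for VkPushConstantRange so that the example compiles without the Vulkan headers, PushData and its tint member are invented for illustration, and rangesAreValid() checks only a few of the rules the specification places on push constant ranges (4-byte alignment, the maxPushConstantsSize limit, and each stage appearing in at most one range):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkPushConstantRange; the fields mirror the
// real structure so the sketch compiles without the Vulkan headers.
struct PushRange {
  uint32_t stageFlags;
  uint32_t offset;
  uint32_t size;
};

// Stage bits copied from VkShaderStageFlagBits.
constexpr uint32_t kVertexStage = 0x00000001;    // VK_SHADER_STAGE_VERTEX_BIT
constexpr uint32_t kFragmentStage = 0x00000010;  // VK_SHADER_STAGE_FRAGMENT_BIT

// CPU-side mirror of the pushed data (layout assumed for illustration).
struct PushData {
  float model[16];  // mat4 read by the vertex shader
  float tint[4];    // hypothetical vec4 read by the fragment shader
};

// Returns true if every range is non-empty, 4-byte aligned, fits within
// maxSize (VkPhysicalDeviceLimits::maxPushConstantsSize), and no stage
// appears in more than one range.
bool rangesAreValid(const std::vector<PushRange>& ranges, uint32_t maxSize) {
  uint32_t seenStages = 0;
  for (const PushRange& r : ranges) {
    if (r.size == 0 || r.size % 4 != 0 || r.offset % 4 != 0) return false;
    if (r.offset >= maxSize || r.size > maxSize - r.offset) return false;
    if ((seenStages & r.stageFlags) != 0) return false;
    seenStages |= r.stageFlags;
  }
  return true;
}

// Derives one range per stage from the layout of PushData.
std::vector<PushRange> makeRanges() {
  std::vector<PushRange> ranges = {
      {kVertexStage, static_cast<uint32_t>(offsetof(PushData, model)),
       static_cast<uint32_t>(sizeof(PushData::model))},
      {kFragmentStage, static_cast<uint32_t>(offsetof(PushData, tint)),
       static_cast<uint32_t>(sizeof(PushData::tint))},
  };
  assert(ranges[0].size == 64);  // a 4x4 matrix of floats
  return ranges;
}
```

With the guaranteed minimum of 128 bytes, rangesAreValid(makeRanges(), 128) accepts both ranges, whereas a limit of 64 bytes would reject the fragment range that starts at offset 64.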

Creating a pipeline layout

A pipeline layout is an object in Vulkan that needs to be created and destroyed by the application. The layout is specified using structures that define the layout of bindings and sets. In this recipe, you will learn how to create a pipeline layout.

Getting ready

A VkPipelineLayoutCreateInfo instance is created automatically by the VulkanCore::Pipeline class in the repository based on information provided by the application using a vector of VulkanCore::Pipeline::SetDescriptor structures.

How to do it…

With all descriptor set layouts for all sets and the push constant information in hand, the next step consists of creating a pipeline layout:

std::vector<VkDescriptorSetLayout> descLayouts;
const VkPipelineLayoutCreateInfo pipelineLayoutInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .setLayoutCount = (uint32_t)descLayouts.size(),
    .pSetLayouts = descLayouts.data(),
    .pushConstantRangeCount =
        !pushConsts.empty()
            ? static_cast<uint32_t>(pushConsts.size())
            : 0,
    .pPushConstantRanges = !pushConsts.empty()
                               ? pushConsts.data()
                               : nullptr,
};
VkPipelineLayout pipelineLayout{VK_NULL_HANDLE};
VK_CHECK(vkCreatePipelineLayout(context_->device(),
                                &pipelineLayoutInfo,
                                nullptr,
                                &pipelineLayout));

Once you have the descriptor set layout in hand and know how to use the push constants in your application, creating a pipeline layout is straightforward.

Creating a descriptor pool

A descriptor pool contains a maximum number of descriptors it can provide (be allocated from), grouped by binding type. For instance, if two bindings of the same set require one image each, the descriptor pool would have to provide at least two descriptors. In this recipe, you will learn how to create a descriptor pool.

Getting ready

Descriptor pools are allocated in the VulkanCore::Pipeline::initDescriptorPool() method.

How to do it…

Creating a descriptor pool is straightforward. All we need is a list of binding types and the maximum number of resources we’ll allocate for each one:

constexpr uint32_t swapchainImages = 3;
std::vector<VkDescriptorPoolSize> poolSizes;
poolSizes.emplace_back(VkDescriptorPoolSize{
    VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    swapchainImages * kMaxBindings});
poolSizes.emplace_back(VkDescriptorPoolSize{
    VK_DESCRIPTOR_TYPE_SAMPLER,
    swapchainImages * kMaxBindings});

Since we duplicate the resources based on the number of swapchain images to avoid data races between the CPU and the GPU, we multiply the number of bindings we requested before (kMaxBindings = 1000) by the number of swapchain images:

const VkDescriptorPoolCreateInfo descriptorPoolInfo = {
    .sType =
        VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,
    .flags =
        VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT |
        VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT,
    .maxSets = MAX_DESCRIPTOR_SETS,
    .poolSizeCount =
        static_cast<uint32_t>(poolSizes.size()),
    .pPoolSizes = poolSizes.data(),
};
VkDescriptorPool descriptorPool{VK_NULL_HANDLE};
VK_CHECK(vkCreateDescriptorPool(context_->device(),
                                &descriptorPoolInfo,
                                nullptr,
                                &descriptorPool));

Be careful not to create pools that are too large. Achieving a high-performing application means not allocating more resources than you need.
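One way to avoid oversizing a pool is to compute its sizes from the bindings you actually declare. The sketch below is an illustration under stated assumptions, not the repository's implementation: Binding and PoolSize are hypothetical stand-ins for VkDescriptorSetLayoutBinding and VkDescriptorPoolSize, and computePoolSizes() is an invented helper that sums the descriptor counts per descriptor type and multiplies them by the number of resource copies:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins, reduced to the fields this sketch needs.
struct Binding {
  uint32_t descriptorType;
  uint32_t descriptorCount;
};
struct PoolSize {
  uint32_t type;
  uint32_t descriptorCount;
};

// Values copied from VkDescriptorType.
constexpr uint32_t kSampler = 0;       // VK_DESCRIPTOR_TYPE_SAMPLER
constexpr uint32_t kSampledImage = 2;  // VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE

// Sums descriptor counts per type and scales them by the number of copies
// (for example, one copy per swapchain image).
std::vector<PoolSize> computePoolSizes(const std::vector<Binding>& bindings,
                                       uint32_t copies) {
  assert(copies > 0);
  std::map<uint32_t, uint32_t> totals;  // descriptor type -> total count
  for (const Binding& b : bindings) {
    totals[b.descriptorType] += b.descriptorCount * copies;
  }
  std::vector<PoolSize> sizes;
  sizes.reserve(totals.size());
  for (const auto& entry : totals) {
    sizes.push_back({entry.first, entry.second});
  }
  return sizes;
}
```

Because the totals are keyed by type, two bindings of the same type contribute to a single pool size entry instead of producing duplicates.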

Allocating descriptor sets

Once a descriptor set layout and a descriptor pool have been created, you need to allocate a descriptor set before you can use it. A descriptor set is an instance of a set with the layout described by the descriptor set layout. In this recipe, you will learn how to allocate a descriptor set.

Getting ready

Descriptor set allocations are done in the VulkanCore::Pipeline::allocateDescriptors() method. Here, developers define the number of descriptor sets required, along with the binding counts per set. The subsequent bindDescriptorSets() method binds the descriptor sets in a command buffer, making them available to the shaders.

How to do it…

Allocating a descriptor set (or a number of them) is easy. You need to fill the VkDescriptorSetAllocateInfo structure and call vkAllocateDescriptorSets:

VkDescriptorSetAllocateInfo allocInfo = {
    .sType =
        VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,
    .descriptorPool = descriptorPool,
    .descriptorSetCount = 1,
    .pSetLayouts = &descSetLayout,
};
VkDescriptorSet descriptorSet{VK_NULL_HANDLE};
VK_CHECK(vkAllocateDescriptorSets(context_->device(),
                                  &allocInfo,
                                  &descriptorSet));

When using multiple copies of a resource to avoid race conditions, there are two approaches:

Allocate one descriptor set for each resource. In other words, call the preceding code once for each copy of the resource.
Create one descriptor set and update it every time you need to render.

Updating descriptor sets during rendering

Once a descriptor set has been allocated, it is not associated with any resources. This association must happen once (if your descriptor sets are immutable) or every time you need to bind a different resource to a descriptor set. In this recipe, you will learn how to update descriptor sets during rendering and after you have set up the pipeline and its layout.

Getting ready

In the repository, VulkanCore::Pipeline provides methods to update different types of resources, as each binding can only be associated with one type of resource (image, sampler, or buffer): updateSamplersDescriptorSets(), updateTexturesDescriptorSets(), and updateBuffersDescriptorSets().

How to do it…

Associating a resource with a descriptor set is done with the vkUpdateDescriptorSets function. Each call to vkUpdateDescriptorSets can update one or more bindings of one or more sets. Before updating a descriptor set, let’s look at how to update one binding.

You can associate a texture, a texture array, a sampler, a sampler array, a buffer, or a buffer array with one binding. To associate images or samplers, use the VkDescriptorImageInfo structure. To associate buffers, use the VkDescriptorBufferInfo structure. Once one or more of those structures have been instantiated, use the VkWriteDescriptorSet structure to associate them with a binding. Bindings that represent an array are updated with an array of VkDescriptor*Info structures.

Consider the bindings declared in the shader code presented next:

layout(set = 1, binding = 0) uniform texture2D textures[];
layout(set = 1, binding = 1) uniform sampler   samplers[];
layout(set = 2, binding = 0) readonly buffer VertexBuffer
{
  Vertex vertices[];
} vertexBuffer;

To update the textures[] array, we need to create two instances of VkDescriptorImageInfo and record them in the first VkWriteDescriptorSet structure:

VkImageView imageViews[2];  // Valid Image View objects
VkDescriptorSet descSet1;   // Valid Descriptor Set allocated for set 1
VkDescriptorImageInfo texInfos[] = {
    VkDescriptorImageInfo{
        .imageView = imageViews[0],
        .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    },
    VkDescriptorImageInfo{
        .imageView = imageViews[1],
        .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    },
};
const VkWriteDescriptorSet texWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = descSet1,
    .dstBinding = 0,
    .dstArrayElement = 0,
    .descriptorCount = 2,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    .pImageInfo = texInfos,
    .pBufferInfo = nullptr,
};

The two image views will be bound to binding 0 (.dstBinding = 0) of the descriptor set allocated for set 1, as elements 0 and 1 of the array. Note that .dstSet takes the VkDescriptorSet handle, not the set number. If you need to bind more objects to the array, all you need are more instances of VkDescriptorImageInfo. The number of objects bound to the current binding is specified by the descriptorCount member of the structure.

The process is similar for sampler objects:

VkSampler sampler[2];      // Valid Sampler objects
VkDescriptorSet descSet1;  // Valid Descriptor Set allocated for set 1
VkDescriptorImageInfo samplerInfos[] = {
    VkDescriptorImageInfo{
        .sampler = sampler[0],
    },
    VkDescriptorImageInfo{
        .sampler = sampler[1],
    },
};
const VkWriteDescriptorSet samplerWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = descSet1,
    .dstBinding = 1,
    .dstArrayElement = 0,
    .descriptorCount = 2,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER,
    .pImageInfo = samplerInfos,
    .pBufferInfo = nullptr,
};

This time, we are binding the sampler objects to set 1, binding 1. Buffers are bound using the VkDescriptorBufferInfo structure:

VkBuffer buffer;            // Valid Buffer object
VkDeviceSize bufferLength;  // Range of the buffer
VkDescriptorSet descSet2;   // Valid Descriptor Set allocated for set 2
const VkDescriptorBufferInfo bufferInfo = {
    .buffer = buffer,
    .offset = 0,
    .range = bufferLength,
};
const VkWriteDescriptorSet bufferWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = descSet2,
    .dstBinding = 0,
    .dstArrayElement = 0,
    .descriptorCount = 1,
    .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
    .pImageInfo = nullptr,
    .pBufferInfo = &bufferInfo,
};

Besides storing the address of the bufferInfo variable in the .pBufferInfo member of VkWriteDescriptorSet, we are binding one buffer (.descriptorCount = 1) to binding 0 (.dstBinding = 0) of the descriptor set allocated for set 2 (.dstSet). The descriptor type is a storage buffer because the shader declares the block with the buffer keyword.

The last step consists of storing all VkWriteDescriptorSet instances in a vector and calling vkUpdateDescriptorSets:

VkDevice device; // Valid Vulkan Device
std::vector<VkWriteDescriptorSet> writeDescSets;
writeDescSets.push_back(texWriteDescSet);
writeDescSets.push_back(samplerWriteDescSet);
writeDescSets.push_back(bufferWriteDescSet);
vkUpdateDescriptorSets(device, static_cast<uint32_t>(writeDescSets.size()),
                      writeDescSets.data(), 0, nullptr);

Encapsulating this task is the best way to avoid repetition and bugs introduced by forgetting a step in the update procedure.
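One detail worth encapsulating is lifetime: each VkWriteDescriptorSet only stores pointers to its info structures, so those structures must stay alive until vkUpdateDescriptorSets is called. The sketch below illustrates one possible way to handle this and is not the repository's implementation. ImageInfo and Write are hypothetical stand-ins for VkDescriptorImageInfo and VkWriteDescriptorSet (reduced to a few fields so the example compiles without the Vulkan headers), and WriteBatch is an invented helper that owns the info arrays its writes point to:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-ins, reduced to the fields this sketch needs.
struct ImageInfo {
  uint64_t imageView;
  uint32_t imageLayout;
};
struct Write {
  uint32_t dstBinding;
  uint32_t dstArrayElement;
  uint32_t descriptorCount;
  const ImageInfo* pImageInfo;
};

// Collects writes and owns the info arrays they point to, so the pointers
// stay valid for as long as the batch exists.
class WriteBatch {
 public:
  WriteBatch() = default;
  // Copying would leave the copied writes pointing into the original batch.
  WriteBatch(const WriteBatch&) = delete;
  WriteBatch& operator=(const WriteBatch&) = delete;

  void addImages(uint32_t binding, std::vector<ImageInfo> infos) {
    assert(!infos.empty());
    // std::deque::push_back does not move existing elements, so pointers
    // stored in earlier writes remain valid.
    storage_.push_back(std::move(infos));
    const std::vector<ImageInfo>& stored = storage_.back();
    writes_.push_back({binding, 0, static_cast<uint32_t>(stored.size()),
                       stored.data()});
  }

  const std::vector<Write>& writes() const { return writes_; }

 private:
  std::deque<std::vector<ImageInfo>> storage_;
  std::vector<Write> writes_;
};
```

In a real helper, writes().data() and writes().size() would be what gets passed to vkUpdateDescriptorSets, with the batch kept alive until that call returns.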

Passing resources to shaders (binding descriptor sets)

While rendering, we need to bind the descriptor sets we’d like to use during a draw call.

Getting ready

Binding sets is done with the VulkanCore::Pipeline::bindDescriptorSets() method.

How to do it…

To bind a descriptor set for rendering, we need to call vkCmdBindDescriptorSets:

VkCommandBuffer commandBuffer;   // Valid Command Buffer
VkPipelineLayout pipelineLayout; // Valid Pipeline layout
uint32_t set;                    // Set number
VkDescriptorSet descSet;         // Valid Descriptor Set
vkCmdBindDescriptorSets(
    commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
    pipelineLayout, set, 1u, &descSet, 0, nullptr);

Now that we’ve successfully bound a descriptor set for rendering, let’s turn our attention to another crucial aspect of our graphics pipeline: updating push constants.

Updating push constants during rendering

Push constants are updated during rendering by recording their values directly into the command buffer being recorded.

Getting ready

Updating push constants is done with the VulkanCore::Pipeline::updatePushConstants() method.

How to do it…

While recording rendering commands, updating push constants is straightforward. All you need to do is call vkCmdPushConstants:

VkCommandBuffer commandBuffer;   // Valid Command Buffer
VkPipelineLayout pipelineLayout; // Valid Pipeline Layout
glm::mat4 mat;                   // Valid matrix
vkCmdPushConstants(commandBuffer, pipelineLayout,
                   VK_SHADER_STAGE_VERTEX_BIT, 0,
                   sizeof(glm::mat4), &mat);

This call records the contents of mat into the command buffer, starting at offset 0 and signaling that this data will be used by the vertex shader.

Customizing shader behavior with specialization constants

Once shader code has been compiled, the result is immutable. Compilation carries a substantial time overhead and is generally avoided at runtime. Even minor adjustments to a shader require recompilation, a new shader module, and potentially a new pipeline as well, all of which are expensive operations.

In Vulkan, specialization constants allow you to specify constant values for shader parameters at pipeline creation time, instead of having to recompile the shader with new values every time you want to change them. This can be particularly useful when you want to reuse the same shader with different constant values multiple times. In this recipe, we will delve deeper into the practical application of specialization constants in Vulkan to create more efficient and flexible shader programs, allowing you to adjust shader behavior without the need for resource-intensive recompilations.

Getting ready

Specialization constants are available in the repository through the VulkanCore::Pipeline::GraphicsPipelineDescriptor structure. You need to provide a vector of VkSpecializationMapEntry structures for each shader type you’d like to apply specialization constants to.

How to do it…

Specialization constants are declared in GLSL using the constant_id qualifier along with an integer that specifies the constant’s ID:

layout (constant_id = 0) const bool useShaderDebug = false;

To create a pipeline with specialized constant values, you first need to create a VkSpecializationInfo structure that specifies the constant values and their IDs. You then pass this structure to the VkPipelineShaderStageCreateInfo structure when creating a pipeline:

const VkBool32 kUseShaderDebug = VK_FALSE;
const VkSpecializationMapEntry useShaderDebug = {
    .constantID = 0, // matches the constant_id qualifier
    .offset = 0,
    .size = sizeof(VkBool32), // bool constants are sized as VkBool32
};
const VkSpecializationInfo vertexSpecializationInfo = {
    .mapEntryCount = 1,
    .pMapEntries = &useShaderDebug,
    .dataSize = sizeof(VkBool32),
    .pData = &kUseShaderDebug,
};
const VkPipelineShaderStageCreateInfo shaderStageInfo = {
  ...
  .pSpecializationInfo = &vertexSpecializationInfo,
};

Because specialization constants are real constants, branches that depend on them may be entirely removed during the final compilation of the shader. On the other hand, specialization constants should not be used as a substitute for parameters such as uniforms, as they are not as flexible and their values must be known when the pipeline is created.
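When a shader has several specialization constants, their values are usually packed into one contiguous block of memory, with one map entry per constant recording its ID, offset, and size. The following sketch shows one possible way to build that block. It is an illustration only: MapEntry is a hypothetical stand-in for VkSpecializationMapEntry, and SpecializationData is an invented helper, not part of the repository. Boolean values are stored as 32-bit integers here because the specification sizes boolean specialization constants as VkBool32:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in mirroring the fields of VkSpecializationMapEntry.
struct MapEntry {
  uint32_t constantID;
  uint32_t offset;
  std::size_t size;
};

// Packs constant values back to back and records where each one lives.
class SpecializationData {
 public:
  void addBool(uint32_t id, bool value) {
    const uint32_t asBool32 = value ? 1u : 0u;  // same width as VkBool32
    append(id, &asBool32, sizeof(asBool32));
  }
  void addInt(uint32_t id, int32_t value) {
    append(id, &value, sizeof(value));
  }
  void addFloat(uint32_t id, float value) {
    append(id, &value, sizeof(value));
  }

  const std::vector<MapEntry>& entries() const { return entries_; }
  const std::vector<unsigned char>& data() const { return data_; }

 private:
  void append(uint32_t id, const void* src, std::size_t size) {
    assert(src != nullptr && size > 0);
    const std::size_t offset = data_.size();
    data_.resize(offset + size);
    std::memcpy(data_.data() + offset, src, size);
    entries_.push_back({id, static_cast<uint32_t>(offset), size});
  }

  std::vector<MapEntry> entries_;
  std::vector<unsigned char> data_;
};
```

In real code, entries() would supply pMapEntries and mapEntryCount, and data() would supply pData and dataSize of a VkSpecializationInfo structure.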

Implementing MDI and PVP

Multi-draw indirect (MDI) and programmable vertex pulling (PVP) are features of modern graphics APIs that allow for greater flexibility and efficiency in vertex processing.

MDI allows issuing multiple draw calls with a single command, each of which derives its parameters from a buffer stored in the device (hence the indirect term). This is particularly useful because those parameters can be modified in the GPU itself.

With PVP, each shader instance retrieves its vertex data based on its index and instance IDs instead of being initialized with the vertex’s attributes. This allows for flexibility because the vertex attributes and their format are not baked into the pipeline and can be changed solely based on the shader code.

In the first sub-recipe, we will focus on the implementation of MDI, demonstrating how this powerful tool can streamline your graphics operations by allowing multiple draw calls to be issued from a single command, with parameters that can be modified directly in the GPU. In the following sub-recipe, we will guide you through the process of setting up PVP, highlighting how the flexibility of this feature can enhance your shader code by enabling changes to vertex attributes without modifying the pipeline.

Implementing MDI

To use MDI, we store all mesh data belonging to the scene in one big buffer for all the meshes’ vertices and another one for the meshes’ indices, with the data for each mesh stored sequentially, as depicted in Figure 2.12.

The drawing parameters are stored in an extra buffer. They must be stored sequentially, one for each mesh, although they don’t have to be provided in the same order as the meshes:

Figure 2.12 – MDI data layout

We will now learn how to implement MDI using the Vulkan API.

Getting ready

In the repository, we provide a utility function to decompose an EngineCore::Model object into multiple buffers suitable for an MDI implementation, called EngineCore::convertModel2OneBuffer(), located in GLBLoader.cpp.

How to do it…

Let’s begin by looking at the indirect draw parameters’ buffer.

The commands are stored following the same layout as the VkDrawIndexedIndirectCommand structure:

typedef struct VkDrawIndexedIndirectCommand {
    uint32_t    indexCount;
    uint32_t    instanceCount;
    uint32_t    firstIndex;
    int32_t     vertexOffset;
    uint32_t    firstInstance;
} VkDrawIndexedIndirectCommand;

indexCount specifies how many indices are part of this command and, in our case, is the number of indices for a mesh. One command reflects one mesh, so its instanceCount value is one. The firstIndex member is the index of the first index element in the buffer to use for this mesh, while vertexOffset points to the first vertex element in the buffer to use. An example with the correct offsets is shown in Figure 2.12.
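The offsets can be computed as running sums over the meshes. The following sketch assumes that each mesh's indices are relative to its own first vertex and that meshes are packed back to back in the shared buffers. DrawIndexedIndirectCommand is a stand-in with the same members as VkDrawIndexedIndirectCommand, while MeshRange and buildCommands() are hypothetical names introduced for this illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in with the same members as VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectCommand {
  uint32_t indexCount;
  uint32_t instanceCount;
  uint32_t firstIndex;
  int32_t vertexOffset;
  uint32_t firstInstance;
};
static_assert(sizeof(DrawIndexedIndirectCommand) == 20,
              "five tightly packed 32-bit members");

// Hypothetical description of one mesh inside the shared buffers.
struct MeshRange {
  uint32_t vertexCount;
  uint32_t indexCount;
};

// Builds one command per mesh; firstIndex and vertexOffset accumulate the
// sizes of all meshes stored before the current one.
std::vector<DrawIndexedIndirectCommand> buildCommands(
    const std::vector<MeshRange>& meshes) {
  std::vector<DrawIndexedIndirectCommand> commands;
  commands.reserve(meshes.size());
  uint32_t firstIndex = 0;
  int32_t vertexOffset = 0;
  for (const MeshRange& mesh : meshes) {
    assert(mesh.indexCount > 0);
    commands.push_back({mesh.indexCount, 1, firstIndex, vertexOffset, 0});
    firstIndex += mesh.indexCount;
    vertexOffset += static_cast<int32_t>(mesh.vertexCount);
  }
  return commands;
}
```

For three meshes with (4, 6), (8, 36), and (3, 3) vertices and indices, the second command starts at index 6 with a vertex offset of 4, and the third at index 42 with a vertex offset of 12.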

Once the vertex and index buffers are bound, calling vkCmdDrawIndexedIndirect consists of providing the buffer with the indirect commands, an offset into that buffer, the number of commands, and the stride between them. The rest is done by the device:

VkCommandBuffer commandBuffer;  // Valid Command Buffer
VkBuffer indirectCmdBuffer;     // Valid buffer w/
                                // indirect commands
uint32_t meshCount;  // Number of indirect commands in
                     // the buffer
VkDeviceSize offset = 0;  // Offset into the indirect
                          // commands buffer
vkCmdDrawIndexedIndirect(
    commandBuffer, indirectCmdBuffer, offset,
    meshCount,
    sizeof(VkDrawIndexedIndirectCommand));

In this recipe, we learned how to utilize vkCmdDrawIndexedIndirect, a key function in Vulkan that allows for high-efficiency drawing.

Using PVP

The PVP technique allows vertex data and their attributes to be extracted from buffers with custom code instead of relying on the pipeline to provide them to vertex shaders.

Getting ready

We will use the following structures to perform the extraction of vertex data – the Vertex structure, which encodes the vertex’s position (pos), normal, UV coordinates (uv), and its material index (material):

struct Vertex {
    vec3 pos;
    vec3 normal;
    vec2 uv;
    int material;
};

We will also use a buffer object, referred to in the shader as VertexBuffer:

layout(set = 2, binding = 0) readonly buffer VertexBuffer
{
    Vertex vertices[];
} vertexBuffer;
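The CPU-side structure used to fill this buffer has to match the memory layout the shader expects. Assuming the block uses the default std430 layout for storage buffers (an assumption, since a different layout qualifier or the scalar block layout extension would change the numbers), each vec3 is aligned to 16 bytes, so a tightly packed C++ struct would not line up with the GLSL one. The sketch below shows a C++ mirror of the Vertex structure with the padding made explicit through alignas; the member types are hypothetical plain arrays rather than the repository's types:

```cpp
#include <cstddef>
#include <cstdint>

// C++ mirror of the GLSL Vertex structure under std430 rules:
// vec3 is aligned to 16 bytes, vec2 to 8 bytes, and int to 4 bytes.
struct alignas(16) Vertex {
  alignas(16) float pos[3];     // offset 0
  alignas(16) float normal[3];  // offset 16
  alignas(8) float uv[2];       // offset 32
  int32_t material;             // offset 40
};

static_assert(offsetof(Vertex, pos) == 0, "pos offset");
static_assert(offsetof(Vertex, normal) == 16, "normal offset");
static_assert(offsetof(Vertex, uv) == 32, "uv offset");
static_assert(offsetof(Vertex, material) == 40, "material offset");
static_assert(sizeof(Vertex) == 48, "array stride of vertices[]");

// Byte offset of vertex i inside the buffer.
constexpr std::size_t vertexByteOffset(std::size_t i) {
  return i * sizeof(Vertex);
}
```

An alternative that avoids the padding altogether is to declare the GLSL members as individual floats, at the cost of reassembling the vectors in the shader.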

Next, we will learn how to use the vertexBuffer object to access vertex data.

How to do it…

The shader code used to access the vertex data looks like this:

void main() {
  Vertex vertex = vertexBuffer.vertices[gl_VertexIndex];
}

Note that the vertex and its attributes are not declared as inputs to the shader. gl_VertexIndex is automatically computed and provided to the shader based on the draw call and the parameters recorded in the indirect command retrieved from the indirect command buffer.
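To make the origin of gl_VertexIndex concrete: for an indexed draw, the device reads a value from the index buffer and adds the command's vertexOffset to it. The following CPU-side sketch emulates that computation for one command. IndexedDraw and vertexIndicesForDraw() are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Subset of the indirect command members that determine gl_VertexIndex.
struct IndexedDraw {
  uint32_t indexCount;
  uint32_t firstIndex;
  int32_t vertexOffset;
};

// Returns the sequence of gl_VertexIndex values the vertex shader would
// see for one indexed draw: index buffer value plus vertexOffset.
std::vector<int32_t> vertexIndicesForDraw(
    const std::vector<uint32_t>& indexBuffer, const IndexedDraw& draw) {
  assert(static_cast<std::size_t>(draw.firstIndex) + draw.indexCount <=
         indexBuffer.size());
  std::vector<int32_t> result;
  result.reserve(draw.indexCount);
  for (uint32_t i = 0; i < draw.indexCount; ++i) {
    const uint32_t index = indexBuffer[draw.firstIndex + i];
    result.push_back(static_cast<int32_t>(index) + draw.vertexOffset);
  }
  return result;
}
```

With a second mesh whose six indices start at position 3 of the index buffer and whose vertices start at element 3 of the vertex buffer, the mesh-relative indices 0, 1, 2, 2, 3, 0 become 3, 4, 5, 5, 6, 3, which are the positions the shader uses to read from vertexBuffer.vertices.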

Index and vertex buffers

Note that both the index and vertex buffers are still provided and bound to the pipeline before the draw call is issued. The index buffer must have the VK_BUFFER_USAGE_INDEX_BUFFER_BIT flag enabled for the technique to work.

Adding flexibility to the rendering pipeline using dynamic rendering

In this recipe, we will delve into the practical application of dynamic rendering in Vulkan to enhance the flexibility of the rendering pipeline. We will guide you through the process of creating pipelines without render passes and framebuffers and discuss how to ensure synchronization. By the end of this recipe, you will be able to use this feature in your projects, simplifying your rendering code and gaining more direct control over synchronization.

Getting ready

To enable the feature, we must have access to the VK_KHR_get_physical_device_properties2 instance extension, instantiate a structure of type VkPhysicalDeviceDynamicRenderingFeatures, and set its dynamicRendering member to true:

const VkPhysicalDeviceDynamicRenderingFeatures dynamicRenderingFeatures = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DYNAMIC_RENDERING_FEATURES,
    .dynamicRendering = VK_TRUE,
};

This structure needs to be plugged into the VkDeviceCreateInfo::pNext member when creating a Vulkan device:

const VkDeviceCreateInfo dci = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = &dynamicRenderingFeatures,
    ...
};

Having grasped the concept of enabling dynamic rendering, we will now move forward and explore its implementation using the Vulkan API.

How to do it…

Instead of creating render passes and framebuffers, we must call the vkCmdBeginRendering command and provide the attachments and their load and store operations using the VkRenderingInfo structure. Each attachment (colors, depth, and stencil) must be specified with instances of the VkRenderingAttachmentInfo structure. Figure 2.13 presents a diagram of the structures participating in a call to vkCmdBeginRendering:

Figure 2.13 – Dynamic rendering structure diagram

Any of the attachments (pColorAttachments, pDepthAttachment, and pStencilAttachment) can be null. Shader output written to location x is written to the color attachment at pColorAttachments[x].

Transferring resources between queue families

In this recipe, we will demonstrate how to transfer resources between queue families by uploading textures to a device from the CPU using a transfer queue and generating mip-level data in a graphics queue. Generating mip levels needs a graphics queue because it utilizes vkCmdBlitImage, supported only by graphics queues.

Getting ready

An example is provided in the repository in chapter2/mainMultiDrawIndirect.cpp, which uses the EngineCore::AsyncDataUploader class to perform texture upload and mipmap generation on different queues.

How to do it…

In the following diagram, we illustrate the procedure of uploading a texture through a transfer queue, followed by the use of a graphics queue for mip generation:

Figure 2.14 – Recording and submitting commands from different threads and transferring a resource between queues from different families

The process can be summarized as follows:

1. Record the commands to upload the texture to the device and add a barrier to release the texture from the transfer queue using the VkDependencyInfo and VkImageMemoryBarrier2 structures, specifying the source queue family as the family of the transfer queue and the destination queue family as the family of the graphics queue.
2. Create a semaphore and use it to signal when the command buffer finishes, and attach it to the submission of the command buffer.
3. Create a command buffer for generating mip levels and add a barrier to acquire the texture from the transfer queue into the graphics queue using the VkDependencyInfo and VkImageMemoryBarrier2 structures.
4. Attach the semaphore created in step 2 to the VkSubmitInfo structure when submitting the command buffer for processing. The semaphore will be signaled when the first command buffer has completed, allowing the mip-level-generation command buffer to start.

Two auxiliary methods will help us create acquire and release barriers for a texture. They exist in the VulkanCore::Texture class. The first one creates an acquire barrier:

void Texture::addAcquireBarrier(
    VkCommandBuffer cmdBuffer,
    uint32_t srcQueueFamilyIndex,
    uint32_t dstQueueFamilyIndex) {
  VkImageMemoryBarrier2 acquireBarrier = {
      .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2,
      .dstStageMask =
          VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT,
      .dstAccessMask = VK_ACCESS_2_MEMORY_READ_BIT,
      .srcQueueFamilyIndex = srcQueueFamilyIndex,
      .dstQueueFamilyIndex = dstQueueFamilyIndex,
      .image = image_,
      .subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT,
                           0, mipLevels_, 0, 1},
  };
  VkDependencyInfo dependency_info{
      .sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO,
      .imageMemoryBarrierCount = 1,
      .pImageMemoryBarriers = &acquireBarrier,
  };
  vkCmdPipelineBarrier2(cmdBuffer, &dependency_info);
}

Besides the command buffer, this function requires the indices of the source and destination queue families. It also assumes a few things, such as the subresource range spanning the entire image.

Another method records the release barrier:

void Texture::addReleaseBarrier(
    VkCommandBuffer cmdBuffer,
    uint32_t srcQueueFamilyIndex,
    uint32_t dstQueueFamilyIndex) {
  VkImageMemoryBarrier2 releaseBarrier = {
      .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2,
      .srcStageMask = VK_PIPELINE_STAGE_2_TRANSFER_BIT,
      .srcAccessMask = VK_ACCESS_2_TRANSFER_WRITE_BIT,
      .dstAccessMask = VK_ACCESS_2_SHADER_READ_BIT,
      .srcQueueFamilyIndex = srcQueueFamilyIndex,
      .dstQueueFamilyIndex = dstQueueFamilyIndex,
      .image = image_,
      .subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT,
                           0, mipLevels_, 0, 1},
  };
  VkDependencyInfo dependency_info{
      .sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO,
      .imageMemoryBarrierCount = 1,
      .pImageMemoryBarriers = &releaseBarrier,
  };
  vkCmdPipelineBarrier2(cmdBuffer, &dependency_info);
}

This method makes the same assumptions as the previous one. The main differences are the source and destination stages and access masks.

To perform the upload and mipmap generation, we create two instances of VulkanCore::CommandQueueManager, one for the transfer queue and another for the graphics queue:

auto transferQueueMgr =
    context.createTransferCommandQueue(
        1, 1, "transfer queue");
auto graphicsQueueMgr =
    context.createGraphicsCommandQueue(
        1, 1, "graphics queue");

With valid VulkanCore::Context and VulkanCore::Texture instances in hand, we can upload the texture by retrieving a command buffer from the transfer family. We also create a staging buffer for transferring the texture data to device-local memory:

VulkanCore::Context context;  // Valid Context
std::shared_ptr<VulkanCore::Texture>
    texture;        // Valid Texture
void* textureData;  // Valid texture data
// Upload texture
auto textureUploadStagingBuffer =
    context.createStagingBuffer(
        texture->vkDeviceSize(),
        VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
        "texture upload staging buffer");
const auto commandBuffer =
    transferQueueMgr.getCmdBufferToBegin();
texture->uploadOnly(commandBuffer,
                    textureUploadStagingBuffer.get(),
                    textureData);
texture->addReleaseBarrier(
    commandBuffer,
    transferQueueMgr.queueFamilyIndex(),
    graphicsQueueMgr.queueFamilyIndex());
transferQueueMgr.endCmdBuffer(commandBuffer);
transferQueueMgr.disposeWhenSubmitCompletes(
    std::move(textureUploadStagingBuffer));

For submitting the command buffer for processing, we create a semaphore to synchronize the upload command buffer and the one used for generating mipmaps:

VkSemaphore graphicsSemaphore;
const VkSemaphoreCreateInfo semaphoreInfo{
    .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
};
VK_CHECK(vkCreateSemaphore(context.device(),
                            &semaphoreInfo, nullptr,
                            &graphicsSemaphore));
VkPipelineStageFlags flags =
    VK_PIPELINE_STAGE_TRANSFER_BIT;
auto submitInfo =
    context.swapchain()->createSubmitInfo(
        &commandBuffer, &flags, false, false);
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = &graphicsSemaphore;
transferQueueMgr.submit(&submitInfo);

The next step is to acquire a new command buffer from the graphics queue family for generating mipmaps. We also create an acquire barrier and reuse the semaphore from the previous command buffer submission:

// Generate mip levels
auto commandBuffer =
    graphicsQueueMgr.getCmdBufferToBegin();
texture->addAcquireBarrier(
    commandBuffer,
    transferQueueMgr.queueFamilyIndex(),
    graphicsQueueMgr.queueFamilyIndex());
texture->generateMips(commandBuffer);
graphicsQueueMgr.endCmdBuffer(commandBuffer);
VkPipelineStageFlags flags =
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
auto submitInfo =
    context.swapchain()->createSubmitInfo(
        &commandBuffer, &flags, false, false);
submitInfo.pWaitSemaphores = &graphicsSemaphore;
submitInfo.waitSemaphoreCount = 1;
graphicsQueueMgr.submit(&submitInfo);
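The body of generateMips() is not shown in this recipe. As a rough illustration of the arithmetic a full mip chain involves (and not the repository's implementation), the sketch below computes the number of mip levels for a 2D image and the extent of each level, which is the information a series of vkCmdBlitImage calls would need. Extent, mipLevelCount(), and mipExtent() are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical 2D extent, similar in spirit to VkExtent2D.
struct Extent {
  uint32_t width;
  uint32_t height;
};

// Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
uint32_t mipLevelCount(uint32_t width, uint32_t height) {
  assert(width > 0 && height > 0);
  uint32_t levels = 1;
  for (uint32_t size = std::max(width, height); size > 1; size /= 2) {
    ++levels;
  }
  return levels;
}

// Extent of a given level: each dimension is halved per level and clamped
// to a minimum of one texel.
Extent mipExtent(uint32_t width, uint32_t height, uint32_t level) {
  assert(level < mipLevelCount(width, height));
  return {std::max<uint32_t>(1, width >> level),
          std::max<uint32_t>(1, height >> level)};
}
```

A 1024 x 512 texture, for example, has 11 levels, and its last level measures 1 x 1 because the height is clamped once it would otherwise reach zero.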

In this chapter, we have navigated the complex landscape of advanced Vulkan programming, building upon the foundational concepts introduced earlier. Our journey encompassed a diverse range of topics, each contributing crucial insights to the realm of high-performance graphics applications. From mastering Vulkan’s intricate memory model and efficient allocation techniques to harnessing the power of the VMA library, we’ve equipped ourselves with the tools to optimize memory management. We explored the creation and manipulation of buffers and images, uncovering strategies for seamless data uploads, staging buffers, and ring-buffer implementations that circumvent data races. The utilization of pipeline barriers to synchronize data access was demystified, while techniques for rendering pipelines, shader customization via specialization constants, and cutting-edge rendering methodologies such as PVP and MDI were embraced. Additionally, we ventured into dynamic rendering approaches without relying on render passes and addressed the intricacies of resource handling across multiple threads and queues. With these profound understandings, you are primed to create graphics applications that harmonize technical prowess with artistic vision using the Vulkan API.

The Modern Vulkan Cookbook: A practical guide to 3D graphics and advanced real-time rendering techniques in Vulkan
