The goal of this chapter is to show you how to render a scene that accepts input information, such as textures and uniform data, from the application side. This chapter will cover advanced topics in the Vulkan API that build upon the core concepts discussed in the previous chapter and present all the information you need to render complex scenes, along with newer features of the API. Additionally, the chapter will demonstrate techniques to enhance the rendering speed.
In this chapter, we’re going to cover the following recipes:
For this chapter, you will need to make sure you have VS 2022 installed along with the Vulkan SDK. Basic familiarity with the C++ programming language and an understanding of OpenGL or any other graphics API will be useful. Please revisit Chapter 1, Vulkan Core Concepts, under the Technical requirements section for details on setting up and building the executables for this chapter. The recipes for this chapter can be run by launching the Chapter02_MultiDrawIndirect.exe executable.
Memory allocation and management are crucial in Vulkan, as almost none of the details of memory usage are managed by Vulkan. Except for deciding the exact memory address where memory should be allocated, all other details are the responsibility of the application. This means the programmer must manage memory types, their sizes, and alignments, as well as any sub-allocations. This approach gives applications more control over memory management and allows developers to optimize their programs for specific uses. This recipe will provide some fundamental information about the types of memory provided by the API as well as a summary of how to allocate and bind that memory to resources.
Graphics cards come in two variants, integrated and discrete. Integrated graphics cards share the same memory as the CPU, as shown in Figure 2.1:
Figure 2.1 – Typical memory architecture for integrated graphics cards
Discrete graphics cards have their own memory (device memory) separate from the main memory (host memory), as shown in Figure 2.2:
Figure 2.2 – Typical memory architecture for discrete graphics cards
Vulkan provides different types of memory:

- Device-local memory: memory that is visible only to the device (GPU)
- Host-visible memory: device memory that can be mapped and accessed from the host (CPU)
- Host-coherent memory: host-visible memory for which host and device accesses are kept coherent automatically, without explicit flushes or invalidations
Figure 2.3 summarizes the three aforementioned types of memory. Device-local memory is not visible from the host, while host-coherent and host-visible are. Copying data from the CPU to the GPU can be done using mapped memory for those two types of memory allocations. For device-local memory, it’s necessary to copy the data from the CPU to host-visible memory first using mapped memory (the staging buffer), and then perform a copy of the data from the staging buffer to the destination, the device-local memory, using a Vulkan function:
Figure 2.3 – Types of memory and their visibility from the application in Vulkan
Images usually reside in device-local memory, as they have their own layout that isn't readily interpretable by the application. Buffers can be of any one of the aforementioned types.
A typical workflow for creating and uploading data to a buffer includes the following steps:
1. Create a VkBuffer by filling in a VkBufferCreateInfo structure and calling vkCreateBuffer.
2. Query the buffer's memory requirements with vkGetBufferMemoryRequirements. The device may require a certain alignment, which could affect the necessary size of the allocation to accommodate the buffer's contents.
3. Fill in a VkMemoryAllocateInfo structure, specify the size of the allocation and the type of memory, and call vkAllocateMemory.
4. Call vkBindBufferMemory to bind the allocation to the buffer object.
5. If the memory is host-visible, map it with vkMapMemory, copy the data, and unmap the memory with vkUnmapMemory.
6. If the memory is device-local, upload the data through a staging buffer with the vkCmdCopyBuffer function.

As you can see, that's a complex procedure that can be simplified by using the Vulkan Memory Allocator (VMA) library, an open source library that provides a convenient and efficient way to manage memory in Vulkan. It offers a high-level interface that abstracts the complex details of memory allocation, freeing you from the burden of manual memory management.
To use VMA, you first need to create an instance of the library and store a handle in a variable of type VmaAllocator. To create one, you need a Vulkan physical device and a device.
Creating a VMA library instance requires filling in two structures: one stores pointers to the API functions that VMA needs in order to find all other function pointers, while the other provides the physical device, device, and instance required to create an allocator:
VkPhysicalDevice physicalDevice;  // Valid physical device
VkDevice device;                  // Valid device
VkInstance instance;              // Valid instance
const uint32_t apiVersion = VK_API_VERSION_1_3;
const VmaVulkanFunctions vulkanFunctions = {
    .vkGetInstanceProcAddr = vkGetInstanceProcAddr,
    .vkGetDeviceProcAddr = vkGetDeviceProcAddr,
#if VMA_VULKAN_VERSION >= 1003000
    .vkGetDeviceBufferMemoryRequirements =
        vkGetDeviceBufferMemoryRequirements,
    .vkGetDeviceImageMemoryRequirements =
        vkGetDeviceImageMemoryRequirements,
#endif
};
VmaAllocator allocator = nullptr;
const VmaAllocatorCreateInfo allocInfo = {
    .physicalDevice = physicalDevice,
    .device = device,
    .pVulkanFunctions = &vulkanFunctions,
    .instance = instance,
    .vulkanApiVersion = apiVersion,
};
vmaCreateAllocator(&allocInfo, &allocator);
The allocator needs pointers to a few Vulkan functions so that it can work based on the features you would like to use. In the preceding case, we provide only the bare minimum for allocating and deallocating memory. The allocator needs to be freed with vmaDestroyAllocator once the context is destroyed.
A buffer in Vulkan is simply a contiguous block of memory that holds some data. The data can be vertex, index, uniform, and more. A buffer object is just metadata and does not directly contain data. The memory associated with a buffer is allocated after a buffer has been created.
Table 2.1 summarizes the most important usage types of buffers and their access type:
Buffer Type     | Access Type | Uses
----------------|-------------|------------------------------
Vertex or Index | Read-only   | Vertex or index data
Uniform         | Read-only   | Uniform data storage
Storage         | Read/write  | Generic data storage
Uniform texel   | Read-only   | Data is interpreted as texels
Storage texel   | Read/write  | Data is interpreted as texels
Creating buffers is easy, but it helps to know what types of buffers exist and what their requirements are before setting out to create them. In this chapter, we will provide a template for creating buffers.
In the repository, Vulkan buffers are managed by the VulkanCore::Buffer
class, which provides functions to create and upload data to the device, as well as a utility function to use a staging buffer to upload data to device-only heaps.
Creating a buffer using VMA is simple:
1. Provide the creation flags (0 for the flags is correct for most cases), the size of the buffer in bytes, and its usage (this is how you define how the buffer will be used), and assign those values to an instance of the VkBufferCreateInfo structure:

VkDeviceSize size;       // The requested size of the buffer
VmaAllocator allocator;  // Valid VMA allocator
VkBufferUsageFlags use;  // Transfer src/dst/uniform/SSBO
VkBuffer buffer;         // The created buffer
const VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = nullptr,
    .flags = {},
    .size = size,
    .usage = use,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .queueFamilyIndexCount = {},
    .pQueueFamilyIndices = {},
};

2. You will also need a VmaAllocationCreateInfo instance that describes the allocation:

const VmaAllocationCreateInfo allocCreateInfo = {
    .flags = VMA_ALLOCATION_CREATE_MAPPED_BIT,
    .usage = VMA_MEMORY_USAGE_CPU_ONLY,
};

3. Call vmaCreateBuffer to obtain the buffer handle and its allocation:

VmaAllocation allocation;  // Needs to live until the
                           // buffer is destroyed
VK_CHECK(vmaCreateBuffer(allocator, &createInfo,
                         &allocCreateInfo, &buffer,
                         &allocation, nullptr));

4. Optionally, you can retrieve information about the allocation with vmaGetAllocationInfo:

VmaAllocationInfo allocationInfo;
vmaGetAllocationInfo(allocator, allocation, &allocationInfo);
Some creation flags affect how the buffer can be used, so you might need to make adjustments to the preceding code depending on how you intend to use the buffers you create in your application.
Uploading data from the application to the GPU depends on the type of buffer. For host-visible buffers, it's a direct copy using memcpy. For device-local buffers, we need a staging buffer: a buffer that is visible to both the CPU and the GPU. In this recipe, we will demonstrate how to upload data from your application to device-visible memory (into a buffer's memory region on the device).
If you haven’t already, please refer to the Understanding Vulkan’s memory model recipe.
The upload process depends on the type of buffer:
1. For host-visible buffers, map the memory with vmaMapMemory and copy the data using memcpy. The operation is synchronous, so the mapped pointer can be unmapped as soon as memcpy returns.

It's fine to map a host-visible buffer as soon as it is created and leave it mapped until its destruction. That is the recommended approach, as you don't incur the overhead of mapping the memory every time it needs to be updated:

VmaAllocator allocator;    // Valid VMA allocator
VmaAllocation allocation;  // Valid VMA allocation
void *data;                // Data to be uploaded
size_t size;               // Size of data in bytes
VkDeviceSize offset = 0;   // Offset into the buffer
void *map = nullptr;
VK_CHECK(vmaMapMemory(allocator, allocation, &map));
memcpy(map, data, size);
vmaUnmapMemory(allocator, allocation);
VK_CHECK(vmaFlushAllocation(allocator, allocation, offset, size));
2. For device-local buffers, first copy the data into a host-visible staging buffer, then copy from the staging buffer into the device-local buffer with vkCmdCopyBuffer, as depicted in Figure 2.4. Note that this requires a command buffer:

Figure 2.4 – Staging buffers

VkDeviceSize srcOffset;
VkDeviceSize dstOffset;
VkDeviceSize size;
VkCommandBuffer commandBuffer;  // Valid command buffer
VkBuffer stagingBuffer;         // Valid host-visible buffer
VkBuffer buffer;                // Valid device-local buffer
const VkBufferCopy region = {
    .srcOffset = srcOffset,
    .dstOffset = dstOffset,
    .size = size,
};
vkCmdCopyBuffer(commandBuffer, stagingBuffer, buffer, 1, &region);
Uploading data from your application to a buffer is accomplished either by a direct memcpy
operation or by means of a staging buffer. We showed how to perform both uploads in this recipe.
Creating a staging buffer is like creating a regular buffer, but requires flags specifying that the buffer is host-visible. In this recipe, we will show how to create a buffer that can be used as a staging buffer: an intermediary destination for data being uploaded from your application on its way to device-local memory.
The Creating buffers recipe explains how to create buffers in general, while this recipe shows which flags and parameters you need to create a staging buffer.
VkBufferCreateInfo::usage
needs to contain VK_BUFFER_USAGE_TRANSFER_SRC_BIT
as it will be the source operation for a vkCmdCopyBuffer
command:
const VkBufferCreateInfo stagingBufferInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .size = size,
    .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
};
const VmaAllocationCreateInfo stagingAllocationCreateInfo = {
    .flags = VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT |
             VMA_ALLOCATION_CREATE_MAPPED_BIT,
    .usage = VMA_MEMORY_USAGE_CPU_ONLY,
};
VmaAllocation allocation;  // Needs to live until the
                           // buffer is destroyed
VK_CHECK(vmaCreateBuffer(allocator, &stagingBufferInfo,
                         &stagingAllocationCreateInfo, &buffer,
                         &allocation, nullptr));
A staging buffer may be better implemented using a wrapper in your application. A wrapper can increase or decrease the size of the buffer as needed, for example. One staging buffer may be enough for your application, but you need to watch the requirements imposed by some architectures.
When a buffer needs to be updated every frame, we run the risk of creating a data race, as shown in Figure 2.5. A data race is a situation where multiple threads within a program concurrently access a shared data point, with at least one thread performing a write operation. This concurrent access can result in unforeseen behavior due to the unpredictable order of operations. Take the example of a uniform buffer that stores the view, model, and viewport matrices and needs to be updated every frame. The buffer is updated while the first command buffer is being recorded, initializing it (version 1). Once the command buffer starts processing on the GPU, the buffer contains the correct data:
Figure 2.5 – Data race when using one buffer
After the first command buffer starts processing in the GPU, the application may try to update the buffer’s contents to version 2 while the GPU is accessing that data for rendering!
Synchronization is by far the hardest aspect of Vulkan. If synchronization elements such as semaphores, fences, and barriers are used too greedily, your application becomes serialized and won't use the full power of the parallelism between the CPU and the GPU.
Make sure you also read the Understanding synchronization in the swapchain – fences and semaphores recipe in Chapter 1, Vulkan Core Concepts. That recipe and this one only scratch the surface of how to tackle synchronization, but are very good starting points.
A ring-buffer implementation is provided in the repository by the EngineCore::RingBuffer class, which has a configurable number of sub-buffers. Its sub-buffers are all host-visible, persistent buffers; that is, they are persistently mapped after creation for ease of access.
There are a few ways to avoid this problem, but the easiest one is to create a ring buffer that contains several buffers (or any other resource) equal to the number of frames in flight. Figure 2.6 shows events when there are two buffers available. Once the first command buffer is submitted and is being processed in the GPU, the application is free to process copy 1 of the buffer, as it’s not being accessed by the device:
Figure 2.6 – A data race is avoided with multiple copies of a resource
Even though this is a simple solution, it has a caveat: if partial updates are allowed, care must be taken when the buffer is updated. Consider Figure 2.7, in which a ring buffer that contains three sub-allocations is partially updated. The buffer stores the view, model, and viewport matrices. During initialization, all three sub-allocations are initialized to three identity matrices. On Frame 0, while Buffer 0 is active, the model matrix is updated and now contains a translation of (10, 10, 0)
. On the next frame, Frame 1, Buffer 1 becomes active, and the viewport matrix is updated. Because Buffer 1 was initialized to three identity matrices, updating only the viewport matrix makes buffers 0 and 1 out of sync (as well as Buffer 2). To guarantee that partial updates work, we need to copy the last active buffer, Buffer 0, into Buffer 1 first, and then update the viewport matrix:
Figure 2.7 – Partial update of a ring buffer makes all sub-allocations out of sync if they are not replicated
Synchronization is a delicate topic, and guaranteeing that your application behaves correctly with so many moving parts is tricky. Hopefully, a simple ring-buffer implementation will help you focus on other areas of the code.
In Vulkan, commands may be reordered when a command buffer is being processed, subject to certain restrictions. This is known as command buffer reordering, and it can help to improve performance by allowing the driver to optimize the order in which commands are executed.
The good news is that Vulkan provides a mechanism called pipeline barriers to ensure that dependent commands are executed in the correct order. They are used to explicitly specify dependencies between commands, preventing them from being reordered, and at what stages they might overlap. This recipe will explain what pipeline barriers are and what their properties mean. It will also show you how to create and install pipeline barriers.
Consider two draw calls issued in sequence. The first one writes to a color attachment, while the second draw call samples from that attachment in the fragment shader:
vkCmdDraw(...);  // draws into color attachment 0
vkCmdDraw(...);  // reads from color attachment 0
Figure 2.8 helps visualize how those two commands may be processed by the device. In the diagram, commands are processed from top to bottom and progress through the pipeline from left to right. Clock cycles is a loose term here, because processing a stage may take multiple clock cycles; it is used only to indicate that, in general, some tasks must happen after others.
In the example, the second vkCmdDraw
call starts executing at C2, after the first draw call. This offset is not enough, as the second draw call needs to read the color attachment at the Fragment Shader stage, which is not produced by the first draw call until it reaches the Color Attach Output stage. Without synchronization, this setup may cause data races:
Figure 2.8 – Two consecutive commands recorded on the same command buffer being processed without synchronization
A pipeline barrier is recorded into the command buffer and specifies which pipeline stages must have completed for all commands that appear before the barrier before processing can continue past it. Commands recorded before the barrier are said to be in the first synchronization scope, or first scope. Commands recorded after the barrier are said to be part of the second synchronization scope, or second scope.
The barrier also allows fine-grained control to specify at which stage commands after the barrier must wait until commands in the first scope finish processing. That’s because commands in the second scope don’t need to wait until commands in the first scope are done. They can start processing as soon as possible, as long as the conditions specified in the barrier are met.
In the example in Figure 2.8, the first draw call, in the first scope, needs to write to the attachment before the second draw call can access it. The second draw call does not need to wait until the first draw call finishes processing the Color Attach Output stage. It can start right away, as long as its fragment stage happens after the first draw call is done with its Color Attach Output stage, as shown in Figure 2.9:
Figure 2.9 – Two consecutive commands recorded on the same command buffer being processed with synchronization
There are three types of barriers:

- Memory barriers (VkMemoryBarrier): global barriers that apply to all memory accesses in the first and second scopes
- Buffer memory barriers (VkBufferMemoryBarrier): barriers that apply to memory accesses to a range of a specific buffer
- Image memory barriers (VkImageMemoryBarrier): barriers that apply to a subresource range of a specific image and that can also perform image layout transitions. When generating mipmaps, for instance, the previous mip level needs to be in the VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL layout, as it will be read from, while the next mip level needs to be in the VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL layout, as it will be written to.

Pipeline barriers are recorded with the vkCmdPipelineBarrier command, in which you can provide several barriers of multiple types at the same time. The following code snippet shows how to create a barrier used to create a dependency between the two draw calls in Figure 2.9:
VkCommandBuffer commandBuffer;  // Valid command buffer
VkImage image;                  // Valid image
const VkImageSubresourceRange subresource = {
    .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
    .baseMipLevel = 0,
    .levelCount = VK_REMAINING_MIP_LEVELS,
    .baseArrayLayer = 0,
    .layerCount = 1,
};
const VkImageMemoryBarrier imageBarrier = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
    .oldLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL,
    .newLayout = VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL,
    .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
    .image = image,
    .subresourceRange = subresource,
};
vkCmdPipelineBarrier(commandBuffer,
                     VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                     0, 0, nullptr, 0, nullptr, 1, &imageBarrier);
The barrier needs to be recorded between the two draw calls:
vkCmdDraw(...);  // draws into color attachment 0
vkCmdPipelineBarrier(...);
vkCmdDraw(...);  // reads from color attachment 0
Pipeline barriers are tricky but absolutely fundamental in Vulkan. Make sure you understand what they offer and how they operate before continuing to read the other recipes.
Images are used for storing 1D, 2D, or 3D data, although they are mostly used for 2D data. Unlike buffers, images have the advantage of a memory layout optimized for locality of access. This is because most GPUs have a fixed-function texture unit, or sampler, that reads texel data from an image and applies filtering and other operations to produce a final color value. Images can have different formats, such as RGB, RGBA, BGRA, and so on.
An image object is only metadata in Vulkan. Its data is stored separately and is created in a similar manner to buffers (Figure 2.10):
Figure 2.10 – Images
Images in Vulkan cannot be accessed directly by the pipeline; they must be accessed through an image view. An image view is a way to access a subset of the image data by specifying the subresource range, which includes the aspect (such as color or depth), the mip level, and the array layer range.
Another very important aspect of images is their layout. It is used to specify the intended usage of an image resource in Vulkan, such as whether it should be used as a source or destination for a transfer operation, a color or depth attachment for rendering, or as a shader read or write resource. The correct image layout is important because it ensures that the GPU can efficiently access and manipulate the image data in accordance with the intended usage. Using the wrong image layout can lead to performance issues or rendering artifacts and can result in undefined behavior. Therefore, it's essential to correctly specify the image layout for each usage of an image in a Vulkan application. Common image layouts are undefined (VK_IMAGE_LAYOUT_UNDEFINED), color attachment (VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL), depth/stencil attachment (VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL), and shader read (VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL). Image layout transitions are done as part of the vkCmdPipelineBarrier command.
In this recipe, you will learn how to create images on a device.
In the VulkanCore::Texture
class within our repository, we’ve encapsulated the intricate management of images and image views, offering a comprehensive solution for handling Vulkan textures. From facilitating efficient data uploads to handling transitions between image layouts and generating mipmaps, the Texture
class equips us with the means to seamlessly integrate textures in the Vulkan examples.
Creating an image requires some basic information about it, such as type (1D, 2D, 3D), size, format (RGBA, BGRA, and so on), number of mip levels, number of layers (faces for cubemaps), and a few others:
VkFormat format;      // Image format
VkExtent3D extents;   // Image size
uint32_t mipLevels;   // Number of mip levels
uint32_t layerCount;  // Number of layers (sides of cubemap)
const VkImageCreateInfo imageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .flags = 0,                     // optional
    .imageType = VK_IMAGE_TYPE_2D,  // 1D, 2D, 3D
    .format = format,
    .extent = extents,
    .mipLevels = mipLevels,
    .arrayLayers = layerCount,
    .samples = VK_SAMPLE_COUNT_1_BIT,
    .tiling = VK_IMAGE_TILING_OPTIMAL,
    .usage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
};
The following structure tells VMA that the image will be a device-only image:
const VmaAllocationCreateInfo allocCreateInfo = {
    .flags = VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT,
    .usage = VMA_MEMORY_USAGE_AUTO_PREFER_DEVICE,
    .priority = 1.0f,
};
The resulting image’s handle will be stored in image
:
VkImage image = VK_NULL_HANDLE;
VK_CHECK(vmaCreateImage(vmaAllocator_, &imageInfo, &allocCreateInfo,
                        &image, &vmaAllocation_, nullptr));
The next step is optional but useful for debugging or optimizing the code:
VmaAllocationInfo allocationInfo;
vmaGetAllocationInfo(vmaAllocator_, vmaAllocation_, &allocationInfo);
This recipe only showed you how to create an image in Vulkan, not how to upload data to it. Uploading data to an image is just like uploading data to a buffer.
Image views provide a way to interpret an image in terms of size, location, and format. An image's layout, however, is not part of the view and needs to be transitioned explicitly using image barriers. In this recipe, you will learn how to create an image view object in Vulkan.
Image views are stored and managed by the VulkanCore::Texture
class in the repository.
Creating an image view is easy; all you need is the handle of the image it is associated with and the region of the image that you would like to represent:
VkDevice device;        // Valid Vulkan device
VkImage image;          // Valid image object
VkFormat format;
uint32_t numMipLevels;  // Number of mip levels
uint32_t layers;        // Number of layers (cubemap faces)
const VkImageViewCreateInfo imageViewInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .image = image,
    .viewType = VK_IMAGE_VIEW_TYPE_2D,  // 1D, 2D, 3D, cubemap,
                                        // and arrays
    .format = format,
    .components =
        {
            .r = VK_COMPONENT_SWIZZLE_IDENTITY,
            .g = VK_COMPONENT_SWIZZLE_IDENTITY,
            .b = VK_COMPONENT_SWIZZLE_IDENTITY,
            .a = VK_COMPONENT_SWIZZLE_IDENTITY,
        },
    .subresourceRange =
        {
            .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
            .baseMipLevel = 0,
            .levelCount = numMipLevels,
            .baseArrayLayer = 0,
            .layerCount = layers,
        },
};
VkImageView imageView{VK_NULL_HANDLE};
VK_CHECK(vkCreateImageView(device, &imageViewInfo, nullptr, &imageView));
Without an image view, a texture cannot be used by shaders. Even when used as color attachments, images need image views.
A sampler in Vulkan is more than a simple object; it's the bridge between shader execution and image data. It governs filtering, addressing modes, and mipmapping. Filters dictate how values are interpolated between texels, while addressing modes control how coordinates outside the image extents are mapped. Anisotropic filtering further enhances sampling fidelity, and mipmapping, a pyramid of downsampled image levels, is also configured through the sampler. Creating a sampler amounts to specifying these attributes so that image data is presented to shaders the way you intend. In this recipe, you will learn how to create a sampler object in Vulkan.
Samplers are implemented by the VulkanCore::Sampler
class in the repository.
The properties of a sampler define how an image is interpreted in the pipeline, usually in a shader. The process is simple – instantiate a VkSamplerCreateInfo
structure and call vkCreateSampler
:
VkDevice device;     // Valid Vulkan device
VkFilter minFilter;  // Minification filter
VkFilter magFilter;  // Magnification filter
float maxLod;        // Max mip level
const VkSamplerCreateInfo samplerInfo = {
    .sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
    .magFilter = magFilter,
    .minFilter = minFilter,
    .mipmapMode = maxLod > 0 ? VK_SAMPLER_MIPMAP_MODE_LINEAR
                             : VK_SAMPLER_MIPMAP_MODE_NEAREST,
    .addressModeU = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .addressModeV = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .addressModeW = VK_SAMPLER_ADDRESS_MODE_REPEAT,
    .mipLodBias = 0,
    .anisotropyEnable = VK_FALSE,
    .minLod = 0,
    .maxLod = maxLod,
};
VkSampler sampler{VK_NULL_HANDLE};
VK_CHECK(vkCreateSampler(device, &samplerInfo, nullptr, &sampler));
A sampler is one of the simplest objects to create in Vulkan and one of the easiest to understand, as it describes very common computer graphics concepts.
Providing data from your application for use in shaders is one of the most convoluted aspects of Vulkan and requires several steps that need to be accomplished in the right order (and with the right parameters). In this recipe, and the smaller ones that follow, you will learn how to provide data used in shaders, such as textures, buffers, and samplers.
Resources consumed by shaders are specified using the layout keyword, along with the set and binding qualifiers:

layout(set = 0, binding = 0) uniform Transforms {
  mat4 model;
  mat4 view;
  mat4 projection;
} MVP;
Each resource is represented by a binding. A set is a collection of bindings. One binding doesn’t necessarily represent just one resource; it can also represent an array of resources of the same type.
Providing a resource as input to shaders is a multi-step process that involves the following:

1. Creating a descriptor set layout that describes the sets and bindings used by the shaders.
2. Creating a descriptor pool from which descriptor sets are allocated.
3. Allocating a descriptor set with vkAllocateDescriptorSets.
4. Updating the descriptor set with vkUpdateDescriptorSets. In this step, we associate a real resource (a buffer, a texture, and so on) with a binding.
5. Binding the descriptor set during rendering with vkCmdBindDescriptorSets. This step makes resources bound to their set/bindings in the previous step available to shaders in the current pipeline.

The next recipes will show you how to perform each one of those steps.
Consider the following GLSL code, which specifies several resources:
struct Vertex {
  vec3 pos;
  vec2 uv;
  vec3 normal;
};
layout(set = 0, binding = 0) uniform Transforms {
  mat4 model;
  mat4 view;
  mat4 projection;
} MVP;
layout(set = 1, binding = 0) uniform texture2D textures[];
layout(set = 1, binding = 1) uniform sampler samplers[];
layout(set = 2, binding = 0) readonly buffer VertexBuffer {
  Vertex vertices[];
} vertexBuffer;
The code requires three sets (0, 1, and 2), so we need to create three descriptor set layouts. In this recipe, you will learn how to create a descriptor set layout for the preceding code.
Descriptor sets and bindings are created, stored, and managed by the VulkanCore::Pipeline class in the repository. A descriptor set in Vulkan acts as a container that holds resources, such as buffers, textures, and samplers, for use by shaders. Binding refers to the process of associating these descriptor sets with specific shader stages, enabling interaction between shaders and resources during rendering. The class simplifies descriptor set creation and management and provides methods for efficient resource binding within the Vulkan rendering pipeline.
A descriptor set layout states its bindings (their number and types) and is created with the VkDescriptorSetLayoutCreateInfo structure and a call to vkCreateDescriptorSetLayout. Each binding is described using an instance of the VkDescriptorSetLayoutBinding structure. The relationship between the Vulkan structures needed to create a descriptor set layout for the preceding code is shown in Figure 2.11:
Figure 2.11 – Illustrating the configuration of descriptor set layouts for GLSL shaders
The following code shows how to specify two bindings for set 1, which are stored in a vector of bindings:
constexpr uint32_t kMaxBindings = 1000;
const VkDescriptorSetLayoutBinding texBinding = {
    .binding = 0,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    .descriptorCount = kMaxBindings,
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
};
const VkDescriptorSetLayoutBinding samplerBinding = {
    .binding = 1,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER,
    .descriptorCount = kMaxBindings,
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
};
struct SetDescriptor {
  uint32_t set_;
  std::vector<VkDescriptorSetLayoutBinding> bindings_;
};
std::vector<SetDescriptor> sets(1);
sets[0].set_ = 1;
sets[0].bindings_.push_back(texBinding);
sets[0].bindings_.push_back(samplerBinding);
Since each binding describes an array of resources, and the VkDescriptorSetLayoutBinding structure requires the number of descriptors up front, we use a large number that hopefully will accommodate all the elements we need in the array. The vector of bindings is stored in a structure that describes a set with its number and all its bindings. This vector will be used to create a descriptor set layout:
constexpr VkDescriptorBindingFlags flagsToEnable =
    VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT |
    VK_DESCRIPTOR_BINDING_UPDATE_UNUSED_WHILE_PENDING_BIT;
for (size_t setIndex = 0; const auto& set : sets) {
  std::vector<VkDescriptorBindingFlags> bindFlags(
      set.bindings_.size(), flagsToEnable);
  const VkDescriptorSetLayoutBindingFlagsCreateInfo extendedInfo{
      .sType =
          VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO,
      .pNext = nullptr,
      .bindingCount = static_cast<uint32_t>(set.bindings_.size()),
      .pBindingFlags = bindFlags.data(),
  };
  const VkDescriptorSetLayoutCreateInfo dslci = {
      .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
      .pNext = &extendedInfo,
      .flags =
          VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT_EXT,
      .bindingCount = static_cast<uint32_t>(set.bindings_.size()),
      .pBindings = set.bindings_.data(),
  };
  VkDescriptorSetLayout descSetLayout{VK_NULL_HANDLE};
  VK_CHECK(vkCreateDescriptorSetLayout(context_->device(), &dslci,
                                       nullptr, &descSetLayout));
}
Each set requires its own descriptor set layout, and the preceding process needs to be repeated for each one. The descriptor set layout needs to be stored so that it can be referred to in the future.
Push constants are another way to pass data to shaders. Although they are a very performant and easy way to do so, push constants are very limited in size: 128 bytes is the only amount guaranteed by the Vulkan specification.
This recipe will show you how to pass a small amount of data from your application to shaders, using push constants for a simple shader.
Push constants are stored and managed by the VulkanCore::Pipeline class.
Push constants are recorded directly onto the command buffer and aren't prone to the same synchronization issues that exist with other resources. They are declared in the shader as follows, with at most one push constant block per shader:
layout (push_constant) uniform Transforms {
  mat4 model;
} PushConstants;
The pushed data may be split across shader stages: parts of it can be assigned to different stages, or all of it to a single stage. The only restriction is that the total cannot exceed the amount available for push constants, a device limit provided in VkPhysicalDeviceLimits::maxPushConstantsSize.
Before using push constants, we need to specify how many bytes we are using in each shader stage:
const VkPushConstantRange range = {
    .stageFlags = VK_SHADER_STAGE_VERTEX_BIT,
    .offset = 0,
    .size = 64,
};
std::vector<VkPushConstantRange> pushConsts;
pushConsts.push_back(range);
The code states that the first (offset == 0) 64 bytes of the push constant data recorded in the command buffer (the size of a 4x4 matrix of floats) will be used by the vertex shader. This structure will be used in the next recipe to create a pipeline layout object.
A pipeline layout is an object in Vulkan that needs to be created and destroyed by the application. The layout is specified using structures that define the layout of bindings and sets. In this recipe, you will learn how to create a pipeline layout.
A VkPipelineLayoutCreateInfo instance is created automatically by the VulkanCore::Pipeline class in the repository based on information provided by the application using a vector of VulkanCore::Pipeline::SetDescriptor structures.
With all descriptor set layouts for all sets and the push constant information in hand, the next step consists of creating a pipeline layout:
std::vector<VkDescriptorSetLayout> descLayouts;
const VkPipelineLayoutCreateInfo pipelineLayoutInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .setLayoutCount = static_cast<uint32_t>(descLayouts.size()),
    .pSetLayouts = descLayouts.data(),
    .pushConstantRangeCount = static_cast<uint32_t>(pushConsts.size()),
    .pPushConstantRanges =
        !pushConsts.empty() ? pushConsts.data() : nullptr,
};
VkPipelineLayout pipelineLayout{VK_NULL_HANDLE};
VK_CHECK(vkCreatePipelineLayout(context_->device(), &pipelineLayoutInfo,
                                nullptr, &pipelineLayout));
Once you have the descriptor set layout in hand and know how to use the push constants in your application, creating a pipeline layout is straightforward.
A descriptor pool maintains the maximum number of descriptors that can be allocated from it, grouped by descriptor type. For instance, if two bindings of the same set require one image each, the descriptor pool would have to provide at least two image descriptors. In this recipe, you will learn how to create a descriptor pool.
Descriptor pools are allocated in the VulkanCore::Pipeline::initDescriptorPool() method.
Creating a descriptor pool is straightforward. All we need is a list of binding types and the maximum number of resources we’ll allocate for each one:
constexpr uint32_t swapchainImages = 3;
std::vector<VkDescriptorPoolSize> poolSizes;
poolSizes.emplace_back(VkDescriptorPoolSize{
    VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, swapchainImages * kMaxBindings});
poolSizes.emplace_back(VkDescriptorPoolSize{
    VK_DESCRIPTOR_TYPE_SAMPLER, swapchainImages * kMaxBindings});
Since we duplicate the resources based on the number of swapchain images to avoid data races between the CPU and the GPU, we multiply the number of bindings we requested before (kMaxBindings = 1000) by the number of swapchain images:
const VkDescriptorPoolCreateInfo descriptorPoolInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,
    .flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT |
             VK_DESCRIPTOR_POOL_CREATE_UPDATE_AFTER_BIND_BIT,
    .maxSets = MAX_DESCRIPTOR_SETS,
    .poolSizeCount = static_cast<uint32_t>(poolSizes.size()),
    .pPoolSizes = poolSizes.data(),
};
VkDescriptorPool descriptorPool{VK_NULL_HANDLE};
VK_CHECK(vkCreateDescriptorPool(context_->device(), &descriptorPoolInfo,
                                nullptr, &descriptorPool));
Be careful not to create pools that are too large. Achieving a high-performing application means not allocating more resources than you need.
Once a descriptor layout and a descriptor pool have been created, before you can use it, you need to allocate a descriptor set, which is an instance of a set with the layout described by the descriptor layout. In this recipe, you will learn how to allocate a descriptor set.
Descriptor set allocations are done in the VulkanCore::Pipeline::allocateDescriptors() method. Here, developers define how many descriptor sets are required, along with the binding counts for each set. The subsequent bindDescriptorSets() method records the descriptor set bindings into command buffers, preparing them for shader execution.
Allocating a descriptor set (or a number of them) is easy. You need to fill the VkDescriptorSetAllocateInfo structure and call vkAllocateDescriptorSets:
VkDescriptorSetAllocateInfo allocInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,
    .descriptorPool = descriptorPool,
    .descriptorSetCount = 1,
    .pSetLayouts = &descSetLayout,
};
VkDescriptorSet descriptorSet{VK_NULL_HANDLE};
VK_CHECK(vkAllocateDescriptorSets(context_->device(), &allocInfo,
                                  &descriptorSet));
When using multiple copies of a resource to avoid race conditions, there are two approaches:
Once a descriptor set has been allocated, it is not associated with any resources. This association must happen once (if your descriptor sets are immutable) or every time you need to bind a different resource to a descriptor set. In this recipe, you will learn how to update descriptor sets during rendering and after you have set up the pipeline and its layout.
In the repository, VulkanCore::Pipeline provides methods to update different types of resources, as each binding can only be associated with one type of resource (image, sampler, or buffer): updateSamplersDescriptorSets(), updateTexturesDescriptorSets(), and updateBuffersDescriptorSets().
Associating a resource with a descriptor set is done with the vkUpdateDescriptorSets function. Each call to vkUpdateDescriptorSets can update one or more bindings of one or more sets. Before updating a descriptor set, let's look at how to update one binding.
You can associate either a texture, a texture array, a sampler, a sampler array, a buffer, or a buffer array with one binding. To associate images or samplers, use the VkDescriptorImageInfo structure. To associate buffers, use the VkDescriptorBufferInfo structure. Once one or more of those structures have been instantiated, use the VkWriteDescriptorSet structure to bind them all with a binding. Bindings that represent an array are updated with a vector of VkDescriptor*Info. Consider the following declarations in a shader:
layout(set = 1, binding = 0) uniform texture2D textures[];
layout(set = 1, binding = 1) uniform sampler samplers[];
layout(set = 2, binding = 0) readonly buffer VertexBuffer {
  Vertex vertices[];
} vertexBuffer;
To bind two textures to the textures[] array, we need to create two instances of VkDescriptorImageInfo and record them in the first VkWriteDescriptorSet structure:
VkImageView imageViews[2]; // Valid Image View objects
VkDescriptorImageInfo texInfos[] = {
    VkDescriptorImageInfo{
        .imageView = imageViews[0],
        .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    },
    VkDescriptorImageInfo{
        .imageView = imageViews[1],
        .imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    },
};
const VkWriteDescriptorSet texWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = 1, // in practice, the VkDescriptorSet handle allocated for set 1
    .dstBinding = 0,
    .dstArrayElement = 0,
    .descriptorCount = 2,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    .pImageInfo = texInfos,
    .pBufferInfo = nullptr,
};
This records the two textures into set 1 (.dstSet = 1) at binding 0 (.dstBinding = 0), as elements 0 and 1 of the array. If you need to bind more objects to the array, all you need are more instances of VkDescriptorImageInfo. The number of objects bound to the current binding is specified by the descriptorCount member of the structure. The process is similar for sampler objects:
VkSampler sampler[2]; // Valid Sampler objects
VkDescriptorImageInfo samplerInfos[] = {
    VkDescriptorImageInfo{
        .sampler = sampler[0],
    },
    VkDescriptorImageInfo{
        .sampler = sampler[1],
    },
};
const VkWriteDescriptorSet samplerWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = 1,
    .dstBinding = 1,
    .dstArrayElement = 0,
    .descriptorCount = 2,
    .descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER, // samplers, not sampled images
    .pImageInfo = samplerInfos,
    .pBufferInfo = nullptr,
};
This time, we are binding the sampler objects to set 1, binding 1. Buffers are bound using the VkDescriptorBufferInfo structure:
VkBuffer buffer; // Valid Buffer object
VkDeviceSize bufferLength; // Range of the buffer
const VkDescriptorBufferInfo bufferInfo = {
    .buffer = buffer,
    .offset = 0,
    .range = bufferLength,
};
const VkWriteDescriptorSet bufferWriteDescSet = {
    .sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
    .dstSet = 2,
    .dstBinding = 0,
    .dstArrayElement = 0,
    .descriptorCount = 1,
    .descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, // matches the readonly buffer block
    .pImageInfo = nullptr,
    .pBufferInfo = &bufferInfo,
};
Besides storing the address of the bufferInfo variable in the .pBufferInfo member of VkWriteDescriptorSet, we are binding one buffer (.descriptorCount = 1) to set 2 (.dstSet = 2) and binding 0 (.dstBinding = 0).
The final step consists of collecting all VkWriteDescriptorSet instances in a vector and calling vkUpdateDescriptorSets:
VkDevice device; // Valid Vulkan Device
std::vector<VkWriteDescriptorSet> writeDescSets;
writeDescSets.push_back(texWriteDescSet);
writeDescSets.push_back(samplerWriteDescSet);
writeDescSets.push_back(bufferWriteDescSet);
vkUpdateDescriptorSets(device,
                       static_cast<uint32_t>(writeDescSets.size()),
                       writeDescSets.data(), 0, nullptr);
Encapsulating this task is the best way to avoid repetition and bugs introduced by forgetting a step in the update procedure.
While rendering, we need to bind the descriptor sets we’d like to use during a draw call.
Binding sets is done with the VulkanCore::Pipeline::bindDescriptorSets() method.
To bind a descriptor set for rendering, we need to call vkCmdBindDescriptorSets:
VkCommandBuffer commandBuffer; // Valid Command Buffer
VkPipelineLayout pipelineLayout; // Valid Pipeline layout
uint32_t set; // Set number
VkDescriptorSet descSet; // Valid Descriptor Set
vkCmdBindDescriptorSets(commandBuffer,
                        VK_PIPELINE_BIND_POINT_GRAPHICS,
                        pipelineLayout, set, 1u, &descSet, 0, nullptr);
Now that we’ve successfully bound a descriptor set for rendering, let’s turn our attention to another crucial aspect of our graphics pipeline: updating push constants.
Push constants are updated during rendering by recording their values directly into the command buffer being recorded.
Updating push constants is done with the VulkanCore::Pipeline::updatePushConstants() method.
During rendering, updating push constants is straightforward. All you need to do is call vkCmdPushConstants:
VkCommandBuffer commandBuffer; // Valid Command Buffer
VkPipelineLayout pipelineLayout; // Valid Pipeline Layout
glm::mat4 mat; // Valid matrix
vkCmdPushConstants(commandBuffer, pipelineLayout,
                   VK_SHADER_STAGE_VERTEX_BIT, 0, sizeof(glm::mat4),
                   &mat);
This call records the contents of mat into the command buffer, starting at offset 0, and signals that this data will be used by the vertex shader.
Compiled shader code is immutable once created. Compilation carries a substantial time overhead, so it is generally avoided at runtime. Even minor adjustments to a shader necessitate recompilation, leading to the creation of a fresh shader module, and potentially a new pipeline as well, all of which are resource-intensive operations.
In Vulkan, specialization constants allow you to specify constant values for shader parameters at pipeline creation time, instead of having to recompile the shader with new values every time you want to change them. This can be particularly useful when you want to reuse the same shader with different constant values multiple times. In this recipe, we will delve deeper into the practical application of specialization constants in Vulkan to create more efficient and flexible shader programs, allowing you to adjust without the need for resource-intensive recompilations.
Specialization constants are available in the repository through the VulkanCore::Pipeline::GraphicsPipelineDescriptor structure. You need to provide a vector of VkSpecializationMapEntry structures for each shader type you'd like to apply specialization constants to.
Specialization constants are declared in GLSL using the constant_id qualifier along with an integer that specifies the constant's ID:
layout (constant_id = 0) const bool useShaderDebug = false;
To create a pipeline with specialized constant values, you first need to create a VkSpecializationInfo structure that specifies the constant values and their IDs. You then pass this structure to the VkPipelineShaderStageCreateInfo structure when creating a pipeline:
const bool kUseShaderDebug = false;
const VkSpecializationMapEntry useShaderDebug = {
    .constantID = 0, // matches the constant_id qualifier
    .offset = 0,
    .size = sizeof(bool),
};
const VkSpecializationInfo vertexSpecializationInfo = {
    .mapEntryCount = 1,
    .pMapEntries = &useShaderDebug,
    .dataSize = sizeof(bool),
    .pData = &kUseShaderDebug,
};
const VkPipelineShaderStageCreateInfo shaderStageInfo = {
    ...
    .pSpecializationInfo = &vertexSpecializationInfo,
};
Because specialization constants are real constants, branches that depend on them may be entirely removed during the final compilation of the shader. On the other hand, specialization constants are not a substitute for uniforms: they are less flexible, because their values must be known when the pipeline is created.
MDI and PVP are features of modern graphics APIs that allow for greater flexibility and efficiency in vertex processing.
MDI allows issuing multiple draw calls with a single command, each of which derives its parameters from a buffer stored in the device (hence the indirect term). This is particularly useful because those parameters can be modified in the GPU itself.
With PVP, each shader instance retrieves its vertex data based on its index and instance IDs instead of being initialized with the vertex’s attributes. This allows for flexibility because the vertex attributes and their format are not baked into the pipeline and can be changed solely based on the shader code.
In the first sub-recipe, we will focus on the implementation of MDI, demonstrating how this powerful tool can streamline your graphics operations by allowing multiple draw calls to be issued from a single command, with parameters that can be modified directly in the GPU. In the following sub-recipe, we will guide you through the process of setting up PVP, highlighting how the flexibility of this feature can enhance your shader code by enabling changes to vertex attributes without modifying the pipeline.
For using MDI, we store all mesh data belonging to the scene in one big buffer for all the meshes’ vertices and another one for the meshes’ indices, with the data for each mesh stored sequentially, as depicted in Figure 2.12.
The drawing parameters are stored in an extra buffer. They must be stored sequentially, one for each mesh, although they don’t have to be provided in the same order as the meshes:
Figure 2.12 – MDI data layout
We will now learn how to implement MDI using the Vulkan API.
In the repository, we provide a utility function to decompose an EngineCore::Model object into multiple buffers suitable for an MDI implementation, called EngineCore::convertModel2OneBuffer(), located in GLBLoader.cpp.
Let’s begin by looking at the indirect draw parameters’ buffer.
The commands are stored following the same layout as the VkDrawIndexedIndirectCommand structure:
typedef struct VkDrawIndexedIndirectCommand {
  uint32_t indexCount;
  uint32_t instanceCount;
  uint32_t firstIndex;
  int32_t vertexOffset;
  uint32_t firstInstance;
} VkDrawIndexedIndirectCommand;
indexCount specifies how many indices are part of this command and, in our case, is the number of indices for a mesh. One command reflects one mesh, so its instanceCount value is one. The firstIndex member is the position of the first index element in the index buffer to use for this mesh, while vertexOffset is the offset of the mesh's first vertex in the vertex buffer. An example with the correct offsets is shown in Figure 2.12.
Once the vertex, index, and indirect commands buffers are bound, calling vkCmdDrawIndexedIndirect consists of providing the buffer with the indirect commands and an offset into the buffer. The rest is done by the device:
VkCommandBuffer commandBuffer; // Valid Command Buffer
VkBuffer indirectCmdBuffer;    // Valid buffer w/ indirect commands
uint32_t meshCount;  // Number of indirect commands in the buffer
uint32_t offset = 0; // Offset into the indirect commands buffer
vkCmdDrawIndexedIndirect(commandBuffer, indirectCmdBuffer, offset,
                         meshCount,
                         sizeof(VkDrawIndexedIndirectCommand));
In this recipe, we learned how to utilize vkCmdDrawIndexedIndirect, a key function in Vulkan that allows for high-efficiency drawing.
The PVP technique allows vertex data and their attributes to be extracted from buffers with custom code instead of relying on the pipeline to provide them to vertex shaders.
We will use the following structures to perform the extraction of vertex data – the Vertex structure, which encodes the vertex's position (pos), normal, UV coordinates (uv), and its material index (material):
struct Vertex {
  vec3 pos;
  vec3 normal;
  vec2 uv;
  int material;
};
We will also use a buffer object, referred to in the shader as VertexBuffer:
layout(set = 2, binding = 0) readonly buffer VertexBuffer {
  Vertex vertices[];
} vertexBuffer;
Next, we will learn how to use the vertexBuffer object to access vertex data.
The shader code used to access the vertex data looks like this:
void main() {
  Vertex vertex = vertexBuffer.vertices[gl_VertexIndex];
}
Note that the vertex and its attributes are not declared as inputs to the shader. gl_VertexIndex is automatically computed and provided to the shader based on the draw call and the parameters recorded in the indirect command retrieved from the indirect command buffer.
Index and vertex buffers
Note that both the index and vertex buffers are still provided and bound to the pipeline before the draw call is issued. The index buffer must have the VK_BUFFER_USAGE_INDEX_BUFFER_BIT flag enabled for the technique to work.
In this recipe, we will delve into the practical application of dynamic rendering in Vulkan to enhance the flexibility of the rendering pipeline. We will guide you through the process of creating pipelines without the need for render passes and framebuffers and discuss how to ensure synchronization. By the end of this section, you will have learned how to implement this feature in your projects, thereby simplifying your rendering process by eliminating the need for render passes and framebuffers and giving you more direct control over synchronization.
To enable the feature, we must have access to the VK_KHR_get_physical_device_properties2 instance extension, instantiate a structure of type VkPhysicalDeviceDynamicRenderingFeatures, and set its dynamicRendering member to true:
const VkPhysicalDeviceDynamicRenderingFeatures dynamicRenderingFeatures = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DYNAMIC_RENDERING_FEATURES,
    .dynamicRendering = VK_TRUE,
};
This structure needs to be plugged into the VkDeviceCreateInfo::pNext member when creating a Vulkan device:
const VkDeviceCreateInfo dci = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = &dynamicRenderingFeatures,
    ...
};
Having grasped the concept of enabling dynamic rendering, we will now move forward and explore its implementation using the Vulkan API.
Instead of creating render passes and framebuffers, we must call the vkCmdBeginRendering command and provide the attachments and their load and store operations using the VkRenderingInfo structure. Each attachment (colors, depth, and stencil) must be specified with instances of the VkRenderingAttachmentInfo structure. Figure 2.13 presents a diagram of the structures participating in a call to vkCmdBeginRendering:
Figure 2.13 – Dynamic rendering structure diagram
Any one of the attachments, pColorAttachments, pDepthAttachment, and pStencilAttachment, can be null. Shader output written to location x is written to the color attachment at pColorAttachment[x].
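As a sketch of how these structures fit together (using hypothetical handles, not code from the repository), a rendering pass with one color and one depth attachment could be begun as follows:

```cpp
#include <vulkan/vulkan.h>

// Hypothetical, valid handles created elsewhere in the application.
VkCommandBuffer commandBuffer;
VkImageView colorView;
VkImageView depthView;
VkExtent2D extent{1280, 720};

void beginDynamicRendering() {
  const VkRenderingAttachmentInfo colorAttachment{
      .sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO,
      .imageView = colorView,
      .imageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
      .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR,
      .storeOp = VK_ATTACHMENT_STORE_OP_STORE,
      .clearValue = {.color = {{0.f, 0.f, 0.f, 1.f}}},
  };
  const VkRenderingAttachmentInfo depthAttachment{
      .sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO,
      .imageView = depthView,
      .imageLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL,
      .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR,
      .storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
      .clearValue = {.depthStencil = {1.f, 0}},
  };
  const VkRenderingInfo renderingInfo{
      .sType = VK_STRUCTURE_TYPE_RENDERING_INFO,
      .renderArea = {{0, 0}, extent},
      .layerCount = 1,
      .colorAttachmentCount = 1,
      .pColorAttachments = &colorAttachment,
      .pDepthAttachment = &depthAttachment,
      .pStencilAttachment = nullptr,  // any attachment may be null
  };
  vkCmdBeginRendering(commandBuffer, &renderingInfo);
  // ... record draw calls ...
  vkCmdEndRendering(commandBuffer);
}
```

The application remains responsible for transitioning the attachment images into the layouts named here before recording the pass.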
In this recipe, we will demonstrate how to transfer resources between queue families by uploading textures to a device from the CPU using a transfer queue and generating mip-level data in a graphics queue. Generating mip levels needs a graphics queue because it utilizes vkCmdBlitImage, which is supported only by graphics queues.
An example is provided in the repository in chapter2/mainMultiDrawIndirect.cpp, which uses the EngineCore::AsyncDataUploader class to perform texture upload and mipmap generation on different queues.
In the following diagram, we illustrate the procedure of uploading texture through a transfer queue, followed by the utilization of a graphics queue for mip generation:
Figure 2.14 – Recoding and submitting commands from different threads and transferring a resource between queues from different families
The process can be summarized as follows:
1. Release ownership of the texture on the transfer queue with a release barrier, described with the VkDependencyInfo and VkImageMemoryBarrier2 structures, specifying the source queue family as the family of the transfer queue and the destination queue family as the family of the graphics queue.
2. Acquire ownership of the texture on the graphics queue with an acquire barrier, also described with the VkDependencyInfo and VkImageMemoryBarrier2 structures.
3. Add a semaphore to the SubmitInfo structure when submitting the command buffer for processing. The semaphore will be signaled when the first command buffer has completed, allowing the mip-level-generation command buffer to start.
Two auxiliary methods will help us create acquire and release barriers for a texture. They exist in the VulkanCore::Texture class. The first one creates an acquire barrier:
void Texture::addAcquireBarrier(VkCommandBuffer cmdBuffer,
                                uint32_t srcQueueFamilyIndex,
                                uint32_t dstQueueFamilyIndex) {
  VkImageMemoryBarrier2 acquireBarrier = {
      .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2,
      .dstStageMask = VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT,
      .dstAccessMask = VK_ACCESS_2_MEMORY_READ_BIT,
      .srcQueueFamilyIndex = srcQueueFamilyIndex,
      .dstQueueFamilyIndex = dstQueueFamilyIndex,
      .image = image_,
      .subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, mipLevels_, 0, 1},
  };
  VkDependencyInfo dependency_info{
      .sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO,
      .imageMemoryBarrierCount = 1,
      .pImageMemoryBarriers = &acquireBarrier,
  };
  vkCmdPipelineBarrier2(cmdBuffer, &dependency_info);
}
Besides the command buffer, this function requires the indices of the source and destination family queues. It also assumes a few things, such as the subresource range spanning the entire image.
void Texture::addReleaseBarrier(VkCommandBuffer cmdBuffer,
                                uint32_t srcQueueFamilyIndex,
                                uint32_t dstQueueFamilyIndex) {
  VkImageMemoryBarrier2 releaseBarrier = {
      .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER_2,
      .srcStageMask = VK_PIPELINE_STAGE_2_TRANSFER_BIT,
      .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
      .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
      .srcQueueFamilyIndex = srcQueueFamilyIndex,
      .dstQueueFamilyIndex = dstQueueFamilyIndex,
      .image = image_,
      .subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, mipLevels_, 0, 1},
  };
  VkDependencyInfo dependency_info{
      .sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO,
      .imageMemoryBarrierCount = 1,
      .pImageMemoryBarriers = &releaseBarrier,
  };
  vkCmdPipelineBarrier2(cmdBuffer, &dependency_info);
}
This method makes the same assumptions as the previous one. The main differences are the source and destination stages and access masks.
We need two instances of VulkanCore::CommandQueueManager, one for the transfer queue and another for the graphics queue:
auto transferQueueMgr = context.createTransferCommandQueue(
    1, 1, "transfer queue");
auto graphicsQueueMgr = context.createGraphicsCommandQueue(
    1, 1, "graphics queue");
With the VulkanCore::Context and VulkanCore::Texture instances in hand, we can upload the texture by retrieving a command buffer from the transfer family. We also create a staging buffer for transferring the texture data to device-local memory:
VulkanCore::Context context; // Valid Context
std::shared_ptr<VulkanCore::Texture> texture; // Valid Texture
void* textureData; // Valid texture data
// Upload texture
auto textureUploadStagingBuffer = context.createStagingBuffer(
    texture->vkDeviceSize(), VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
    "texture upload staging buffer");
const auto commandBuffer = transferQueueMgr.getCmdBufferToBegin();
texture->uploadOnly(commandBuffer, textureUploadStagingBuffer.get(),
                    textureData);
texture->addReleaseBarrier(commandBuffer,
                           transferQueueMgr.queueFamilyIndex(),
                           graphicsQueueMgr.queueFamilyIndex());
transferQueueMgr.endCmdBuffer(commandBuffer);
transferQueueMgr.disposeWhenSubmitCompletes(
    std::move(textureUploadStagingBuffer));
Next, we create a semaphore that will be signaled when the transfer submission completes and attach it to the submission:
VkSemaphore graphicsSemaphore;
const VkSemaphoreCreateInfo semaphoreInfo{
    .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
};
VK_CHECK(vkCreateSemaphore(context.device(), &semaphoreInfo, nullptr,
                           &graphicsSemaphore));
VkPipelineStageFlags flags = VK_PIPELINE_STAGE_TRANSFER_BIT;
auto submitInfo = context.swapchain()->createSubmitInfo(
    &commandBuffer, &flags, false, false);
submitInfo.signalSemaphoreCount = 1;
submitInfo.pSignalSemaphores = &graphicsSemaphore;
transferQueueMgr.submit(&submitInfo);
Finally, on the graphics queue, we acquire ownership of the texture, generate its mip levels, and make the submission wait on the semaphore signaled by the transfer queue:
// Generate mip levels
auto commandBuffer = graphicsQueueMgr.getCmdBufferToBegin();
texture->addAcquireBarrier(commandBuffer,
                           transferQueueMgr.queueFamilyIndex(),
                           graphicsQueueMgr.queueFamilyIndex());
texture->generateMips(commandBuffer);
graphicsQueueMgr.endCmdBuffer(commandBuffer);
VkPipelineStageFlags flags =
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
auto submitInfo = context.swapchain()->createSubmitInfo(
    &commandBuffer, &flags, false, false);
submitInfo.pWaitSemaphores = &graphicsSemaphore;
submitInfo.waitSemaphoreCount = 1;
graphicsQueueMgr.submit(&submitInfo);
In this chapter, we have navigated the complex landscape of advanced Vulkan programming, building upon the foundational concepts introduced earlier. Our journey encompassed a diverse range of topics, each contributing crucial insights to the realm of high-performance graphics applications. From mastering Vulkan’s intricate memory model and efficient allocation techniques to harnessing the power of the VMA library, we’ve equipped ourselves with the tools to optimize memory management. We explored the creation and manipulation of buffers and images, uncovering strategies for seamless data uploads, staging buffers, and ring-buffer implementations that circumvent data races. The utilization of pipeline barriers to synchronize data access was demystified, while techniques for rendering pipelines, shader customization via specialization constants, and cutting-edge rendering methodologies such as PVP and MDI were embraced. Additionally, we ventured into dynamic rendering approaches without relying on render passes and addressed the intricacies of resource handling across multiple threads and queues. With these profound understandings, you are primed to create graphics applications that harmonize technical prowess with artistic vision using the Vulkan API.