Vulkan can be a very daunting graphics API to get started with. It is very verbose and explicit, requiring the programmer to set up many objects before they can even begin rendering to the screen. This setup can be broken down into simpler steps, making it easier to understand and get started.
A bare-bones Vulkan setup can be simplified down into the following steps:
- Create the window
- Create a Vulkan instance
- Enumerate the available GPUs
- Create a Vulkan device
- Create a graphics queue
- Initialize memory heaps/buffers
- Create the swapchain
- Create the render pass
- Create the framebuffers
Once that is set up, we can implement the actual rendering logic for Hello Triangle as follows:
- Create/upload the triangle vertex buffer
- Create the vertex/fragment shaders
- Create the graphics pipeline
- Implement the main loop and render function
The complete source code for this tutorial can be found in this GitHub repository.
We will use the SDL library to create the window. SDL is a very popular/well supported library, and makes it very easy to create windows.
We start by initializing SDL and loading the Vulkan library, then we can create the SDL window.
SDL_Window* Window {nullptr};

...

Assert(SDL_Init(SDL_INIT_EVERYTHING) == 0, "Could not initialize SDL");
Assert(SDL_Vulkan_LoadLibrary(nullptr) == 0, "Could not load the vulkan library");

Window = SDL_CreateWindow(APP_NAME, SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED, WIDTH, HEIGHT, SDL_WINDOW_VULKAN | SDL_WINDOW_SHOWN);
Assert(Window != nullptr, "Could not create SDL window");
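The Assert helper used throughout these snippets is not an SDL or Vulkan function; it is part of the tutorial's own code. A minimal sketch of what such a macro could look like (the repository's actual version may differ) is:

#include <cstdio>
#include <cstdlib>

// Minimal assertion helper: print a message and abort if the condition fails.
// Shown for illustration only; the actual project may log and clean up differently.
#define Assert(condition, message)                        \
    do                                                    \
    {                                                     \
        if (!(condition))                                 \
        {                                                 \
            printf("Assertion failed: %s\n", (message));  \
            abort();                                      \
        }                                                 \
    } while (0)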
After the window is created, we need to poll and handle SDL's events. The main one is SDL_QUIT, which tells the application that the user has pressed the close button. Here we also exit if the user presses the escape key.
bool Running {true};

...

void Update(void)
{
    SDL_Event event = {0};
    while (SDL_PollEvent(&event) != 0)
    {
        switch (event.type)
        {
            case SDL_QUIT:
            {
                Running = false; // The main loop will exit once this becomes false
                break;
            }
            case SDL_KEYDOWN:
            {
                switch (event.key.keysym.scancode)
                {
                    case SDL_SCANCODE_ESCAPE:
                        Running = false; // The main loop will exit once this becomes false
                        break;
                    default:
                        break;
                }
                break;
            }
            default:
                break;
        }
    }
}
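For context, a minimal sketch of how the main loop might tie things together, assuming a Render() function like the one assembled in the later sections, could look like this (the repository's actual main loop may differ):

// Hypothetical main loop: poll events and render until the user quits.
int main(int argc, char** argv)
{
    // ... all of the SDL/Vulkan setup described in this tutorial ...

    while (Running)
    {
        Update(); // Handle SDL events (may set Running to false)
        Render(); // Record, submit, and present one frame (sections 10-13)
    }

    // ... teardown ...
    return 0;
}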
Before creating the Vulkan instance, the application can enumerate all the available layers/extensions available from the API/driver.
These layers/extensions can give optional functionality/features if the application requests them during instance creation.
We can enumerate the layers and extensions using the vkEnumerateInstanceLayerProperties and vkEnumerateInstanceExtensionProperties functions.
Normally an application will validate if its required layers/extensions are available, but here we simply print them out.
uint32_t ExtCount = 0;
uint32_t LayerCount = 0;

Assert(vkEnumerateInstanceLayerProperties(&LayerCount, nullptr) == VK_SUCCESS, "Could not get number of instance layers");

std::vector<VkLayerProperties> AvailableLayers(LayerCount);
Assert(vkEnumerateInstanceLayerProperties(&LayerCount, AvailableLayers.data()) == VK_SUCCESS, "Could not get instance layers");

for (uint32_t i = 0; i <= AvailableLayers.size(); i++)
{
    const char* pLayerName = (i == 0) ? nullptr : AvailableLayers[i - 1].layerName;

    Assert(vkEnumerateInstanceExtensionProperties(pLayerName, &ExtCount, nullptr) == VK_SUCCESS, "Could not get extension count for instance layer");

    std::vector<VkExtensionProperties> AvailableExtensions(ExtCount);
    Assert(vkEnumerateInstanceExtensionProperties(pLayerName, &ExtCount, AvailableExtensions.data()) == VK_SUCCESS, "Could not get extensions for instance layer");

    printf("Instance layer: %s\n", (pLayerName == nullptr) ? "Global" : pLayerName);
    for (uint32_t j = 0; j < AvailableExtensions.size(); j++)
    {
        printf("\t%s\n", AvailableExtensions[j].extensionName);
    }
}
SDL requires certain extensions to be enabled when the Vulkan instance is created. We can get these using SDL_Vulkan_GetInstanceExtensions.
Assert(SDL_Vulkan_GetInstanceExtensions(Window, &ExtCount, nullptr) == SDL_TRUE, "Could not get number of required SDL extensions");

std::vector<const char*> RequiredLayers;
std::vector<const char*> RequiredExtensions(ExtCount);
Assert(SDL_Vulkan_GetInstanceExtensions(Window, &ExtCount, RequiredExtensions.data()) == SDL_TRUE, "Could not get required SDL extensions");
If you want to be thorough, you can validate these required extensions are supported by following the code in section A.
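For completeness, a hypothetical version of such a check might look like the following; this is only an illustration (not the code from section A) and assumes <cstring> is included for strcmp:

// Check that every extension SDL requires is reported by the instance (illustrative only).
uint32_t GlobalExtCount = 0;
Assert(vkEnumerateInstanceExtensionProperties(nullptr, &GlobalExtCount, nullptr) == VK_SUCCESS, "Could not get extension count");

std::vector<VkExtensionProperties> GlobalExtensions(GlobalExtCount);
Assert(vkEnumerateInstanceExtensionProperties(nullptr, &GlobalExtCount, GlobalExtensions.data()) == VK_SUCCESS, "Could not get extensions");

for (const char* pRequired : RequiredExtensions)
{
    bool Found = false;
    for (const VkExtensionProperties& rAvailable : GlobalExtensions)
    {
        if (strcmp(pRequired, rAvailable.extensionName) == 0)
        {
            Found = true;
            break;
        }
    }
    Assert(Found, "Required instance extension is not supported");
}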
The validation layer can help catch bad parameters, memory leaks, invalid API calls, and many other errors. It is very useful for catching bugs. To enable it, simply add it to the required layer/extension list.
#ifdef DEBUG // Only add the validation layer/extension if this is a debug build
    RequiredLayers.push_back("VK_LAYER_KHRONOS_validation");
    RequiredExtensions.push_back(VK_EXT_DEBUG_REPORT_EXTENSION_NAME);
#endif
Note that enabling this validation extension is not enough; we need to do some additional setup after creating the instance. This will be covered in section 2.E.
To create the actual instance, we fill out the VkApplicationInfo
and VkInstanceCreateInfo
structures,
and call vkCreateInstance
. The application structure specifies the app and engine names/versions, and the required Vulkan API version.
The instance structure will take a pointer to the application structure and the lists of the required layers/extensions.
VkApplicationInfo AppInfo =
{
    .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
    .pNext = nullptr,
    .pApplicationName = APP_NAME,
    .applicationVersion = 1,
    .pEngineName = APP_NAME,
    .engineVersion = 1,
    .apiVersion = VK_API_VERSION_1_0
};

VkInstanceCreateInfo InstanceInfo =
{
    .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .pApplicationInfo = &AppInfo,
    .enabledLayerCount = static_cast<uint32_t>(RequiredLayers.size()),
    .ppEnabledLayerNames = RequiredLayers.data(),
    .enabledExtensionCount = static_cast<uint32_t>(RequiredExtensions.size()),
    .ppEnabledExtensionNames = RequiredExtensions.data()
};
Assert(vkCreateInstance(&InstanceInfo, nullptr, &Instance) == VK_SUCCESS, "Failed to create vulkan instance");
If you followed 2.C, once the instance is created, we need to set up the debug report callback which Vulkan will use to send the debug messages. We first need to use vkGetInstanceProcAddr to get the function pointers to vkCreateDebugReportCallbackEXT and vkDestroyDebugReportCallbackEXT. After getting the function addresses, we can create the debug report callback object VkDebugReportCallbackEXT.
#ifdef DEBUG
VkDebugReportCallbackEXT hVkDebugReport {nullptr};
PFN_vkCreateDebugReportCallbackEXT vkCreateDebugReportCb {nullptr};
PFN_vkDestroyDebugReportCallbackEXT vkDestroyDebugReportCb {nullptr};
#endif

...

#ifdef DEBUG
static VkBool32 VulkanDebugReportCb(VkDebugReportFlagsEXT flags, VkDebugReportObjectTypeEXT objectType, uint64_t object, size_t location, int32_t messageCode, const char* pLayerPrefix, const char* pMessage, void* pUserData)
{
    printf("%s: %s\n", pLayerPrefix, pMessage);
    return VK_TRUE;
}
#endif

...

#ifdef DEBUG
vkCreateDebugReportCb = reinterpret_cast<PFN_vkCreateDebugReportCallbackEXT>(vkGetInstanceProcAddr(Instance, "vkCreateDebugReportCallbackEXT"));
vkDestroyDebugReportCb = reinterpret_cast<PFN_vkDestroyDebugReportCallbackEXT>(vkGetInstanceProcAddr(Instance, "vkDestroyDebugReportCallbackEXT"));

Assert(vkCreateDebugReportCb != nullptr, "Could not get debug report callback");
Assert(vkDestroyDebugReportCb != nullptr, "Could not get debug report callback");

VkDebugReportCallbackCreateInfoEXT CallbackInfo =
{
    .sType = VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT,
    .pNext = nullptr,
    .flags = VK_DEBUG_REPORT_INFORMATION_BIT_EXT | VK_DEBUG_REPORT_WARNING_BIT_EXT | VK_DEBUG_REPORT_PERFORMANCE_WARNING_BIT_EXT |
             VK_DEBUG_REPORT_ERROR_BIT_EXT | VK_DEBUG_REPORT_DEBUG_BIT_EXT,
    .pfnCallback = VulkanDebugReportCb,
    .pUserData = nullptr
};

Assert(vkCreateDebugReportCb(Instance, &CallbackInfo, nullptr, &hVkDebugReport) == VK_SUCCESS, "Failed to register debug callback\n");
#endif
Vulkan gives a list of all the GPUs in the current system through the vkEnumeratePhysicalDevices
function.
uint32_t DeviceCount = 0;
Assert(vkEnumeratePhysicalDevices(Instance, &DeviceCount, nullptr) == VK_SUCCESS, "Could not get number of physical devices");

std::vector<VkPhysicalDevice> DeviceHandles(DeviceCount);
Assert(vkEnumeratePhysicalDevices(Instance, &DeviceCount, DeviceHandles.data()) == VK_SUCCESS, "Could not get physical devices");
Once we have all the device handles, we can get the properties of each device. This includes:
- The GPU name
- The GPU type (integrated, discrete, etc.)
- The types and counts of queues available on the GPU (graphics, compute, copy, etc.)
- The types and amounts of memory available on the GPU
We will be using a simple algorithm which sorts the GPUs by their type (integrated vs discrete), the number of graphics queues available, and the amount of local memory available. After sorting the GPUs, the one with the highest preference will be selected.
const std::map<VkPhysicalDeviceType, uint32_t> PreferenceOrder =
{
    {VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU, 2},
    {VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU, 1}
};

struct PhysicalDeviceInfo
{
    VkPhysicalDevice Handle;
    uint32_t OriginalIndex;
    uint32_t PreferenceIndex;
    uint32_t GraphicsQueueGroup;
    uint32_t NumGraphicsQueues;
    uint64_t LocalHeapSize;
};

uint32_t QueueCount = 0;

std::vector<PhysicalDeviceInfo> PhysicalDevices;

for (uint32_t i = 0; i < DeviceHandles.size(); i++)
{
    VkPhysicalDeviceFeatures DeviceFeatures {};
    VkPhysicalDeviceProperties DeviceProperties {};
    VkPhysicalDeviceMemoryProperties MemoryProperties {};

    PhysicalDeviceInfo DeviceInfo { DeviceHandles[i], i, 0, UINT32_MAX, 0, 0 };

    vkGetPhysicalDeviceFeatures(DeviceHandles[i], &DeviceFeatures);
    vkGetPhysicalDeviceProperties(DeviceHandles[i], &DeviceProperties);
    vkGetPhysicalDeviceMemoryProperties(DeviceHandles[i], &MemoryProperties);

    vkGetPhysicalDeviceQueueFamilyProperties(DeviceHandles[i], &QueueCount, nullptr);

    std::vector<VkQueueFamilyProperties> QueueGroups(QueueCount);
    vkGetPhysicalDeviceQueueFamilyProperties(DeviceHandles[i], &QueueCount, QueueGroups.data());

    std::map<VkPhysicalDeviceType, uint32_t>::const_iterator it = PreferenceOrder.find(DeviceProperties.deviceType);
    if (it != PreferenceOrder.end())
    {
        DeviceInfo.PreferenceIndex = it->second;
    }

    for (uint32_t j = 0; j < QueueGroups.size(); j++)
    {
        if (QueueGroups[j].queueFlags & VK_QUEUE_GRAPHICS_BIT)
        {
            DeviceInfo.GraphicsQueueGroup = std::min(DeviceInfo.GraphicsQueueGroup, j); // Pick the first (minimum) available group, we only use 1 gfx queue, so the group does not matter
            DeviceInfo.NumGraphicsQueues += QueueGroups[j].queueCount;
        }
    }

    for (uint32_t j = 0; j < MemoryProperties.memoryHeapCount; j++)
    {
        if (MemoryProperties.memoryHeaps[j].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT)
        {
            DeviceInfo.LocalHeapSize += MemoryProperties.memoryHeaps[j].size;
        }
    }

    PhysicalDevices.push_back(DeviceInfo);
}

Assert(PhysicalDevices.size() > 0, "Could not find a supported GPU");

#define COMPARE(a, b) { if ((a) > (b)) { return true; } else if ((a) < (b)) { return false; } }
std::sort(PhysicalDevices.begin(), PhysicalDevices.end(),
    [](const PhysicalDeviceInfo& lhs, const PhysicalDeviceInfo& rhs) -> bool
    {
        COMPARE(lhs.PreferenceIndex, rhs.PreferenceIndex);
        COMPARE(lhs.NumGraphicsQueues, rhs.NumGraphicsQueues);
        COMPARE(lhs.LocalHeapSize, rhs.LocalHeapSize);
        return false;
    }
);
#undef COMPARE

PhysicalDevice = PhysicalDevices[0].Handle;
GraphicsQueueGroup = PhysicalDevices[0].GraphicsQueueGroup;
Note we also store the graphics queue group index because this will be used to create the graphics queue later on.
Once the physical device is determined, we have to create a logical device.
Similar to when we created the instance, we can get the device's available layers/extensions using the functions vkEnumerateDeviceLayerProperties and vkEnumerateDeviceExtensionProperties.
uint32_t ExtCount = 0;
uint32_t LayerCount = 0;

Assert(vkEnumerateDeviceLayerProperties(PhysicalDevice, &LayerCount, nullptr) == VK_SUCCESS, "Failed to get number of device layers");

std::vector<VkLayerProperties> AvailableLayers(LayerCount);
Assert(vkEnumerateDeviceLayerProperties(PhysicalDevice, &LayerCount, AvailableLayers.data()) == VK_SUCCESS, "Failed to get device layers");

for (uint32_t i = 0; i <= AvailableLayers.size(); i++)
{
    const char* pLayerName = (i == 0) ? nullptr : AvailableLayers[i - 1].layerName;

    Assert(vkEnumerateDeviceExtensionProperties(PhysicalDevice, pLayerName, &ExtCount, nullptr) == VK_SUCCESS, "Could not get extension count for device layer");

    std::vector<VkExtensionProperties> AvailableExtensions(ExtCount);
    Assert(vkEnumerateDeviceExtensionProperties(PhysicalDevice, pLayerName, &ExtCount, AvailableExtensions.data()) == VK_SUCCESS, "Could not get extensions for device layer");

    printf("Device layer: %s\n", (pLayerName == nullptr) ? "Global" : pLayerName);
    for (uint32_t j = 0; j < AvailableExtensions.size(); j++)
    {
        printf("\t%s\n", AvailableExtensions[j].extensionName);
    }
}
Note that we require the VK_KHR_SWAPCHAIN_EXTENSION_NAME
extension when creating the logical device, because we will be creating a swapchain on this device.
Once we have figured out which layers/extensions are available and which ones we need, we can create the device. Note that we also have to request the queues we will be using
at the device creation time in the VkDeviceQueueCreateInfo
structure. Here we only request the one graphics queue from the group we picked in section 3.
std::vector<const char*> RequiredExtensions {VK_KHR_SWAPCHAIN_EXTENSION_NAME};

const float QueuePriority = 1.0f;

VkDeviceQueueCreateInfo QueueInfo =
{
    .sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .queueFamilyIndex = GraphicsQueueGroup,
    .queueCount = 1,
    .pQueuePriorities = &QueuePriority
};

VkDeviceCreateInfo DeviceInfo =
{
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .queueCreateInfoCount = 1,
    .pQueueCreateInfos = &QueueInfo,
    .enabledLayerCount = 0,
    .ppEnabledLayerNames = nullptr,
    .enabledExtensionCount = static_cast<uint32_t>(RequiredExtensions.size()),
    .ppEnabledExtensionNames = RequiredExtensions.data(),
    .pEnabledFeatures = nullptr
};
Assert(vkCreateDevice(PhysicalDevice, &DeviceInfo, nullptr, &Device) == VK_SUCCESS, "Could not create vk device");
We already created the graphics queue in section 4, so we can simply get a handle to it using vkGetDeviceQueue.
VkQueue GraphicsQueue {nullptr};

...

vkGetDeviceQueue(Device, GraphicsQueueGroup, 0, &GraphicsQueue);
Assert(GraphicsQueue != nullptr, "Could not get gfx queue 0");
However, the queue still needs a couple more objects for us to be able to use it. Those objects are:
- A command pool
- A command buffer
- A fence
A command pool, represented by the VkCommandPool object, is used to allocate command buffers. Command buffers, represented by the VkCommandBuffer object, are used to record commands for rendering and other graphics operations. A command pool can be used to allocate many command buffers, but we will only need one for our simple app. Lastly, the fence is used for synchronizing work between the CPU and GPU. We will give Vulkan this fence object when we submit work to the GPU, and this fence will get signalled once the workload finishes. This lets us synchronize the CPU by waiting on the fence for the submission to finish.
VkCommandPool CommandPool {nullptr};
VkCommandBuffer CommandBuffer {nullptr};
VkFence Fence {nullptr};

...

VkCommandPoolCreateInfo CommandPoolInfo =
{
    .sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO,
    .pNext = nullptr,
    .flags = VK_COMMAND_POOL_CREATE_TRANSIENT_BIT | VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT,
    .queueFamilyIndex = GraphicsQueueGroup
};

Assert(vkCreateCommandPool(Device, &CommandPoolInfo, nullptr, &CommandPool) == VK_SUCCESS, "Could not create the command pool");

VkCommandBufferAllocateInfo CommandBufferInfo =
{
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
    .pNext = nullptr,
    .commandPool = CommandPool,
    .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
    .commandBufferCount = 1
};

Assert(vkAllocateCommandBuffers(Device, &CommandBufferInfo, &CommandBuffer) == VK_SUCCESS, "Could not create the command buffer");

VkFenceCreateInfo FenceInfo =
{
    .sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0
};
Assert(vkCreateFence(Device, &FenceInfo, nullptr, &Fence) == VK_SUCCESS, "Failed to create fence");
Memory management is a very important concept in Vulkan. The developer is responsible for figuring out which parts of the memory to use, how to manage the memory usage, etc. For this example we are only concerned with two types of memory:
- Device local memory (the VRAM inside the GPU)
- Host visible memory (CPU visible memory)
These categories are not mutually exclusive. Most discrete GPUs have their own local video memory (VRAM) which is optimal for GPU access, but only a portion of that is accessible from the CPU. Integrated GPUs on the other hand, can see the entire system memory as both local and CPU visible. Resizable bar can also give the CPU access to the entire GPU VRAM, but here we assume that is not being used. A very common and simple solution is to use the large local CPU invisible memory as the primary heap, and use the small local CPU visible memory as a staging area. Data will first be copied to the staging area using the CPU, and then transferred from the staging area to the primary heap using the GPU.
We first need to get the device's memory properties, using vkGetPhysicalDeviceMemoryProperties
, and then we can figure out the most optimal heaps available on the GPU.
uint32_t PrimaryHeapIndex {UINT32_MAX};
uint32_t UploadHeapIndex {UINT32_MAX};

VkPhysicalDeviceMemoryProperties MemoryProperties {};

...

vkGetPhysicalDeviceMemoryProperties(PhysicalDevice, &MemoryProperties);

for (uint32_t i = 0; i < MemoryProperties.memoryTypeCount; i++)
{
    uint64_t HeapSize = MemoryProperties.memoryHeaps[MemoryProperties.memoryTypes[i].heapIndex].size;

    if (MemoryProperties.memoryTypes[i].propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)
    {
        if ((PrimaryHeapIndex == UINT32_MAX) || (HeapSize > MemoryProperties.memoryHeaps[PrimaryHeapIndex].size))
        {
            PrimaryHeapIndex = MemoryProperties.memoryTypes[i].heapIndex;
        }
    }

    if ((MemoryProperties.memoryTypes[i].propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)
        && (MemoryProperties.memoryTypes[i].propertyFlags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT))
    {
        if ((UploadHeapIndex == UINT32_MAX) || (HeapSize > MemoryProperties.memoryHeaps[UploadHeapIndex].size))
        {
            UploadHeapIndex = MemoryProperties.memoryTypes[i].heapIndex;
            UploadBufferSize = std::min(ALIGN(HeapSize / 4, MB), static_cast<uint64_t>(16 * MB));
        }
    }
}

Assert(PrimaryHeapIndex != UINT32_MAX, "Could not find primary heap");
Assert(UploadHeapIndex != UINT32_MAX, "Could not find upload heap");
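The MB constant and ALIGN macro used above are helpers from the tutorial's own code and are not shown in this section; one possible definition, assumed here only so the snippet reads on its own, would be:

// Assumed helpers (the repository's actual definitions may differ).
static constexpr uint64_t MB = 1024ull * 1024ull;

// Round Size up to the next multiple of Alignment (Alignment must be a power of two).
#define ALIGN(Size, Alignment) (((Size) + (Alignment) - 1) & ~((Alignment) - 1))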
Once we have the primary and upload heaps figured out, we can allocate a large buffer in the upload heap, which will serve as an intermediate staging buffer for future resources before their contents are transferred to the primary heap. Note that after we create the upload buffer, we also have to map it into the CPU address space using vkMapMemory, so that the CPU can access it.
VkBuffer UploadBuffer {nullptr};
VkDeviceMemory UploadBufferMemory {nullptr};

uint64_t UploadBufferSize {0};
void* UploadBufferCpuVA {nullptr};

...

VkBufferCreateInfo UploadBufferInfo =
{
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .size = UploadBufferSize,
    .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .queueFamilyIndexCount = 0,
    .pQueueFamilyIndices = nullptr
};

Assert(vkCreateBuffer(Device, &UploadBufferInfo, nullptr, &UploadBuffer) == VK_SUCCESS, "Failed to create upload buffer");

VkMemoryRequirements UploadBufferRequirements = {};
vkGetBufferMemoryRequirements(Device, UploadBuffer, &UploadBufferRequirements);

AllocateMemory(UploadBufferRequirements, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, UploadHeapIndex, UploadBufferMemory);
Assert(vkBindBufferMemory(Device, UploadBuffer, UploadBufferMemory, 0) == VK_SUCCESS, "Failed to bind upload buffer memory");
Assert(vkMapMemory(Device, UploadBufferMemory, 0, UploadBufferSize, 0, &UploadBufferCpuVA) == VK_SUCCESS, "Failed to map upload buffer memory");
This is the helper function used above, which allocates memory directly from a heap given its index. This will be useful for other allocations as well, such as the vertex buffer, and any other future allocations we need.
void AllocateMemory(const VkMemoryRequirements& rMemoryRequirements, VkMemoryPropertyFlags Flags, uint32_t HeapIndex, VkDeviceMemory& rMemory) const
{
    for (uint32_t i = 0; i < MemoryProperties.memoryTypeCount; i++)
    {
        if ((rMemoryRequirements.memoryTypeBits & (1 << i))
            && ((MemoryProperties.memoryTypes[i].propertyFlags & Flags) == Flags)
            && (MemoryProperties.memoryTypes[i].heapIndex == HeapIndex))
        {
            VkMemoryAllocateInfo AllocationInfo =
            {
                .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
                .pNext = nullptr,
                .allocationSize = rMemoryRequirements.size,
                .memoryTypeIndex = i
            };

            Assert(vkAllocateMemory(Device, &AllocationInfo, nullptr, &rMemory) == VK_SUCCESS, "Failed to allocate vertex buffer memory");
            break;
        }
    }

    Assert(rMemory != nullptr, "Unable to allocate memory");
}
The swapchain contains the sequence of images which we will render to and present on the screen.
Before we create the swapchain, we need to create a Vulkan surface for our SDL window. This surface will let Vulkan render on SDL's window,
and will be required for creating the swapchain. SDL provides a function SDL_Vulkan_CreateSurface
to create this.
VkSurfaceKHR Surface {nullptr};

...

Assert(SDL_Vulkan_CreateSurface(Window, Instance, &Surface) == VK_TRUE, "Failed to create surface");
After we create the surface, we need to check if our required surface formats and presentation mode are supported. The surface format
is the format of the swapchain images, in our case VK_FORMAT_B8G8R8A8_UNORM
, which gives 8 bits to each red/green/blue/alpha component.
The presentation mode controls how the images of the swapchain are presented. We will be using VK_PRESENT_MODE_FIFO_KHR
, which will present the swapchain images one by one in a queue
and will only present an image once the previous one has been fully presented (i.e. vertically synced).
uint32_t PresentModeCount = 0;
uint32_t SurfaceFormatCount = 0;

Assert(vkGetPhysicalDeviceSurfacePresentModesKHR(PhysicalDevice, Surface, &PresentModeCount, nullptr) == VK_SUCCESS, "Could not get the number of supported presentation modes");
Assert(vkGetPhysicalDeviceSurfaceFormatsKHR(PhysicalDevice, Surface, &SurfaceFormatCount, nullptr) == VK_SUCCESS, "Could not get the number of supported surface formats");

std::vector<VkPresentModeKHR> PresentModes(PresentModeCount);
std::vector<VkSurfaceFormatKHR> SurfaceFormats(SurfaceFormatCount);

Assert(vkGetPhysicalDeviceSurfacePresentModesKHR(PhysicalDevice, Surface, &PresentModeCount, PresentModes.data()) == VK_SUCCESS, "Could not get the supported presentation modes");
Assert(vkGetPhysicalDeviceSurfaceFormatsKHR(PhysicalDevice, Surface, &SurfaceFormatCount, SurfaceFormats.data()) == VK_SUCCESS, "Could not get the supported surface formats");

for (std::vector<VkSurfaceFormatKHR>::const_iterator it = SurfaceFormats.begin(); it != SurfaceFormats.end(); it++)
{
    if (it->format == VK_FORMAT_B8G8R8A8_UNORM) { SurfaceFormat = *it; break; }
}

Assert(SurfaceFormat.format != VK_FORMAT_UNDEFINED, "Could not find required surface format");
Assert(std::find(PresentModes.begin(), PresentModes.end(), VK_PRESENT_MODE_FIFO_KHR) != PresentModes.end(), "Could not find required present mode");
We also need to tell the swapchain the size of the images to use and the minimum numbers of swapchain images to create. For that information we first call vkGetPhysicalDeviceSurfaceCapabilitiesKHR
to get the surface capabilities, which will tell us the current surface's dimensions in the currentExtent
field, and the minimum number of images required in the minImageCount
field.
Once we have this information, we can create the swapchain.
VkSurfaceCapabilitiesKHR SurfaceCapabilities = {0};
Assert(vkGetPhysicalDeviceSurfaceCapabilitiesKHR(PhysicalDevice, Surface, &SurfaceCapabilities) == VK_SUCCESS, "Could not get surface capabilities");

VkSwapchainCreateInfoKHR SwapchainInfo =
{
    .sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR,
    .pNext = nullptr,
    .flags = 0,
    .surface = Surface,
    .minImageCount = SurfaceCapabilities.minImageCount,
    .imageFormat = SurfaceFormat.format,
    .imageColorSpace = SurfaceFormat.colorSpace,
    .imageExtent =
    {
        .width = SurfaceCapabilities.currentExtent.width,
        .height = SurfaceCapabilities.currentExtent.height
    },
    .imageArrayLayers = 1,
    .imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    .imageSharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .queueFamilyIndexCount = 0,
    .pQueueFamilyIndices = nullptr,
    .preTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR,
    .compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR,
    .presentMode = VK_PRESENT_MODE_FIFO_KHR,
    .clipped = VK_TRUE,
    .oldSwapchain = nullptr
};
Assert(vkCreateSwapchainKHR(Device, &SwapchainInfo, nullptr, &Swapchain) == VK_SUCCESS, "Failed to create swapchain");
After we create the swapchain, we can get the swapchain's images using vkGetSwapchainImagesKHR
. This will be
useful for section 9 when we create the framebuffers.
enum
{
    MinSwapchainImages = 2,
    MaxSwapchainImages = 4
};

VkImage SwapchainImages[MaxSwapchainImages] {};

...

Assert(vkGetSwapchainImagesKHR(Device, Swapchain, &NumSwapchainImages, nullptr) == VK_SUCCESS, "Could not get number of swapchain images");
Assert((NumSwapchainImages >= MinSwapchainImages) && (NumSwapchainImages <= MaxSwapchainImages), "Invalid number of swapchain images");
Assert(vkGetSwapchainImagesKHR(Device, Swapchain, &NumSwapchainImages, SwapchainImages) == VK_SUCCESS, "Could not get swapchain images");
Lastly, we need to create semaphores to synchronize access to the swapchain's images. We will need two of them in section 13 - one for waiting for access to the swapchain's image before rendering, and one for waiting for the swapchain's image to become ready to be presented after rendering is finished. The semaphores can be created using vkCreateSemaphore.
VkSemaphoreCreateInfo SemaphoreInfo =
{
    .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0
};

Assert(vkCreateSemaphore(Device, &SemaphoreInfo, nullptr, &AcquireSemaphore) == VK_SUCCESS, "Failed to create semaphore");
Assert(vkCreateSemaphore(Device, &SemaphoreInfo, nullptr, &ReleaseSemaphore) == VK_SUCCESS, "Failed to create semaphore");
The render pass is used to describe the render targets/attachments the current rendering workload will use. This object is necessary to create the framebuffers, and is also required by Vulkan in the main rendering logic, since rendering operations can only be done inside render passes. In this example, we will only be rendering to the swapchain surfaces.
We begin by describing the attachment format, the color/depth/stencil buffer content load/store behaviours at the beginning and end of the render pass, and the image layout at the beginning/end of the render pass. This information is specified in an array of VkAttachmentDescription structures; one is needed for each attachment the render pass will use. In our case, we will only have one color attachment, which will be the swapchain surface.
VkAttachmentDescription AttachmentDescriptions[] =
{
    { // Color attachment
        .flags = 0,
        .format = SurfaceFormat.format,
        .samples = VK_SAMPLE_COUNT_1_BIT,
        .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR,
        .storeOp = VK_ATTACHMENT_STORE_OP_STORE,
        .stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
        .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
        .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,    // image layout undefined at the beginning of the render pass
        .finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR // prepare the color attachment for presentation
    }
};
After describing the attachments, we must describe the subpasses. We will only have one subpass, so we only define one VkSubpassDescription.
VkAttachmentReference ColorAttachments[] =
{
    { // Color attachment
        .attachment = 0,
        .layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL // image layout is VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL during the render pass
    }
};

VkSubpassDescription SubpassDescription =
{
    .flags = 0,
    .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .inputAttachmentCount = 0,
    .pInputAttachments = nullptr,
    .colorAttachmentCount = 1,
    .pColorAttachments = ColorAttachments,
    .pResolveAttachments = nullptr,
    .pDepthStencilAttachment = nullptr,
    .preserveAttachmentCount = 0,
    .pPreserveAttachments = nullptr
};
Once these objects are prepared, the render pass can be created.
VkRenderPassCreateInfo RenderPassInfo =
{
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .attachmentCount = sizeof(AttachmentDescriptions) / sizeof(VkAttachmentDescription),
    .pAttachments = AttachmentDescriptions,
    .subpassCount = 1,
    .pSubpasses = &SubpassDescription,
    .dependencyCount = 0,
    .pDependencies = nullptr
};
Assert(vkCreateRenderPass(Device, &RenderPassInfo, nullptr, &RenderPass) == VK_SUCCESS, "Failed to create render pass");
After the swapchain and renderpass are created, we must create a framebuffer for each of the swapchain images.
We begin by creating a VkImageView
for each swapchain image, which is necessary for each framebuffer.
VkImageView SwapchainImageViews[MaxSwapchainImages] {};

...

for (uint32_t i = 0; i < NumSwapchainImages; i++)
{
    VkImageViewCreateInfo ImageViewInfo =
    {
        .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
        .pNext = nullptr,
        .flags = 0,
        .image = SwapchainImages[i],
        .viewType = VK_IMAGE_VIEW_TYPE_2D,
        .format = SurfaceFormat.format,
        .components =
        {
            .r = VK_COMPONENT_SWIZZLE_IDENTITY,
            .g = VK_COMPONENT_SWIZZLE_IDENTITY,
            .b = VK_COMPONENT_SWIZZLE_IDENTITY,
            .a = VK_COMPONENT_SWIZZLE_IDENTITY
        },
        .subresourceRange =
        {
            .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
            .baseMipLevel = 0,
            .levelCount = 1,
            .baseArrayLayer = 0,
            .layerCount = 1
        }
    };

    Assert(vkCreateImageView(Device, &ImageViewInfo, nullptr, &SwapchainImageViews[i]) == VK_SUCCESS, "Failed to create image view");
}
After we have created all the necessary image views, we can create the framebuffers.
VkFramebuffer Framebuffers[MaxSwapchainImages] {};

...

for (uint32_t i = 0; i < NumSwapchainImages; i++)
{
    VkImageView FramebufferAttachments[] =
    {
        SwapchainImageViews[i]
    };

    VkFramebufferCreateInfo FramebufferInfo =
    {
        .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
        .pNext = nullptr,
        .flags = 0,
        .renderPass = RenderPass,
        .attachmentCount = sizeof(FramebufferAttachments) / sizeof(VkImageView),
        .pAttachments = FramebufferAttachments,
        .width = WIDTH,
        .height = HEIGHT,
        .layers = 1
    };

    Assert(vkCreateFramebuffer(Device, &FramebufferInfo, nullptr, &Framebuffers[i]) == VK_SUCCESS, "Failed to create framebuffer");
}
At this point, the required Vulkan objects have been set up to provide an equivalent "context" like what OpenGL would provide, and we can start implementing the Hello Triangle logic. The first step is to create the vertex buffer for the triangle we will be rendering.
We begin by defining the actual vertices. Each vertex has 2 attributes, the position and color, both of which are 3D floating point values.
struct Vertex
{
    float position[3];
    float color[3];
};

static constexpr const Vertex TriangleVertices[] =
{
    { // vertex 0
        { -0.8f, +0.8f, 0.0f }, // position
        {  0.0f,  0.0f, 1.0f }  // color
    },
    { // vertex 1
        { +0.8f, +0.8f, 0.0f }, // position
        {  0.0f,  1.0f, 0.0f }  // color
    },
    { // vertex 2
        {  0.0f, -0.8f, 0.0f }, // position
        {  1.0f,  0.0f, 0.0f }  // color
    }
};
The next step is to create the buffer, allocate the memory, and bind the memory to that new buffer. Vulkan requires the buffer creation and memory allocation to be separate because it gives the programmer flexibility for memory management. For example, if we want, we can create a single memory allocation and sub-allocate it among different buffers. Note that we are using the AllocateMemory helper function we created in section 6 to allocate the memory in our primary heap.
VkBufferCreateInfo BufferInfo =
{
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .size = sizeof(TriangleVertices),
    .usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
    .queueFamilyIndexCount = 0,
    .pQueueFamilyIndices = nullptr
};

Assert(vkCreateBuffer(Device, &BufferInfo, nullptr, &VertexBuffer) == VK_SUCCESS, "Failed to create vertex buffer");

VkMemoryRequirements BufferRequirements = {};
vkGetBufferMemoryRequirements(Device, VertexBuffer, &BufferRequirements);

AllocateMemory(BufferRequirements, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, PrimaryHeapIndex, VertexBufferMemory);
Assert(vkBindBufferMemory(Device, VertexBuffer, VertexBufferMemory, 0) == VK_SUCCESS, "Failed to bind vertex buffer memory");
After the vertex buffer has been prepared, we can copy the vertex data into the upload buffer we created in section 6. Here we also tell Vulkan to flush the memory to make sure all the writes have gone through before moving onto the next steps where we will transfer the data to the primary allocation.
VkMappedMemoryRange FlushRange =
{
    .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
    .pNext = nullptr,
    .memory = UploadBufferMemory,
    .offset = 0,
    .size = VK_WHOLE_SIZE
};

memcpy(reinterpret_cast<uint8_t*>(UploadBufferCpuVA), TriangleVertices, sizeof(TriangleVertices));
Assert(vkFlushMappedMemoryRanges(Device, 1, &FlushRange) == VK_SUCCESS, "Failed to flush vertex buffer memory");
When the vertex buffer data is prepared in the upload heap, we can begin preparing the command buffer we will use to transfer the contents from the upload heap
to the primary heap. First we initialize the command buffer with the vkBeginCommandBuffer
call. Then we can insert the copy command using vkCmdCopyBuffer, which will transfer the source buffer's contents to the destination buffer. After we are done adding commands to the command buffer, we can finalize the command buffer using vkEndCommandBuffer.
VkCommandBufferBeginInfo CommandBufferBeginInfo =
{
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    .pNext = nullptr,
    .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
    .pInheritanceInfo = nullptr
};

VkBufferCopy CopyCmd =
{
    .srcOffset = 0,
    .dstOffset = 0,
    .size = sizeof(TriangleVertices)
};

Assert(vkBeginCommandBuffer(CommandBuffer, &CommandBufferBeginInfo) == VK_SUCCESS, "Failed to initialize command buffer");
vkCmdCopyBuffer(CommandBuffer, UploadBuffer, VertexBuffer, 1, &CopyCmd);
Assert(vkEndCommandBuffer(CommandBuffer) == VK_SUCCESS, "Failed to finalize command buffer");
After the command buffer has been prepared, we can submit it to the graphics queue we created in sections 4 and 5. We fill out the structure VkSubmitInfo
with the command buffer we prepared, and call
vkQueueSubmit
, along with the fence we created in section 5. After this submission, we wait for the submission to finish by waiting on the fence using vkWaitForFences.
Note that we have to reset the fence after this, using vkResetFences
, so that it can be used again for future submissions.
VkSubmitInfo SubmitInfo =
{
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .pNext = nullptr,
    .waitSemaphoreCount = 0,
    .pWaitSemaphores = nullptr,
    .pWaitDstStageMask = nullptr,
    .commandBufferCount = 1,
    .pCommandBuffers = &CommandBuffer,
    .signalSemaphoreCount = 0,
    .pSignalSemaphores = nullptr
};

Assert(vkQueueSubmit(GraphicsQueue, 1, &SubmitInfo, Fence) == VK_SUCCESS, "Failed to submit command buffer");
Assert(vkWaitForFences(Device, 1, &Fence, VK_TRUE, 1 * NANOSECONDS_PER_SECOND) == VK_SUCCESS, "Fence timeout");
Assert(vkResetFences(Device, 1, &Fence) == VK_SUCCESS, "Could not reset fence");
The next step is to create the vertex and fragment shaders for rendering our triangle.
The vertex shader will be run once per vertex of our triangle. We will take in the vertex position from the vertex buffer at index 0, and the color at index 1. Remember these indices; they will be important for the next section, where we will create the graphics pipeline. The vertex's position will be written to gl_Position, which tells Vulkan the coordinate of the vertex, and the color will be sent to the fragment shader at index 0.
#version 450

layout(location = 0) in vec3 VertexPosition;
layout(location = 1) in vec3 VertexColor;

layout(location = 0) out vec3 ColorOut;

void main()
{
    gl_Position = vec4(VertexPosition, 1.0);

    ColorOut = VertexColor;
}
The fragment shader will also be simple - it will simply output the color it receives from the vertex shader. Vulkan will automatically interpolate the colors between the vertices.
#version 450

layout(location = 0) in vec3 ColorIn;

layout(location = 0) out vec4 FragColor;

void main()
{
    FragColor = vec4(ColorIn, 1.0);
}
After the shaders are implemented, we have to compile them into an intermediate representation for Vulkan. This can be done using the glslangvalidator
compiler. There are two options - we can either output a binary file which we can read/load at runtime, or we can produce header files which we can simply include
into our source code and bake the intermediate code into our application. We will be going with the second approach.
glslangvalidator -V --vn VertexShader -S vert VertexShader.vert.glsl -o VertexShader.vert.h
glslangvalidator -V --vn FragmentShader -S frag FragmentShader.frag.glsl -o FragmentShader.frag.h
Flag descriptions:
- -V: tells the compiler to generate a SPIR-V binary
- --vn: tells the compiler what we want to name the intermediate representation array's variable in the header file
- -S: tells the compiler the type of shader this is (vert for the vertex shader and frag for the fragment shader)
- -o: tells the compiler the generated header file's name
Once the intermediate representation headers are generated, we can include those headers and create VkShaderModule
objects for each shader. This will be required for the
graphics pipeline in the next section.
#include "VertexShader.vert.h"
#include "FragmentShader.frag.h"

...

VkShaderModuleCreateInfo VertexShaderInfo =
{
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .codeSize = sizeof(VertexShader),
    .pCode = VertexShader
};

VkShaderModuleCreateInfo FragmentShaderInfo =
{
    .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .codeSize = sizeof(FragmentShader),
    .pCode = FragmentShader
};

VkShaderModule VertexShaderModule = nullptr;
VkShaderModule FragmentShaderModule = nullptr;

Assert(vkCreateShaderModule(Device, &VertexShaderInfo, nullptr, &VertexShaderModule) == VK_SUCCESS, "Could not create vertex shader module");
Assert(vkCreateShaderModule(Device, &FragmentShaderInfo, nullptr, &FragmentShaderModule) == VK_SUCCESS, "Could not create fragment shader module");
Once we have the shaders ready, we have to create the graphics pipeline. The graphics pipeline is a very important object which controls many rendering options/parameters and contains the shaders that will be used for rendering. Before creating the graphics pipeline, we have to create a pipeline layout and also configure several required structures.
The VkPipelineLayout tells Vulkan the descriptor sets and push constants which will be available to this graphics pipeline. In this example, we don't have any of those, so we simply create an empty layout.
VkPipelineLayoutCreateInfo PipelineLayoutInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .setLayoutCount = 0,
    .pSetLayouts = nullptr,
    .pushConstantRangeCount = 0,
    .pPushConstantRanges = nullptr
};
Assert(vkCreatePipelineLayout(Device, &PipelineLayoutInfo, nullptr, &PipelineLayout) == VK_SUCCESS, "Failed to create pipeline layout");
The first structure we have to configure is VkPipelineShaderStageCreateInfo
. This structure will tell Vulkan the shaders this pipeline will use for rendering. We have to fill out two of these,
one for the vertex shader and one for the fragment shader. The graphics pipeline creation structure will take in an array of this structure, so we define it as an array of size two.
The shader modules we created in the previous section will be used in the module
field; and the pName
field tells Vulkan the entry point/function
of the shader, which is main
in both of our shader implementations.
VkPipelineShaderStageCreateInfo PipelineShaderStageInfo[2] =
{
    {
        .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        .pNext = nullptr,
        .flags = 0,
        .stage = VK_SHADER_STAGE_VERTEX_BIT,
        .module = VertexShaderModule,
        .pName = "main",
        .pSpecializationInfo = nullptr
    },
    {
        .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        .pNext = nullptr,
        .flags = 0,
        .stage = VK_SHADER_STAGE_FRAGMENT_BIT,
        .module = FragmentShaderModule,
        .pName = "main",
        .pSpecializationInfo = nullptr
    }
};
Next is the VkPipelineVertexInputStateCreateInfo
structure, which describes the vertex buffers and data format to Vulkan. This structure takes in an array of two other structures -
VkVertexInputBindingDescription
and VkVertexInputAttributeDescription.
The VkVertexInputBindingDescription
structure describes each vertex buffer which will be bound to this pipeline for rendering. We have all our data
in one buffer, so we only create one. The stride
tells Vulkan how far apart each vertex's data is; in our case it's the size of a vertex. The inputRate
specifies whether
the vertex data is per vertex or per instance. The other structure, VkVertexInputAttributeDescription
, will be used to describe each vertex attribute this pipeline will work with.
The location
field tells Vulkan the index of the attribute, recall the vertex shader indices from the previous section. The binding
field specifies
which vertex buffer, from the VkVertexInputBindingDescription
array, it will get this attribute from. The format
just tells Vulkan the format of this attribute,
i.e. the number of components, the number of bits per component, etc. Lastly, the offset tells Vulkan where this attribute begins within each vertex's data.
VkVertexInputBindingDescription Bindings[] =
{
    {
        .binding = 0,
        .stride = sizeof(Vertex),
        .inputRate = VK_VERTEX_INPUT_RATE_VERTEX
    }
};

VkVertexInputAttributeDescription Attributes[] =
{
    {
        .location = 0,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = offsetof(Vertex, position)
    },
    {
        .location = 1,
        .binding = 0,
        .format = VK_FORMAT_R32G32B32_SFLOAT,
        .offset = offsetof(Vertex, color)
    }
};

VkPipelineVertexInputStateCreateInfo PipelineVertexInputStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .vertexBindingDescriptionCount = sizeof(Bindings) / sizeof(VkVertexInputBindingDescription),
    .pVertexBindingDescriptions = Bindings,
    .vertexAttributeDescriptionCount = sizeof(Attributes) / sizeof(VkVertexInputAttributeDescription),
    .pVertexAttributeDescriptions = Attributes
};
The next required structure is VkPipelineInputAssemblyStateCreateInfo, which tells Vulkan how to assemble the primitives for rendering. We want our vertex data to be assembled into triangles, so we use VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST.
VkPipelineInputAssemblyStateCreateInfo PipelineInputAssemblyStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST,
    .primitiveRestartEnable = VK_FALSE
};
Next is the VkPipelineViewportStateCreateInfo
structure which tells Vulkan the rendering viewport and the scissor test rectangle.
Vulkan will render into the area given by the viewport rectangle, and discard pixels outside of the scissor test rectangle. For this program Vulkan can render to the entire screen, so we construct both the viewport and scissor test rectangle
to cover the entire screen.
VkViewport Viewport =
{
    .x = 0.0f,
    .y = 0.0f,
    .width = static_cast<float>(WIDTH),
    .height = static_cast<float>(HEIGHT),
    .minDepth = 0.0f,
    .maxDepth = 1.0f
};

VkRect2D Scissor =
{
    .offset =
    {
        .x = 0,
        .y = 0
    },
    .extent =
    {
        .width = WIDTH,
        .height = HEIGHT
    }
};

VkPipelineViewportStateCreateInfo PipelineViewportStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .viewportCount = 1,
    .pViewports = &Viewport,
    .scissorCount = 1,
    .pScissors = &Scissor
};
Next is the VkPipelineRasterizationStateCreateInfo
structure which controls rasterization options. We are interested in polygonMode, cullMode, and frontFace. The polygonMode field controls how the rasterizer will render polygons; in our case we want it to fill them in, so we use VK_POLYGON_MODE_FILL. The frontFace field tells the rasterizer which triangles are considered front-facing. We use VK_FRONT_FACE_COUNTER_CLOCKWISE, which means that triangles with their vertices ordered counter-clockwise will be considered to be facing the front. This matters because it lets us use a useful optimization called back face culling, which skips rendering triangles facing away from the viewer. Back face culling can be enabled by setting the cullMode to VK_CULL_MODE_BACK_BIT. This won't affect this program since our triangle always faces the viewer, but it will matter eventually when we start rendering 3D models/scenes. The rest of the options will be left at their defaults/zero.
VkPipelineRasterizationStateCreateInfo PipelineRasterizationStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .depthClampEnable = VK_FALSE,
    .rasterizerDiscardEnable = VK_FALSE,
    .polygonMode = VK_POLYGON_MODE_FILL,
    .cullMode = VK_CULL_MODE_BACK_BIT,
    .frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE,
    .depthBiasEnable = VK_FALSE,
    .depthBiasConstantFactor = 0.0f,
    .depthBiasClamp = 0.0f,
    .depthBiasSlopeFactor = 0.0f,
    .lineWidth = 1.0f
};
Next is the VkPipelineMultisampleStateCreateInfo
structure. This structure controls the multisampling options. In this program
we do not need to change anything; we just leave it at the defaults.
VkPipelineMultisampleStateCreateInfo PipelineMultisampleStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .rasterizationSamples = VK_SAMPLE_COUNT_1_BIT,
    .sampleShadingEnable = VK_FALSE,
    .minSampleShading = 0.0f,
    .pSampleMask = nullptr,
    .alphaToCoverageEnable = VK_FALSE,
    .alphaToOneEnable = VK_FALSE
};
The next structure is VkPipelineColorBlendStateCreateInfo
which controls the color blending options. This structure requires an array of VkPipelineColorBlendAttachmentState
structures,
and the length of the array should match the number of color attachments which this pipeline will render to. In our case we are only rendering to the framebuffer, so our array will be of size one.
We are not using any blending in this program, so we will leave everything default, except for the colorWriteMask
field of the VkPipelineColorBlendAttachmentState
structure, because
that controls the components that can be written to the color attachment.
VkPipelineColorBlendAttachmentState PipelineColorBlendAttachmentState =
{
    .blendEnable = VK_FALSE,
    .srcColorBlendFactor = VK_BLEND_FACTOR_ZERO,
    .dstColorBlendFactor = VK_BLEND_FACTOR_ZERO,
    .colorBlendOp = VK_BLEND_OP_ADD,
    .srcAlphaBlendFactor = VK_BLEND_FACTOR_ZERO,
    .dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO,
    .alphaBlendOp = VK_BLEND_OP_ADD,
    .colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT
};

VkPipelineColorBlendStateCreateInfo PipelineColorBlendStateInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .logicOpEnable = VK_FALSE,
    .logicOp = VK_LOGIC_OP_CLEAR,
    .attachmentCount = 1,
    .pAttachments = &PipelineColorBlendAttachmentState,
    .blendConstants = {0.0f, 0.0f, 0.0f, 0.0f}
};
The next structure is VkPipelineDepthStencilStateCreateInfo
, which controls the depth and stencil test options. We are not using the depth or stencil test in this program,
so we leave all the parameters as default.
VkPipelineDepthStencilStateCreateInfo DepthStencilInfo =
{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .depthTestEnable = VK_FALSE,
    .depthWriteEnable = VK_FALSE,
    .depthCompareOp = VK_COMPARE_OP_NEVER,
    .depthBoundsTestEnable = VK_FALSE,
    .stencilTestEnable = VK_FALSE,
    .front =
    {
        .failOp = VK_STENCIL_OP_KEEP,
        .passOp = VK_STENCIL_OP_KEEP,
        .depthFailOp = VK_STENCIL_OP_KEEP,
        .compareOp = VK_COMPARE_OP_NEVER,
        .compareMask = 0,
        .writeMask = 0,
        .reference = 0
    },
    .back =
    {
        .failOp = VK_STENCIL_OP_KEEP,
        .passOp = VK_STENCIL_OP_KEEP,
        .depthFailOp = VK_STENCIL_OP_KEEP,
        .compareOp = VK_COMPARE_OP_NEVER,
        .compareMask = 0,
        .writeMask = 0,
        .reference = 0
    },
    .minDepthBounds = 0.0f,
    .maxDepthBounds = 0.0f
};
Once we have all these structures prepared, we can create the graphics pipeline.
VkGraphicsPipelineCreateInfo GraphicsPipelineInfo =
{
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pNext = nullptr,
    .flags = 0,
    .stageCount = 2,
    .pStages = PipelineShaderStageInfo,
    .pVertexInputState = &PipelineVertexInputStateInfo,
    .pInputAssemblyState = &PipelineInputAssemblyStateInfo,
    .pTessellationState = nullptr,
    .pViewportState = &PipelineViewportStateInfo,
    .pRasterizationState = &PipelineRasterizationStateInfo,
    .pMultisampleState = &PipelineMultisampleStateInfo,
    .pDepthStencilState = &DepthStencilInfo,
    .pColorBlendState = &PipelineColorBlendStateInfo,
    .pDynamicState = nullptr,
    .layout = PipelineLayout,
    .renderPass = RenderPass,
    .subpass = 0,
    .basePipelineHandle = nullptr,
    .basePipelineIndex = -1
};
Assert(vkCreateGraphicsPipelines(Device, nullptr, 1, &GraphicsPipelineInfo, nullptr, &GraphicsPipeline) == VK_SUCCESS, "Failed to create graphics pipeline");
Once the graphics pipeline is created, we can free the shader modules.
if (VertexShaderModule != nullptr)
{
    vkDestroyShaderModule(Device, VertexShaderModule, nullptr);
    VertexShaderModule = nullptr;
}

if (FragmentShaderModule != nullptr)
{
    vkDestroyShaderModule(Device, FragmentShaderModule, nullptr);
    FragmentShaderModule = nullptr;
}
Once we have the vertex buffer and graphics pipeline set up, we can implement the actual rendering logic! The main loop will call this code every frame to render the triangle, until the user closes the program.
The first step is to get the next swapchain image using vkAcquireNextImageKHR
. This will give us the index of the next available image. We use the AcquireSemaphore
from section 7 here. When the image is available/ready to be rendered to, this semaphore will be signalled. When we submit the work to the graphics queue, the graphics queue will wait for
this semaphore to become signalled before beginning rendering.
uint32_t SwapchainIndex = 0;
Assert(vkAcquireNextImageKHR(Device, Swapchain, UINT64_MAX, AcquireSemaphore, nullptr, &SwapchainIndex) == VK_SUCCESS, "Could not get next surface image");
After we know which framebuffer/swapchain image we will be rendering to, we begin preparing the command buffer which will contain all our rendering commands.
VkCommandBufferBeginInfo CommandBufferBeginInfo =
{
    .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    .pNext = nullptr,
    .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
    .pInheritanceInfo = nullptr
};
Assert(vkBeginCommandBuffer(CommandBuffer, &CommandBufferBeginInfo) == VK_SUCCESS, "Failed to initialize command buffer");
The first command we insert into the command buffer is to begin the render pass. This will prepare the framebuffer which we are rendering to. Here we use the render pass object we created earlier and the current framebuffer, and we also specify the rendering rectangle (the whole screen) and the clear color (a dark blue in this case).
// Color buffer clear color
VkClearValue ClearColor;
ClearColor.color.float32[0] = 0.00f;
ClearColor.color.float32[1] = 0.00f;
ClearColor.color.float32[2] = 0.45f;
ClearColor.color.float32[3] = 0.00f;

VkRect2D RenderArea =
{
    .offset =
    {
        .x = 0,
        .y = 0
    },
    .extent =
    {
        .width = WIDTH,
        .height = HEIGHT
    }
};

VkRenderPassBeginInfo RenderPassBeginInfo =
{
    .sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
    .pNext = nullptr,
    .renderPass = RenderPass,
    .framebuffer = Framebuffers[SwapchainIndex],
    .renderArea = RenderArea,
    .clearValueCount = 1,
    .pClearValues = &ClearColor
};
vkCmdBeginRenderPass(CommandBuffer, &RenderPassBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
After the framebuffer is prepared, we bind our graphics pipeline.
vkCmdBindPipeline(CommandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, GraphicsPipeline);
Next we bind the vertex buffer, and tell Vulkan to render the triangle.
uint64_t pOffsets[1] = {0};
VkBuffer pBuffers[1] = { VertexBuffer };
const uint32_t VertexCount = sizeof(TriangleVertices) / sizeof(Vertex);

vkCmdBindVertexBuffers(CommandBuffer, 0, 1, pBuffers, pOffsets);
vkCmdDraw(CommandBuffer, VertexCount, 1, 0, 0);
After we are done with our rendering commands, we can end the render pass, and finalize the command buffer.
vkCmdEndRenderPass(CommandBuffer);
Assert(vkEndCommandBuffer(CommandBuffer) == VK_SUCCESS, "Failed to finalize command buffer");
Once the command buffer is populated, we have to submit it to our graphics queue. The VkSubmitInfo
structure will specify the command buffer we are submitting, and it
will also specify the semaphores which the queue should wait on before beginning rendering. As mentioned before, we want the graphics queue to wait for the framebuffer to become
available before it starts running, so we use our AcquireSemaphore
from earlier here. We want the graphics queue to wait at the top of the pipe, meaning before any of the shader stages begin, so we use VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
as the waiting stage. Note that we also give a semaphore to signal once the command buffer has finished executing; this will be useful to know when the image can be presented. For this we will use our second semaphore, the ReleaseSemaphore.
VkPipelineStageFlags WaitDstStageMasks[] = {VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT};

VkSubmitInfo SubmissionInfo =
{
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .pNext = nullptr,
    .waitSemaphoreCount = 1,
    .pWaitSemaphores = &AcquireSemaphore,
    .pWaitDstStageMask = WaitDstStageMasks,
    .commandBufferCount = 1,
    .pCommandBuffers = &CommandBuffer,
    .signalSemaphoreCount = 1,
    .pSignalSemaphores = &ReleaseSemaphore
};
Assert(vkQueueSubmit(GraphicsQueue, 1, &SubmissionInfo, Fence) == VK_SUCCESS, "Failed to submit command buffer");
At this point, the command buffer has been submitted to the queue, and we can ask Vulkan to present the new image when it is available. The ReleaseSemaphore will be signalled once the rendering operations finish. We fill out the structure VkPresentInfoKHR with the swapchain, the current swapchain index, and the ReleaseSemaphore, and call vkQueuePresentKHR.
VkPresentInfoKHR PresentInfo =
{
    .sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
    .pNext = nullptr,
    .waitSemaphoreCount = 1,
    .pWaitSemaphores = &ReleaseSemaphore,
    .swapchainCount = 1,
    .pSwapchains = &Swapchain,
    .pImageIndices = &SwapchainIndex,
    .pResults = nullptr
};
Assert(vkQueuePresentKHR(GraphicsQueue, &PresentInfo) == VK_SUCCESS, "Failed to present");
The semaphores we used above make the GPU queue wait at the right points, but we also need to make sure the CPU is synchronized.
We will accomplish this by waiting on the vkQueueSubmit
fence. Once this fence comes back, it will mean that the previous rendering operation is finished and we can
begin preparing the next frame. Note that this does not mean presentation has also finished, but that is ok because the next frame will be rendered to the next swapchain image.
Even if we fill all the available swapchain images, eventually the queue will have to wait for the next swapchain image to become available, and the wait on the fence will block the CPU.
Assert(vkWaitForFences(Device, 1, &Fence, VK_TRUE, 1 * NANOSECONDS_PER_SECOND) == VK_SUCCESS, "Fence timeout");
Assert(vkResetFences(Device, 1, &Fence) == VK_SUCCESS, "Could not reset fence");
Once you have made it here, we can finally see our precious triangle :) !
When the application is closed, we have to free all our allocations/objects, otherwise we will leak memory. For that, please see the destructor ~HelloTriangle in the source code.
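As a rough illustration only (the destructor in the repository is the authoritative version), teardown generally waits for the device to go idle and then destroys objects in roughly the reverse order of their creation:

// Illustrative teardown sketch; names match the objects created in this tutorial.
vkDeviceWaitIdle(Device); // Make sure the GPU is done before destroying anything

vkDestroyPipeline(Device, GraphicsPipeline, nullptr);
vkDestroyPipelineLayout(Device, PipelineLayout, nullptr);

for (uint32_t i = 0; i < NumSwapchainImages; i++)
{
    vkDestroyFramebuffer(Device, Framebuffers[i], nullptr);
    vkDestroyImageView(Device, SwapchainImageViews[i], nullptr);
}

vkDestroyRenderPass(Device, RenderPass, nullptr);
vkDestroySwapchainKHR(Device, Swapchain, nullptr);

vkDestroySemaphore(Device, AcquireSemaphore, nullptr);
vkDestroySemaphore(Device, ReleaseSemaphore, nullptr);
vkDestroyFence(Device, Fence, nullptr);

vkDestroyBuffer(Device, VertexBuffer, nullptr);
vkFreeMemory(Device, VertexBufferMemory, nullptr);
vkUnmapMemory(Device, UploadBufferMemory);
vkDestroyBuffer(Device, UploadBuffer, nullptr);
vkFreeMemory(Device, UploadBufferMemory, nullptr);

vkDestroyCommandPool(Device, CommandPool, nullptr);
vkDestroyDevice(Device, nullptr);
vkDestroySurfaceKHR(Instance, Surface, nullptr);
vkDestroyInstance(Instance, nullptr);

SDL_DestroyWindow(Window);
SDL_Vulkan_UnloadLibrary();
SDL_Quit();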