This tutorial shows how to record command lists in parallel from multiple threads.
This tutorial generates the same output as Tutorial05, but renders every cube using an individual draw call. It shows how recording commands can be split between multiple threads. Note that this tutorial illustrates the API usage; for this specific rendering problem, instancing is a more efficient solution. However, multithreading in a real application can be implemented in the same way as shown here.
This tutorial uses shaders from Tutorial03. While the pixel shader is exactly the same, the vertex shader applies rotation and an instance-specific transformation before the global view-projection transform. The instance transform matrix resides in its own constant buffer that is updated every time a new instance is rendered.
The pipeline state, shaders, and vertex and index buffers are initialized in the same way as in previous tutorials. What is different is that this time we load every texture individually and then bind each texture to its own shader resource binding (SRB) object.
This example illustrates the expected usage of mutable shader resources: the app creates several SRB objects encompassing different resource bindings.
This tutorial explicitly transitions all resources to required states using the IDeviceContext::TransitionResourceStates() method. The method takes an array of StateTransitionDesc structures. Each structure defines the resource to transition, as well as its old and new states. The old state can be set to RESOURCE_STATE_UNKNOWN, in which case the engine will use the internal resource state. For a texture, the structure also defines the range of array slices and mip levels to transition. For example, the tutorial uses this method to transition the vertex and index buffers to their required states.
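The fallback rule for an unknown old state can be illustrated with a small model. This is a minimal sketch in standard C++, not Diligent Engine code: `ResourceState`, `Resource`, `Transition`, and `ApplyTransitions` are hypothetical names invented for this illustration, and the model assumes every transition also updates the tracked state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Diligent Engine types.
enum class ResourceState { Unknown, Undefined, VertexBuffer, IndexBuffer };

struct Resource {
    ResourceState state = ResourceState::Undefined; // state tracked internally
};

struct Transition {
    Resource*     resource;
    ResourceState oldState; // Unknown means "use the tracked state"
    ResourceState newState;
};

// Applies every transition in the array and returns how many of them
// actually changed a state (i.e. would need a real barrier).
int ApplyTransitions(const std::vector<Transition>& transitions) {
    int barriers = 0;
    for (const Transition& t : transitions) {
        assert(t.resource != nullptr);
        // Fall back to the internally tracked state when the caller passes Unknown.
        const ResourceState from =
            (t.oldState == ResourceState::Unknown) ? t.resource->state : t.oldState;
        if (from != t.newState)
            ++barriers;
        t.resource->state = t.newState; // assumption: the tracked state is always updated
    }
    return barriers;
}
```

In this model, transitioning a vertex buffer and an index buffer once yields two barriers, and repeating the same call yields none because the tracked states already match.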
When resources are explicitly transitioned to the correct states, the engine does not need to check the states at every draw command, which greatly reduces the overhead.
All rendering commands in Diligent Engine are issued through device contexts. Similar to Direct3D11, there are two types of contexts: immediate and deferred. An immediate context records rendering commands and implicitly submits them for execution. Deferred contexts can only record commands to a command list that can later be executed through the immediate context. Deferred contexts should be created for every worker thread that records rendering commands.
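The difference between the two context types can be sketched with a toy model. The classes below are hypothetical stand-ins written in standard C++ and are not the Diligent Engine interfaces; they only illustrate that a deferred context accumulates commands without running them, while the immediate context runs a finished command list.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Diligent Engine types.
using Command     = std::function<void()>;
using CommandList = std::vector<Command>;

// Records commands but never runs them itself.
class DeferredContext {
public:
    void Record(Command cmd) { m_Pending.push_back(std::move(cmd)); }

    // Hands over everything recorded so far and starts a fresh recording.
    CommandList FinishCommandList() {
        CommandList list = std::move(m_Pending);
        m_Pending.clear();
        return list;
    }

private:
    CommandList m_Pending;
};

// Runs commands right away, either one by one or from a recorded list.
class ImmediateContext {
public:
    void Execute(const Command& cmd) { cmd(); }

    void ExecuteCommandList(const CommandList& list) {
        for (const Command& cmd : list)
            cmd();
    }
};
```

Recording into `DeferredContext` has no visible effect until the resulting list is passed to `ImmediateContext::ExecuteCommandList`, which is the property that lets worker threads record in parallel.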
The main thread coordinates the execution of worker threads and handles the recorded command lists. Each frame, it starts by signaling all worker threads to start and then renders its own subset. It then waits until the worker threads signal that all command lists are ready and executes them. Finally, it tells the worker threads to proceed to the next frame.
Every worker thread starts by waiting for the signal from the main thread (a negative value is an exit signal). The thread then renders its allotted subset using its own deferred context. When all commands are recorded, the thread requests a command list from the deferred context; the main thread executes this list later.
When all threads are done recording the commands, the last thread signals the main thread that it can start executing the command lists. The threads then wait for the signal from the main thread to proceed to the next frame. After the signal is received, every thread calls FinishFrame() to release all dynamic resources allocated by its deferred context. This must be done after the command lists have been submitted for execution.
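The handshake between the main thread and the workers can be sketched in standard C++. This is a simplified model under stated assumptions, not the tutorial's code: `FrameCoordinator`, `RecordSubset`, and the integer "command lists" are hypothetical, a single mutex and condition variable replace the engine's signaling primitives, and cubes are split into contiguous ranges.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in: a "command list" is just the cube indices a thread recorded.
using CommandList = std::vector<int>;

class FrameCoordinator {
public:
    FrameCoordinator(int numWorkers, int numCubes)
        : m_NumWorkers(numWorkers), m_NumCubes(numCubes), m_CmdLists(numWorkers) {
        assert(numWorkers >= 0 && numCubes >= 0);
        for (int id = 0; id < numWorkers; ++id)
            m_Threads.emplace_back([this, id] { WorkerLoop(id); });
    }

    ~FrameCoordinator() {
        { std::lock_guard<std::mutex> lock(m_Mutex); m_StartSignal = -1; } // negative = exit
        m_Cond.notify_all();
        for (std::thread& t : m_Threads) t.join();
    }

    // Main-thread side of one frame; returns the cubes in execution order.
    std::vector<int> RenderFrame() {
        { std::lock_guard<std::mutex> lock(m_Mutex); m_NumCompleted = 0; ++m_StartSignal; }
        m_Cond.notify_all();                          // 1. signal workers to start
        std::vector<int> executed = RecordSubset(0);  // 2. render own subset
        {
            std::unique_lock<std::mutex> lock(m_Mutex);
            // 3. wait until the last worker reports that all lists are ready
            m_Cond.wait(lock, [this] { return m_NumCompleted == m_NumWorkers; });
            for (const CommandList& list : m_CmdLists) // "execute" the lists
                executed.insert(executed.end(), list.begin(), list.end());
            m_NextFrameSignal = m_StartSignal;         // 4. release workers
        }
        m_Cond.notify_all();
        return executed;
    }

private:
    // Records the contiguous range of cubes assigned to the given subset.
    CommandList RecordSubset(int subset) const {
        const int numSubsets = m_NumWorkers + 1;
        const int size  = m_NumCubes / numSubsets;
        const int begin = size * subset;
        const int end   = (subset < numSubsets - 1) ? size * (subset + 1) : m_NumCubes;
        CommandList list;
        for (int cube = begin; cube < end; ++cube) list.push_back(cube);
        return list;
    }

    void WorkerLoop(int id) {
        for (int frame = 1;; ++frame) {
            {
                std::unique_lock<std::mutex> lock(m_Mutex);
                m_Cond.wait(lock, [&] { return m_StartSignal < 0 || m_StartSignal == frame; });
                if (m_StartSignal < 0) return;         // exit signal
            }
            CommandList list = RecordSubset(1 + id);   // record without holding the lock
            std::unique_lock<std::mutex> lock(m_Mutex);
            m_CmdLists[id] = std::move(list);
            if (++m_NumCompleted == m_NumWorkers)      // the last worker wakes the main thread
                m_Cond.notify_all();
            m_Cond.wait(lock, [&] { return m_NextFrameSignal == frame; });
            // Per-thread end-of-frame cleanup would go here, after submission.
        }
    }

    const int                m_NumWorkers;
    const int                m_NumCubes;
    std::vector<CommandList> m_CmdLists;
    std::mutex               m_Mutex;
    std::condition_variable  m_Cond;
    int                      m_StartSignal     = 0; // frame number, or negative to exit
    int                      m_NextFrameSignal = 0;
    int                      m_NumCompleted    = 0;
    std::vector<std::thread> m_Threads;
};
```

Using frame numbers as signal values, rather than plain flags, keeps a slow worker from mistaking the next frame's start signal for the current one. With three workers and ten cubes, each call to `RenderFrame()` in this model returns the cubes 0 through 9 in order, because the main thread's subset comes first and the worker lists are consumed by thread index.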
The subset rendering procedure is generally the same as in previous tutorials, but a few details are worth mentioning.
Note that render targets are set and transitioned to the correct states by the main thread, so we use the RESOURCE_STATE_TRANSITION_MODE_VERIFY flag to double-check that the states are correct.