Compute Shader in Unreal [Tutorial]

Posted by admin | 6 October 2019 | General, Blog, Downloads, Tutorial

Working with Compute Shaders in Unreal is complicated and annoying: there is little information on the web about how to implement, use or include them in Unreal, which is why I decided to write a short tutorial that covers the very basics of Compute Shaders in Unreal.

Since I assume that the readers of my blog are insanely intelligent and since I want to write a short and straight-to-the-point tutorial, this tutorial will not cover what a Compute Shader is or how it works, merely how to include and use it inside Unreal. The tutorial is based on, and will sometimes refer to, my Compute Shader plugin on GitHub: https://github.com/ValentinKraft/UE4_SortingComputeShader (which was created in the context of my master's thesis to efficiently sort point clouds in parallel on the GPU and which is in turn based on the project by Temaran, see: https://github.com/Temaran/UE4ShaderPluginDemo).

So, let’s begin!


Declaring and Setting up the shader for Unreal

First, I would recommend using my Compute Shader plugin (see link above) as a starting point for your project, since Compute Shaders require a lot of boilerplate code. I would also recommend encapsulating your Compute Shader in a plugin to clearly separate it from the rest of your code. Let's have a look at the structure of the plugin.

The actual HLSL shaders are in the Shader folder and the source files are, no big surprise, in the Source folder. In my case, the FComputeShaderDeclaration class defines the Compute Shader itself and declares the parameters, variables and so on. The ComputeShaderUsageExample class then executes the Compute Shader and provides the interface so that you can invoke your Shader from your Game/Main code.

Let's start with the FComputeShaderDeclaration class. First, you have to define what kind of variables your Shader should have as input and output. In my case, I want the Compute Shader to have some textures that it will work on, including an output texture, plus some constant and dynamic parameters. The constant and dynamic parameters (dynamic means that the value of the parameter might change during execution) can be defined as such in the ComputeShaderDeclaration.h file:

//This buffer should contain variables that never, or rarely change
BEGIN_UNIFORM_BUFFER_STRUCT(FComputeShaderConstantParameters, )
UNIFORM_MEMBER(float, SimulationSpeed)
END_UNIFORM_BUFFER_STRUCT(FComputeShaderConstantParameters)

//This buffer is for variables that change very often (each frame for example)
BEGIN_UNIFORM_BUFFER_STRUCT(FComputeShaderVariableParameters, )
UNIFORM_MEMBER(FVector4, CurrentCamPos)
UNIFORM_MEMBER(int, g_iLevel)
UNIFORM_MEMBER(int, g_iLevelMask)
UNIFORM_MEMBER(int, g_iWidth)
UNIFORM_MEMBER(int, g_iHeight)
END_UNIFORM_BUFFER_STRUCT(FComputeShaderVariableParameters)

typedef TUniformBufferRef<FComputeShaderConstantParameters> FComputeShaderConstantParametersRef;
typedef TUniformBufferRef<FComputeShaderVariableParameters> FComputeShaderVariableParametersRef;

Now we have to create a class for our shader that inherits from the FGlobalShader class (which in turn derives from FShader):

/***************************************************************************/
/* This class is what encapsulates the shader in the engine.               */
/* It is the main bridge between the HLSL located in the engine directory  */
/* and the engine itself.                                                  */
/***************************************************************************/
class FComputeShaderDeclaration : public FGlobalShader
{
	DECLARE_SHADER_TYPE(FComputeShaderDeclaration, Global);

public:

	FComputeShaderDeclaration() {}

	explicit FComputeShaderDeclaration(const ShaderMetaType::CompiledShaderInitializerType& Initializer);

	static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters& Parameters) {
		return GetMaxSupportedFeatureLevel(Parameters.Platform) >= ERHIFeatureLevel::SM5;
	};

	static void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment);

	virtual bool Serialize(FArchive& Ar) override
	{
		bool bShaderHasOutdatedParams = FGlobalShader::Serialize(Ar);

		Ar << OutputTexture;
		Ar << OutputColorTexture;
		Ar << PointPosData;
		Ar << PointPosDataBuffer;
		Ar << PointColorData;
		Ar << PointColorDataBuffer;

		return bShaderHasOutdatedParams;
	}

	// Sets the main output texture UAV (the point position texture)
	void SetOutputTexture(FRHICommandList& RHICmdList, FUnorderedAccessViewRHIRef OutputTextureUAV);
	// This function is required to bind our constant / uniform buffers to the shader.
	void SetUniformBuffers(FRHICommandList& RHICmdList, FComputeShaderConstantParameters& ConstantParameters, FComputeShaderVariableParameters& VariableParameters);
	// This is used to clean up the buffer binds after each invocation to let them be changed and used elsewhere if needed.
	void UnbindBuffers(FRHICommandList& RHICmdList);

	// Sets the unsorted point position input data
	void SetPointPosData(FRHICommandList& RHICmdList, FUnorderedAccessViewRHIRef BufferUAV, FUnorderedAccessViewRHIRef BufferUAV2);
	// Sets the unsorted point color input data
	void SetPointColorData(FRHICommandList& RHICmdList, FUnorderedAccessViewRHIRef BufferUAV, FUnorderedAccessViewRHIRef BufferUAV2);
	// Sets the output texture for the sorted point colors
	void SetPointColorTexture(FRHICommandList& RHICmdList, FUnorderedAccessViewRHIRef BufferUAV);

private:
	//This is the actual output resource that we will bind to the compute shader
	FShaderResourceParameter OutputTexture;
	FShaderResourceParameter OutputColorTexture;
	FShaderResourceParameter PointPosData;
	FShaderResourceParameter PointPosDataBuffer;
	FShaderResourceParameter PointColorData;
	FShaderResourceParameter PointColorDataBuffer;
};

All the buffers and textures you want to have in your shader have to be declared here. In my case, I declared the following textures and buffers:

  • OutputTexture;
  • OutputColorTexture;
  • PointPosData;
  • PointPosDataBuffer;
  • PointColorData;
  • PointColorDataBuffer;

The other methods (e.g. SetPointPosData) set the textures and buffers at runtime; we'll use them later. Now let's take a look at the cpp file. Here we have to bind the parameters we have created to the Shader to make them available inside it:

//These are needed to actually implement the constant buffers so they are available inside our shader
//They also need to be unique over the entire solution since they can in fact be accessed from any shader
IMPLEMENT_UNIFORM_BUFFER_STRUCT(FComputeShaderConstantParameters, TEXT("CSConstants"))
IMPLEMENT_UNIFORM_BUFFER_STRUCT(FComputeShaderVariableParameters, TEXT("CSVariables"))

FComputeShaderDeclaration::FComputeShaderDeclaration(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
: FGlobalShader(Initializer)
{
	//This call is what lets the shader system know that the surface OutputTexture is going to be available in the shader. The second parameter is the name it will be known by in the shader
	OutputTexture.Bind(Initializer.ParameterMap, TEXT("OutputTexture"));
	PointPosData.Bind(Initializer.ParameterMap, TEXT("PointPosData"));
	PointPosDataBuffer.Bind(Initializer.ParameterMap, TEXT("PointPosDataBuffer"));
	OutputColorTexture.Bind(Initializer.ParameterMap, TEXT("OutputColorTexture"));
	PointColorData.Bind(Initializer.ParameterMap, TEXT("PointColorData"));
	PointColorDataBuffer.Bind(Initializer.ParameterMap, TEXT("PointColorDataBuffer"));
}

In addition to that, the methods that set the textures and buffers inside the shader have to be implemented. The code is mostly boilerplate and can be copied for each texture/buffer you want to use:

void FComputeShaderDeclaration::SetPointColorTexture(FRHICommandList& RHICmdList, FUnorderedAccessViewRHIRef BufferUAV) {

	FComputeShaderRHIParamRef ComputeShaderRHI = GetComputeShader();

	if (OutputColorTexture.IsBound())
		RHICmdList.SetUAVParameter(ComputeShaderRHI, OutputColorTexture.GetBaseIndex(), BufferUAV);
}

Finally, we also have to hand over our constant and dynamic parameters as buffers to the Shader and define a method that unbinds the Buffers again:

void FComputeShaderDeclaration::SetUniformBuffers(FRHICommandList& RHICmdList, FComputeShaderConstantParameters& ConstantParameters, FComputeShaderVariableParameters& VariableParameters)
{
	FComputeShaderConstantParametersRef ConstantParametersBuffer;
	FComputeShaderVariableParametersRef VariableParametersBuffer;

	ConstantParametersBuffer = FComputeShaderConstantParametersRef::CreateUniformBufferImmediate(ConstantParameters, UniformBuffer_SingleDraw);
	VariableParametersBuffer = FComputeShaderVariableParametersRef::CreateUniformBufferImmediate(VariableParameters, UniformBuffer_SingleDraw);

	SetUniformBufferParameter(RHICmdList, GetComputeShader(), GetUniformBufferParameter<FComputeShaderConstantParameters>(), ConstantParametersBuffer);
	SetUniformBufferParameter(RHICmdList, GetComputeShader(), GetUniformBufferParameter<FComputeShaderVariableParameters>(), VariableParametersBuffer);
}

/* Unbinds buffers that will be used elsewhere */
void FComputeShaderDeclaration::UnbindBuffers(FRHICommandList& RHICmdList)
{
	FComputeShaderRHIParamRef ComputeShaderRHI = GetComputeShader();

	if (OutputTexture.IsBound())
		RHICmdList.SetUAVParameter(ComputeShaderRHI, OutputTexture.GetBaseIndex(), FUnorderedAccessViewRHIRef());
	if (OutputColorTexture.IsBound())
		RHICmdList.SetUAVParameter(ComputeShaderRHI, OutputColorTexture.GetBaseIndex(), FUnorderedAccessViewRHIRef());
	if (PointPosData.IsBound())
		RHICmdList.SetShaderResourceViewParameter(ComputeShaderRHI, PointPosData.GetBaseIndex(), FShaderResourceViewRHIParamRef());
	if (PointColorData.IsBound())
		RHICmdList.SetUAVParameter(ComputeShaderRHI, PointColorData.GetBaseIndex(), FUnorderedAccessViewRHIRef());
}

Finally, we have to tell Unreal where the Shader code (the .usf file) is located using the IMPLEMENT_SHADER_TYPE macro. We also have to specify the name of the function in the shader that we want to call (in my case "MainComputeShader") and the type of the shader (a Compute Shader):

IMPLEMENT_SHADER_TYPE(, FComputeShaderDeclaration, TEXT("/ComputeShaderPlugin/BitonicSortingKernelComputeShader.usf"), TEXT("MainComputeShader"), SF_Compute);

The Shader code

Now, let's take a short look at the shader. First, we define the textures and buffers we declared inside our FComputeShaderDeclaration class:

RWTexture2D<float4> OutputTexture : register(u0);               // Point Positions Output UAV Texture
RWTexture2D<float4> OutputColorTexture : register(u3);          // Point Colors Output UAV Texture
RWStructuredBuffer<float4> PointPosData : register(u1);         // Point Positions Input Buffer
RWStructuredBuffer<float4> PointColorData : register(u4);       // Point Colors Input Buffer
RWStructuredBuffer<float4> PointPosDataBuffer : register(u2);
RWStructuredBuffer<float4> PointColorDataBuffer : register(u5);

To access the constant and variable parameters we declared earlier, we go through the names given in the IMPLEMENT_UNIFORM_BUFFER_STRUCT macros (CSConstants and CSVariables):

float3 camPos = CSVariables.CurrentCamPos.xyz;
float simulationSpeed = CSConstants.SimulationSpeed;

You can now write your shader inside the MainComputeShader function:

[numthreads(BITONIC_BLOCK_SIZE, 1, 1)]
void MainComputeShader(uint3 Gid : SV_GroupID,             //atm: -, 0...256, - in rows (Y)        --> current group index (dispatched by c++)
                       uint3 DTid : SV_DispatchThreadID,   //atm: 0...256 in rows & columns (XY)   --> "global" thread id
                       uint3 GTid : SV_GroupThreadID,      //atm: 0...256, -,- in columns (X)      --> current threadId in group / "local" threadId
                       uint GI : SV_GroupIndex)            //atm: 0...256 in columns (X)           --> "flattened" index of a thread within a group
{
//...
}
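The four system values in the signature above are related by fixed formulas. As a rough CPU-side sketch (plain C++, not Unreal or plugin code; the UInt3 and ThreadIds types and the ComputeThreadIds helper are hypothetical names made up for illustration), this is how they could be derived for a single thread:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; not part of Unreal or the plugin.
struct UInt3 { std::uint32_t X, Y, Z; };

struct ThreadIds
{
    UInt3 GroupId;            // corresponds to SV_GroupID
    UInt3 GroupThreadId;      // corresponds to SV_GroupThreadID
    UInt3 DispatchThreadId;   // corresponds to SV_DispatchThreadID
    std::uint32_t GroupIndex; // corresponds to SV_GroupIndex
};

// Computes the system values one thread would receive for a given
// [numthreads(X, Y, Z)] declaration, group ID and local thread ID.
inline ThreadIds ComputeThreadIds(UInt3 NumThreads, UInt3 GroupId, UInt3 GroupThreadId)
{
    // The local thread ID must lie inside the thread group.
    assert(GroupThreadId.X < NumThreads.X);
    assert(GroupThreadId.Y < NumThreads.Y);
    assert(GroupThreadId.Z < NumThreads.Z);

    ThreadIds Ids;
    Ids.GroupId = GroupId;
    Ids.GroupThreadId = GroupThreadId;

    // Global ID = group ID * group size + local ID, per component.
    Ids.DispatchThreadId = {
        GroupId.X * NumThreads.X + GroupThreadId.X,
        GroupId.Y * NumThreads.Y + GroupThreadId.Y,
        GroupId.Z * NumThreads.Z + GroupThreadId.Z
    };

    // Flattened local index: z * (X * Y) + y * X + x.
    Ids.GroupIndex = GroupThreadId.Z * NumThreads.X * NumThreads.Y
                   + GroupThreadId.Y * NumThreads.X
                   + GroupThreadId.X;
    return Ids;
}
```

With a group size of (256, 1, 1), which is only an assumed value for BITONIC_BLOCK_SIZE, the thread with local ID (5, 0, 0) in group (0, 3, 0) gets the global ID (5, 3, 0) and the flattened index 5. That matches the comments in the signature: the group ID selects the row (Y), the local ID selects the column (X).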

Executing the Compute Shader

Now that we have declared our shader, we can start to write the code for executing the Shader. This is done in the ComputeShaderUsageExample file, in which we declare an FComputeShader class. In the constructor of the class, we create the required textures, buffers and variables:

FComputeShader::FComputeShader(float SimulationSpeed, int32 SizeX, int32 SizeY, ERHIFeatureLevel::Type ShaderFeatureLevel)
{
	FeatureLevel = ShaderFeatureLevel;
	ConstantParameters.SimulationSpeed = SimulationSpeed;
	VariableParameters = FComputeShaderVariableParameters();

	bIsComputeShaderExecuting = false;
	bIsUnloading = false;
	bSave = false;

	// Create textures
	FRHIResourceCreateInfo CreateInfo;
	m_SortedPointPosTex = RHICreateTexture2D(SizeX, SizeY, PF_A32B32G32R32F, 1, 1, TexCreate_ShaderResource | TexCreate_UAV, CreateInfo);
	m_SortedPointPosTex_UAV = RHICreateUnorderedAccessView(m_SortedPointPosTex);

	m_SortedPointColorsTex = RHICreateTexture2D(SizeX, SizeY, PF_A32B32G32R32F, 1, 1, TexCreate_ShaderResource | TexCreate_UAV, CreateInfo);
	m_SortedPointColorsTex_UAV = RHICreateUnorderedAccessView(m_SortedPointColorsTex);

	// Initialise data buffers with invalid values
	PointPosData.Init(ZeroVector, NUM_ELEMENTS);
	PointColorData.Init(FVector4(0.0f, 1.0f, 0.0f, 0.0f), NUM_ELEMENTS);

	// Create UAVs for point position buffer
	CreateInfo.ResourceArray = &PointPosData;
	m_PointPosDataBuffer = RHICreateStructuredBuffer(sizeof(float) * 4, sizeof(float) * 4 * NUM_ELEMENTS, BUF_UnorderedAccess | BUF_ShaderResource, CreateInfo);
	m_PointPosDataBuffer_UAV = RHICreateUnorderedAccessView(m_PointPosDataBuffer, false, false);
	m_PointPosDataBuffer_UAV2 = RHICreateUnorderedAccessView(m_PointPosDataBuffer, false, false);

	// Create UAVs for point colors buffer
	CreateInfo.ResourceArray = &PointColorData;
	m_PointColorsDataBuffer = RHICreateStructuredBuffer(sizeof(float) * 4, sizeof(float) * 4 * NUM_ELEMENTS, BUF_UnorderedAccess | BUF_ShaderResource, CreateInfo);
	m_PointColorsDataBuffer_UAV = RHICreateUnorderedAccessView(m_PointColorsDataBuffer, false, false);
	m_PointColorsDataBuffer_UAV2 = RHICreateUnorderedAccessView(m_PointColorsDataBuffer, false, false);
}
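The first two arguments of RHICreateStructuredBuffer in the constructor above are the stride of one element and the total size in bytes. As a small stand-alone sketch (plain C++; the Float4 and BufferLayout types and the ComputeBufferLayout helper are hypothetical and only mirror the arithmetic used above), the two numbers can be derived like this:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a four-component float element such as FVector4.
struct Float4 { float X, Y, Z, W; };

// The constructor above hard-codes sizeof(float) * 4 as the stride;
// this check makes sure the stand-in element type really has that size.
static_assert(sizeof(Float4) == sizeof(float) * 4, "Float4 must be tightly packed");

struct BufferLayout
{
    std::size_t Stride;      // bytes per element
    std::size_t SizeInBytes; // bytes for the whole buffer
};

// Computes stride and total size for a buffer of NumElements Float4 values.
inline BufferLayout ComputeBufferLayout(std::size_t NumElements)
{
    assert(NumElements > 0);
    BufferLayout Layout;
    Layout.Stride = sizeof(Float4);
    Layout.SizeInBytes = Layout.Stride * NumElements;
    return Layout;
}
```

Deriving both values from one element type keeps the stride and the size in sync if the element layout ever changes. The element count used in a real project is whatever NUM_ELEMENTS is defined as; its value is not given in this article.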

Now, it is finally time to execute the Compute Shader. Keep in mind that the shader has to be executed on Unreal's render thread. Therefore, we declare two methods, ExecuteComputeShader and ExecuteComputeShaderInternal. The latter should only be called from the render thread, so inside ExecuteComputeShader we use a macro that enqueues the call on the rendering thread:

void FComputeShader::ExecuteComputeShader(FVector4 currentCamPos)
{
	if (bIsUnloading || bIsComputeShaderExecuting) //Skip this execution round if we are already executing
		return;

	bIsComputeShaderExecuting = true;

	//Now set our runtime parameters!
	VariableParameters.CurrentCamPos = currentCamPos;

	//This macro sends the function we declare inside to be run on the render thread. What we do is essentially just send this class and tell the render thread to run the internal render function as soon as it can.
	//I am still not 100% certain about the thread safety of this. If you are getting crashes, depending on how advanced the code at the start of the ExecuteComputeShader function is, you might have to use a lock :)
	ENQUEUE_UNIQUE_RENDER_COMMAND_ONEPARAMETER(
		FComputeShaderRunner,
		FComputeShader*, ComputeShader, this,
		{
			ComputeShader->ExecuteComputeShaderInternal();
		}
	);
}
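The "skip if already executing" guard combined with deferred execution can be tried out without the engine. The following is only a conceptual sketch in plain C++: FakeRenderQueue and GuardedRunner are hypothetical names, the queue is a simple stand-in and not how Unreal's render command system is implemented, and it assumes that the internal function clears the flag again when it finishes (that part is not shown in this article). It uses an atomic flag, which is one possible answer to the thread-safety concern in the comment above. Newer engine versions replace the macro with a lambda-based ENQUEUE_RENDER_COMMAND, but the idea stays the same.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Stand-in for the render thread's command queue (hypothetical).
class FakeRenderQueue
{
public:
    void Enqueue(std::function<void()> Command) { Commands.push(std::move(Command)); }

    // Runs all pending commands; in the engine this would happen on the render thread.
    void Flush()
    {
        while (!Commands.empty())
        {
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop();
            Command();
        }
    }

private:
    std::queue<std::function<void()>> Commands;
};

class GuardedRunner
{
public:
    explicit GuardedRunner(FakeRenderQueue& InQueue) : Queue(InQueue) {}

    // Returns true if work was enqueued, false if a previous run is still pending.
    bool Execute()
    {
        bool Expected = false;
        // Atomically flip the flag from false to true; fails if it is already set.
        if (!bIsExecuting.compare_exchange_strong(Expected, true))
            return false;

        Queue.Enqueue([this]() { ExecuteInternal(); });
        return true;
    }

    int GetRunCount() const { return RunCount; }

private:
    void ExecuteInternal()
    {
        assert(bIsExecuting.load()); // only reachable through Execute()
        ++RunCount;                  // placeholder for the actual dispatch
        bIsExecuting.store(false);   // allow the next execution round
    }

    FakeRenderQueue& Queue;
    std::atomic<bool> bIsExecuting{false};
    int RunCount = 0;
};
```

Calling Execute() twice before the queue is flushed enqueues only one command; after Flush() the flag is cleared and the next call is accepted again.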

The execution of the Compute Shader will now take place inside this ExecuteComputeShaderInternal function:

void FComputeShader::ExecuteComputeShaderInternal(){
	/* Get global RHI command list */
	FRHICommandListImmediate& RHICmdList = GRHICommandList.GetImmediateCommandList();
	/* Create Compute Shader */
	TShaderMapRef<FComputeShaderDeclaration> ComputeShader(GetGlobalShaderMap(FeatureLevel));

//...
}

We can now call the previously defined functions to pass data to the shader:

	/* Pass input data to shader */
	ComputeShader->SetPointPosData(RHICmdList, m_PointPosDataBuffer_UAV, m_PointPosDataBuffer_UAV2);
	ComputeShader->SetPointColorData(RHICmdList, m_PointColorsDataBuffer_UAV, m_PointColorsDataBuffer_UAV2);

And we finally execute the Compute Shader as follows:

		RHICmdList.SetComputeShader(ComputeShader->GetComputeShader());
		DispatchComputeShader(RHICmdList, *ComputeShader, 1, NUM_ELEMENTS / BITONIC_BLOCK_SIZE, 1);
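The dispatch above computes the group count with a plain integer division, which is exact only if NUM_ELEMENTS is a multiple of BITONIC_BLOCK_SIZE. If that cannot be guaranteed, a rounded-up division is a common alternative. A minimal sketch in plain C++ (ComputeGroupCount is a hypothetical helper, not part of the plugin or of Unreal):

```cpp
#include <cassert>
#include <cstdint>

// Number of thread groups needed so that every element gets one thread.
// Rounds up, so a partially filled last group is still dispatched.
inline std::uint32_t ComputeGroupCount(std::uint32_t NumElements, std::uint32_t GroupSize)
{
    assert(GroupSize > 0);
    return (NumElements + GroupSize - 1) / GroupSize;
}
```

For example, 1024 elements with a group size of 256 need 4 groups, while 1025 elements need 5. When the count is rounded up, the last group contains threads without a matching element, so the shader has to compare its thread ID against the element count before reading or writing.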

In the end we have to unbind the buffers again:

ComputeShader->UnbindBuffers(RHICmdList);

And hey - we did it! We can finally create and execute the Compute Shader in our standard Unreal code:

mComputeShader = new FComputeShader(1.0f, BITONIC_BLOCK_SIZE, BITONIC_BLOCK_SIZE, currentWorld->Scene->GetFeatureLevel());
mComputeShader->ExecuteComputeShader(FVector4(currentCamPos));

Easy, huh? :'D
