AOBaker

2018-2019 · C++ · DirectX 11 · raytracing

AOBaker is my tool for baking ambient occlusion textures.

Background

When I was messing around with modeling LEGO bricks, I wanted to generate ambient occlusion textures for them. I looked around for free/open-source software to do the job and found a few tools that looked promising at first, but each one I tried either didn't generate output that looked the way I wanted, had a frustrating UI, or both.

So, with DirectX Raytracing (DXR) having just launched, I started writing AOBaker. DXR requires DirectX 12, which I was using for the first time, so I took some extra time to learn the basics, but it really didn't appeal to me. When it was announced that AMD's video cards wouldn't support DXR for a long time, and that the DXR fallback layer was being discontinued, I decided to ditch DXR and switch to something that would work immediately on the mid-range AMD card I had at the time.

I found the RadeonRays project instead, which fit the bill exactly. With no dependency on DirectX 12, I was free to port everything back to DirectX 11, and my productivity and sanity increased tenfold. Now that the baking part of the project is just about complete, I'm turning my attention to improving the Dear ImGui-based UI.

Implementation

My AO baking implementation (contained entirely in BakeEngine.cpp) is an odd convergence of RadeonRays (OpenCL) and D3D11.

The first step in the baking process for a texture is to determine which rays need to be cast from which positions on behalf of which texels. I use RasterizeShader with D3D11 to render each mesh section that uses the target texture, but placing each vertex at its UV coordinates rather than its position coordinates. The RasterizeShader writes each pixel's world-space position, world-space normal, and UV coordinates to the PositionUBuffer and NormalVBuffer. Once the mesh section has been rendered, we copy the resulting pixel data back to the CPU.
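The UV-as-position trick can be sketched on the CPU. This is a hedged illustration of the vertex-stage math only (the function name and the V-flip convention are my assumptions, not code from the project): drawing with UVs as positions makes each triangle land exactly on the texels it covers in the AO texture.

```cpp
// Illustrative sketch: mapping a vertex's UV coordinate to D3D clip space,
// which is effectively what rendering "by UV instead of position" does.
// D3D clip space spans [-1, 1] with +Y up, while UV space spans [0, 1]
// with +V down, hence the flip on the V axis.
struct Float2 { float x, y; };

Float2 uvToClip(Float2 uv) {
    return { uv.x * 2.0f - 1.0f,    // U: [0,1] -> [-1,1]
             1.0f - uv.y * 2.0f };  // V: [0,1] -> [1,-1] (V axis flipped)
}
```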

However, pixels on the AO texture we'd like to bake can be covered by an arbitrary number of triangles in the mesh section, and we'd like to average the resulting occlusion value across every usage of the pixel. So, we need a way to rasterize each usage of a pixel exactly once. I use the stencil buffer to do just that, setting the stencil operation to increment the stencil value on every write. This way, I can rasterize the mesh section n times, telling the output-merger stage on iteration i to reject pixels whose stencil value has already reached i. Then, when I read back that iteration's output pixel data, I only count pixels whose stencil value equals i. For example, on iteration 4, pixels whose stencil value is already 4 cannot be overwritten by later triangles, and when we read the pixel data on the CPU, we ignore pixels whose stencil value is less than 4; those were either already counted in an earlier iteration or never covered at all.
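The peeling loop can be modeled on the CPU (this is not the actual D3D11 code). The sketch below assumes the stencil behavior described above: a pixel covered by k triangles ends iteration i with stencil value min(k, i), because the output merger stops accepting writes once the stencil catches up to the iteration. Counting pixels whose stencil equals the iteration then visits each triangle-usage of every pixel exactly once across all iterations.

```cpp
#include <algorithm>
#include <vector>

// CPU model of the stencil-based layer peeling: coverage[p] is the number
// of triangles covering pixel p; the return value is how many times each
// pixel would be counted across all iterations (should equal coverage[p]).
std::vector<int> countUsages(const std::vector<int>& coverage) {
    std::vector<int> counted(coverage.size(), 0);
    int maxLayers = coverage.empty()
        ? 0 : *std::max_element(coverage.begin(), coverage.end());
    for (int iter = 1; iter <= maxLayers; ++iter) {
        for (size_t p = 0; p < coverage.size(); ++p) {
            int stencil = std::min(coverage[p], iter); // stencil after pass i
            if (stencil == iter)    // newly exposed 'onion layer'
                ++counted[p];       // accumulate this usage
        }
    }
    return counted;
}
```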

I imagine this process like peeling an onion backwards - each iteration of the loop results in one more ‘onion layer’ being visible in the rendered image.

Once we have a layer’s rasterized data, we need to generate the rays that will be traced to approximate occlusion. This is done using the GenerateRaysComputeShader, and the results are written out in RadeonRays’ struct ray format into RayOutputBuffer.
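As a rough illustration of what the ray-generation step computes, here is a CPU-side sketch. The Ray struct is a simplified stand-in for RadeonRays' struct ray (origin plus maximum distance, and direction), and the uniform hemisphere sampling and surface bias are my assumptions rather than the project's actual scheme:

```cpp
#include <algorithm>
#include <cmath>

// Minimal vector math for the sketch.
struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Simplified stand-in for RadeonRays' ray layout.
struct Ray {
    Vec3 origin; float maxT;  // maxT caps the occlusion distance
    Vec3 dir;    float pad;
};

// u1, u2 in [0,1): uniform direction on the hemisphere around normal n.
Ray makeOcclusionRay(Vec3 pos, Vec3 n, float u1, float u2, float maxDist) {
    const float pi = 3.14159265358979f;
    float z = u1;  // cos(theta), uniform in [0,1)
    float r = std::sqrt(std::max(0.0f, 1.0f - z * z));
    float phi = 2.0f * pi * u2;
    Vec3 local = { r * std::cos(phi), r * std::sin(phi), z };

    // Orthonormal basis (t, b, n) around the texel's world-space normal.
    Vec3 up = std::fabs(n.z) < 0.999f ? Vec3{0, 0, 1} : Vec3{1, 0, 0};
    Vec3 t = normalize(cross(up, n));
    Vec3 b = cross(n, t);
    Vec3 dir = { t.x * local.x + b.x * local.y + n.x * local.z,
                 t.y * local.x + b.y * local.y + n.y * local.z,
                 t.z * local.x + b.z * local.y + n.z * local.z };

    const float bias = 1e-4f;  // nudge the origin off the surface
    return { { pos.x + n.x * bias, pos.y + n.y * bias, pos.z + n.z * bias },
             maxDist, dir, 0.0f };
}
```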

Next, we read the rays back onto the CPU and pass them to RadeonRays for tracing. This is pretty much the easiest step.

Then, we use ProcessIntersectionsComputeShader to add the results from the ray intersection tests to the running totals for the texture pixels.

After all of the 'onion layers' for a texture have been summed into the pixel totals (ResultBuffer), we use PostprocessShader to average the occlusion value and determine a final color for each baked pixel.
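The averaging itself is a simple division. A hedged sketch of the idea, where the greyscale mapping (unoccluded = white) is my assumption of what the final color looks like:

```cpp
// Average occlusion for a texel and map it to a greyscale value in [0, 1].
// hits is the number of rays that hit geometry, raysCast the total cast
// for this texel across all layers; a texel no rays were cast for stays white.
float bakeTexel(int hits, int raysCast) {
    if (raysCast == 0) return 1.0f;              // empty texel: leave white
    float occlusion = float(hits) / float(raysCast);
    return 1.0f - occlusion;                     // 1 = open sky, 0 = fully occluded
}
```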