DirectX 11 and WinUI 3.0

Prepping For a Camera

Let's continue with our look at how to render stuff with DirectX 11 inside a WinUI 3.0 window.

Wire-Wrapping the Model

The solid red color makes it hard to judge the model's shape at this point. I'd much rather see the individual vertices and edges. We can achieve this with wireframe mode. First, let's define a rasterizer state and a depth stencil state. These objects control how the graphics pipeline handles rendering; the fields will hold references to the created states, allowing the application to bind and manage them during rendering operations.

ID3D11RasterizerState is used to store a state that determines how geometry is rendered, including settings for culling, polygon filling, and depth clipping.

Culling is a process where triangles facing away from the camera are skipped entirely. It improves performance by avoiding work on surfaces that could never be visible anyway.

Polygon filling refers to coloring or shading the interior of a closed shape, like a triangle, to give it a solid appearance by determining which pixels or fragments within the shape should be colored based on their positions and properties.

Depth clipping determines which parts of a 3D object should be visible or hidden based on their distance from the viewer. It ensures that only the parts within the viewing range are displayed, while the rest is clipped or removed from the scene.

C#
private ID3D11RasterizerState rasterizerState;

public void CreateResources()
{
	...
	RasterizerDescription rasterizerStateDescription = 
		new RasterizerDescription(CullMode.Back, FillMode.Wireframe)
	{
		FrontCounterClockwise = true,
		DepthBias = 0,
		DepthBiasClamp = 0f,
		SlopeScaledDepthBias = 0f,
		DepthClipEnable = true,
		ScissorEnable = false,
		MultisampleEnable = true,
		AntialiasedLineEnable = false
	};
	rasterizerState = device.CreateRasterizerState(rasterizerStateDescription);
}

public void SetRenderState()
{
	...
	deviceContext.RSSetState(rasterizerState);
	// RSSetState makes the context hold its own reference to the state,
	// so releasing ours here is fine as long as SetRenderState runs once.
	// If it runs per frame, move Dispose to your cleanup code instead.
	rasterizerState.Dispose();
}

In this specific example, the RasterizerDescription is being initialized with various properties:

  • CullMode.Back specifies that back-facing triangles should be culled, or not drawn, so only front-facing triangles are rendered. Which winding order counts as back-facing is decided by the FrontCounterClockwise setting below; with our settings, triangles wound clockwise (when viewed from the camera) are the ones skipped.
  • FillMode.Wireframe specifies that the triangles should be rendered as wireframes or outlines instead of solid surfaces. This allows you to see the edges of the triangles. So far, we've been using Solid instead.
  • FrontCounterClockwise = true specifies that triangles with a counter-clockwise winding order (when viewed from the camera) are considered front-facing. This determines which side of the triangles is the front. Since Blender uses CCW winding order for defining the front-facing triangles by default, we set this to true.
  • DepthBias, DepthBiasClamp, and SlopeScaledDepthBias control depth bias, which is used to reduce visual artifacts when rendering overlapping triangles. They help address "z-fighting", where surfaces at nearly identical depths flicker as they fight for visibility. Let's leave them at zero for now.
  • DepthClipEnable specifies that depth clipping should be enabled. This ensures that objects are properly clipped or not rendered when they are outside the view frustum (visible part of the scene) or beyond the visible depth range.
  • ScissorEnable specifies that scissor testing should be disabled. Scissor testing is used to restrict rendering to a specific rectangular area of the screen. By enabling scissor testing and defining a rectangular region, we could control which part of the screen is affected by rendering operations. Only pixels within the defined region will be considered for rendering, while pixels outside the region will be ignored.
  • MultisampleEnable: Multisampling is a technique used to reduce jagged edges or aliasing in rendered images. This flag lets the rasterizer use the multisample antialiasing rule, but it only takes effect when rendering into a multisampled render target.
  • AntialiasedLineEnable: Antialiased lines can be used to smooth the appearance of lines in the rendered image. However, antialiasing may depend on the graphics hardware, driver settings, or the specific rendering techniques, so setting this alone to true doesn't give us antialiasing.

After this, we create the rasterizerState, and in SetRenderState we tell the device context to use it.
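Since the fill mode is baked into the state object at creation time, a common pattern is to create both a solid and a wireframe state up front and switch between them at runtime, because switching states is cheap while creating them is not. Here's a minimal sketch assuming the same device and deviceContext fields used above; the wireframeEnabled flag is a hypothetical toggle:

```csharp
private ID3D11RasterizerState solidState;
private ID3D11RasterizerState wireframeState;
private bool wireframeEnabled = true; // hypothetical toggle flag

public void CreateRasterizerStates()
{
	// Shared settings; only FillMode differs between the two states.
	RasterizerDescription description =
		new RasterizerDescription(CullMode.Back, FillMode.Solid)
	{
		FrontCounterClockwise = true,
		DepthClipEnable = true,
		MultisampleEnable = true
	};
	solidState = device.CreateRasterizerState(description);

	// RasterizerDescription is a struct, so we can tweak the copy and reuse it.
	description.FillMode = FillMode.Wireframe;
	wireframeState = device.CreateRasterizerState(description);
}

public void SetRenderState()
{
	deviceContext.RSSetState(wireframeEnabled ? wireframeState : solidState);
}
```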

When we render objects on the screen, we need to decide which objects appear in front and which ones are hidden behind others. The ID3D11DepthStencilState is a tool that helps us control this. It uses something called the "depth buffer" to keep track of the depth, or distance, of each pixel on the screen.

Imagine you have a stack of transparent sheets and you want to draw different objects on each sheet. The stencil part of ID3D11DepthStencilState is like a mask or stencil that controls where and how things are drawn on each sheet. The depth part helps with object occlusion, where one object blocks another: by using the depth buffer, we can make sure that only the visible parts of objects are drawn, and the hidden parts are properly obscured. Depth information also plays a role when rendering transparent objects, where depth writes are often disabled so that we can still see what's behind them.

By using the ID3D11DepthStencilState in combination with the depth buffer, you can control the rendering process and ensure that only the visible portions of objects are rendered while hidden or obscured portions are not drawn. This helps optimize rendering performance and achieve realistic visual effects.

C#
private ID3D11DepthStencilState depthStencilState;

public void CreateResources()
{
	...
	DepthStencilDescription depthStencilDescription = 
		new DepthStencilDescription(true, DepthWriteMask.All, ComparisonFunction.LessEqual)
	{
		StencilEnable = false,
		StencilReadMask = byte.MaxValue,
		StencilWriteMask = byte.MaxValue,
		FrontFace = DepthStencilOperationDescription.Default,
		BackFace = DepthStencilOperationDescription.Default
	};
	depthStencilState = device.CreateDepthStencilState(depthStencilDescription);
}

public void SetRenderState()
{
	...
	deviceContext.OMSetDepthStencilState(depthStencilState, 1);
	// As with the rasterizer state, the context holds its own reference
	// after this call; if SetRenderState runs per frame, move Dispose
	// to your cleanup code instead.
	depthStencilState.Dispose();
}
  • The first parameter true enables depth testing.
  • DepthWriteMask set to All means that all pixels rendered will write their depth values to the depth buffer.
  • ComparisonFunction set to LessEqual means a new pixel will be drawn if its depth value is less than or equal to the existing depth value in the depth buffer.
  • StencilEnable set to false disables stencil testing. Stencil testing is a technique used to control and manipulate rendering based on per-pixel stencil values, but we don't need it here.
  • StencilReadMask and StencilWriteMask: These lines set the stencil read and write masks to allow the maximum range of stencil values (0 to 255). The stencil mask determines which bits of the stencil buffer are used for reading and writing stencil values.
  • FrontFace and BackFace set to DepthStencilOperationDescription.Default: The depth-stencil operation determines how the depth and stencil values are updated during rendering for front-facing and back-facing polygons. Default means that the existing depth and stencil values in the depth-stencil buffer are preserved without any changes being made.

Then, create the depthStencilState and set the device context to use it in SetRenderState. Running the application should show our 3D monkey in a splendid wireframe mode.
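For the LessEqual comparison to work, the depth buffer has to start each frame at the farthest possible value. A sketch of the per-frame clear, assuming a depthStencilView was created alongside the render target (that setup isn't shown in this article's excerpts):

```csharp
public void RenderFrame()
{
	// Reset every depth sample to 1.0 (the far value) so the first pixel
	// drawn at any screen position always passes the depth test.
	deviceContext.ClearDepthStencilView(
		depthStencilView,
		DepthStencilClearFlags.Depth,
		1.0f,   // depth cleared to the far value
		0);     // stencil cleared to 0 (unused here, stencil is disabled)

	// ... clear the render target and draw as usual ...
}
```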

Matrices

I want to add a camera to our scene. Currently, we don't have an explicitly defined camera, and no view transformation is applied at all: Direct3D 11 simply interprets the positions our vertex shader outputs as clip-space coordinates, which behaves as if an observer were sitting at the origin (0, 0, 0) looking straight down the z-axis. By defining a camera and specifying its view and projection matrices, we can have more control over how the 3D model is displayed in the viewport, including its position, orientation, field of view, and other camera-related parameters. But a camera requires a bit of setup. Let's start by talking a bit about matrices.

A Matrix4x4 is a rectangular grid made up of numbers (elements). Each element within the matrix represents a specific aspect of a transformation. Here's what these aspects refer to:

  • Translation: It represents the movement of an object from one position to another in 3D space. It includes shifting the object along the x, y, and z axes.
  • Rotation: It describes the rotation of an object around different axes (such as x, y, and z). Rotation changes the orientation or facing direction of the object.
  • Scaling: It defines the resizing of an object along the x, y, and z axes. Scaling alters the size of the object without changing its shape or orientation.
  • Projection: It refers to the transformation that maps a 3D object onto a 2D plane (like a computer screen). Projection accounts for perspective and determines how objects appear based on their distance from the viewer.

The elements of the Matrix4x4 are organized in a specific order and arrangement that allows for efficient calculations and transformations. By applying mathematical operations (such as matrix multiplication) between a Matrix4x4 and a 3D point or object, you can efficiently apply these transformations to the point or object.

The numbers in a 4x4 matrix can vary depending on what kind of transformation you want to do (translation, rotation, or scaling). Example:

Translation (movement): If you want to move an object in 3D space, you will modify the last column of the 4x4 matrix. For example, if you want to move an object 5 units to the right (along the X axis), 3 units up (along the Y axis), and 2 units forward (along the Z axis), your 4x4 matrix would look something like this:

text
 1.0  0.0  0.0  5.0
 0.0  1.0  0.0  3.0
 0.0  0.0  1.0  2.0
 0.0  0.0  0.0  1.0

The numbers in the diagonal (from top left to bottom right) are "1.0", and the numbers in the last column represent the amount of movement along each axis.
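One caveat: the layout above follows the common textbook (column-vector) convention. System.Numerics.Matrix4x4, which we use later in this article, multiplies row vectors instead, so the translation ends up in the bottom row (M41, M42, M43) rather than the last column. The factory methods handle this for you, so you rarely need to set elements by hand:

```csharp
using System;
using System.Numerics;

class TranslationDemo
{
	static void Main()
	{
		// Move 5 right, 3 up, 2 forward, matching the example above.
		Matrix4x4 translation = Matrix4x4.CreateTranslation(5.0f, 3.0f, 2.0f);

		// In System.Numerics the offsets live in the bottom row.
		Console.WriteLine((translation.M41, translation.M42, translation.M43)); // (5, 3, 2)

		// Transforming a point at the origin moves it by the offset.
		Vector3 moved = Vector3.Transform(Vector3.Zero, translation);
		Console.WriteLine(moved); // <5, 3, 2>
	}
}
```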

Scaling (resizing): If you want to resize an object, you'll change the numbers in the diagonal. For example, if you want to make an object twice as wide (along the X axis), half as tall (along the Y axis), and keep the depth the same (along the Z axis), your matrix would look like this:

text
 2.0  0.0  0.0  0.0 
 0.0  0.5  0.0  0.0 
 0.0  0.0  1.0  0.0 
 0.0  0.0  0.0  1.0 

The numbers in the diagonal represent the scale factor for each axis.

Rotation: Rotation is a bit more complex because it involves some trigonometry (sine and cosine functions). The numbers to perform rotation will appear in the 3x3 sub-matrix at the top left of the 4x4 matrix. Depending on which axis you're rotating around, different elements of the matrix will be used.

If all this seems complicated, don't worry. We don't normally interact directly with the matrices. Rather, we just tell them what to do, and the numbers do their magic behind the scenes. But it's still good to know what's going on if only on a basic level.
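In practice, "telling them what to do" means calling System.Numerics factory methods and multiplying the results together. With its row-vector convention, transforms apply left to right, so scale * rotation * translation scales first and translates last. A small sketch:

```csharp
using System;
using System.Numerics;

class ComposeDemo
{
	static void Main()
	{
		// Build the three basic transforms without touching raw elements.
		Matrix4x4 scale = Matrix4x4.CreateScale(2.0f, 0.5f, 1.0f);
		Matrix4x4 rotation = Matrix4x4.CreateRotationY((float)Math.PI); // 180 degrees around Y
		Matrix4x4 translation = Matrix4x4.CreateTranslation(5.0f, 3.0f, 2.0f);

		// Left to right: scale, then rotate, then translate.
		Matrix4x4 world = scale * rotation * translation;

		// A point at (1, 1, 1) is scaled to (2, 0.5, 1), rotated 180 degrees
		// about Y to (-2, 0.5, -1), then translated to (3, 3.5, 1),
		// up to floating-point rounding.
		Vector3 result = Vector3.Transform(new Vector3(1, 1, 1), world);
		Console.WriteLine(result);
	}
}
```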

Setting Up The Camera

Unfortunately, Matrix4x4 is an ambiguous type between System.Numerics and Assimp. We want the System.Numerics one, so we can add a using alias directive at the beginning of the file, so that we don't have to fully qualify the type every time we want to use a Matrix4x4.

C#
using Matrix4x4 = System.Numerics.Matrix4x4;

For a basic camera setup, we need three 4x4 matrices.

C#
private Matrix4x4 worldMatrix;
private Matrix4x4 projectionMatrix;
private Matrix4x4 viewMatrix;

World Matrix (or Model Matrix) represents the transformations applied to a 3D model in the world. It can include translations (moving the model around), rotations (changing the model's orientation), and scaling (adjusting the model's size). Even if you don't want to transform the model, you'd typically still use an identity world matrix (which effectively means "no transformation"). Without a world matrix, your model will always be at the origin of the world coordinates and can't be moved, rotated, or scaled.

View Matrix (Camera) represents the camera's position and orientation in the 3D world. It transforms all the world's objects relative to the camera, rather than the world origin. Essentially, it moves the entire world around so that the camera is at the origin. This is equivalent to saying it defines the camera's viewpoint. Without a view matrix, your camera is stuck at the origin and can't be moved or rotated.

Projection Matrix transforms 3D coordinates into 2D coordinates, simulating the way a real camera or the human eye perceives depth, where objects that are farther away appear smaller. This is how we get the illusion of 3D on a 2D screen. Without a projection matrix, you won't get a realistic 3D perspective; everything would appear in orthographic (parallel) projection, with no sense of depth.

First, let's define the projection matrix in CreateResources method.

C#
float aspectRatio = (float)SwapChainCanvas.Width / (float)SwapChainCanvas.Height;
float fov = 90.0f * (float)Math.PI / 180.0f;
float nearPlane = 0.1f;
float farPlane = 100.0f;
projectionMatrix = Matrix4x4.CreatePerspectiveFieldOfView(fov, aspectRatio, nearPlane, farPlane);
  • aspectRatio is the width-to-height ratio of your screen (or viewport, to be more precise). It's simply the width divided by the height. The aspect ratio helps ensure that your 3D scene doesn't get stretched or squished when it's projected onto the 2D screen.
  • fov stands for field of view. It is the extent of the observable scene that is seen at any given moment, or simply, how wide the camera lens is. A larger field of view means you can see more of the scene, but it can also create a "fisheye" lens effect. We're setting it to 90 degrees with a conversion formula radians = degrees * (π / 180).
  • nearPlane and farPlane define the distances from the camera where the 3D scene starts and ends. Anything closer than the nearPlane or further than the farPlane won't be rendered. This is known as "clipping." The chosen values mean that we'll be able to see everything from 0.1 units in front of the camera up to 100 units away.
  • projectionMatrix will be used to transform the 3D coordinates of our scene into 2D coordinates on the screen, taking into account the field of view, aspect ratio, and near/far planes we've defined. This matrix will be used whenever we render the scene, to convert the 3D world into a 2D image that we can display. This is like telling the camera how to see the world.
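Because the projection matrix bakes in the aspect ratio, it should be rebuilt whenever the viewport size changes, or the scene will look stretched after a resize. A sketch of a hypothetical resize handler (the method name and parameters are assumptions, not part of the article's code):

```csharp
public void OnViewportResized(double newWidth, double newHeight)
{
	// Only the projection depends on the viewport size; the view and
	// world matrices don't need to change here.
	float aspectRatio = (float)newWidth / (float)newHeight;
	float fov = 90.0f * (float)Math.PI / 180.0f;
	projectionMatrix = Matrix4x4.CreatePerspectiveFieldOfView(fov, aspectRatio, 0.1f, 100.0f);
}
```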

Next, let's set up the view matrix for our camera.

C#
Vector3 cameraPosition = new Vector3(-1.0f, -1.0f, -5.0f);
Vector3 cameraTarget = new Vector3(0.0f, 0.0f, 0.0f);
Vector3 cameraUp = new Vector3(0.0f, 1.0f, 0.0f);
viewMatrix = Matrix4x4.CreateLookAt(cameraPosition, cameraTarget, cameraUp);
  • cameraPosition: Imagine standing in a large, empty room. This line sets where in the room you, as the camera, are standing. The numbers here (-1.0f, -1.0f, -5.0f) represent the coordinates (x, y, z) of your location. You're standing one step to the left, one step down, and five steps backwards from the center of the room.
  • cameraTarget: Now that you're standing in the room, where are you looking? This line sets that. Here, you're looking straight at the center of the room, the origin (0.0f, 0.0f, 0.0f).
  • cameraUp: Imagine that you're standing in the room wearing a hat. The tip of the hat points straight up, doesn't it? This line says the same thing. It tells the camera which direction is 'up' (in this case, straight up along the y-axis).
  • CreateLookAt(cameraPosition, cameraTarget, cameraUp): Here, we're saying, "Okay, we're standing here, looking there, with our hat pointing up. Let's get a view of the room." This generates a 'view matrix', a fancy set of calculations that helps us to transform the whole 3D scene based on our view.

With the projection matrix and view matrix set up, we can move to updating the vertex shader with a constant buffer in the next article.