Premise

I have recently been attempting to improve my understanding of perspective projection. This included a variety of topics, such as deriving a perspective projection matrix and understanding how vertex shader outputs are interpolated in perspective. However, one topic that evaded me was the surprising result that depth is interpolated as 1/z instead of z.

Before we jump in, I’m assuming the reader is familiar with the result of perspective projection in 3D graphics programming, frustums and some trigonometry.¹

Let’s take a look at a simple perspective projection matrix to understand what I mean. This is the perspective projection matrix we will be using for this blog post:

\begin{matrix}  2*n/(r - l)&0&0&0 \\  0&-2*n/(t-b)&0&0 \\  0&0&f/(f-n)&-n*f/(f - n) \\  0&0&1&0  \end{matrix}
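If you find code easier to read than matrices, here is a minimal sketch of how this matrix might be built (the function name make_projection and the row-major nested-list layout are my own choices; l, r, t, b, n and f are the left/right/top/bottom/near/far values of the frustum):

```python
def make_projection(l, r, t, b, n, f):
    """Build the perspective projection matrix used in this post.

    l/r/t/b describe the extents of the near plane, n/f the near and far planes.
    Multiplying this matrix by a column vector (x, y, z, w) gives the four
    equations we look at next.
    """
    return [
        [2 * n / (r - l), 0.0,              0.0,         0.0],
        [0.0,             -2 * n / (t - b), 0.0,         0.0],
        [0.0,             0.0,              f / (f - n), -n * f / (f - n)],
        [0.0,             0.0,              1.0,         0.0],
    ]
```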

To understand the perspective projection matrix, we’ll look at it as a collection of equations instead of simply looking at it in matrix form. I find that this clarifies what the matrix actually means.

Our equations have the form:

x_{0} = (2*n/(r-l) * x) + (0*y) + (0*z) + (0*w) \\  y_{0} = (0*x) + (-2*n/(t-b) * y) + (0*z) + (0*w) \\  z_{0} = (0*x) + (0*y) + (f/(f-n) * z) + (-n*f/(f-n) * w) \\  w_{0} = (0*x) + (0*y) + (1*z) + (0*w)

For the purpose of this post, we will hand-wave the equations for x and y and say that they simply squish the frustum into a box.²

Let’s focus on the equation for z.

z_{0} = (f/(f-n)*z) + (-n*f/(f-n)*w)

Once we put it through perspective division (dividing by w, which our matrix has set equal to z), our equation becomes

z_{1} = (f/(f-n)) + (-n*f/(f-n) * w/z)
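To get a feel for what that expression does, here is a quick sketch that evaluates it for a few view-space depths (the near and far values n = 1 and f = 100 are arbitrary numbers I picked for illustration):

```python
n, f = 1.0, 100.0  # example near/far planes, picked only for illustration

def ndc_depth(z, w=1.0):
    # z_1 = f/(f-n) + (-n*f/(f-n)) * w/z : an affine function of 1/z, not of z
    return f / (f - n) + (-n * f / (f - n)) * w / z

for z in (1.0, 2.0, 10.0, 50.0, 100.0):
    print(f"view-space z = {z:5.1f}  ->  depth = {ndc_depth(z):.4f}")

# Approximate output:
#   view-space z =   1.0  ->  depth = 0.0000
#   view-space z =   2.0  ->  depth = 0.5051
#   view-space z =  10.0  ->  depth = 0.9091
#   view-space z =  50.0  ->  depth = 0.9899
#   view-space z = 100.0  ->  depth = 1.0000
```

Note how half of the depth range is already used up between z = 1 and z = 2; the stored depth follows 1/z, not z.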

The main takeaway is that our depth is now a function of 1/z rather than of z. Why would we want that?

What is the problem?

If you’re like me, when you reach this result you might think it would be simpler to avoid it somehow. After all, our depth is linear, isn’t it?

Well, not really.

When digging to find out more about 1/z, I ran into a variety of answers as to why linear depth isn’t desirable.

  • You want more precision nearer to the camera, and 1/z provides that property. (Note that this isn’t actually as desirable as it sounds: 1/z loses precision very quickly,³ and common wisdom is to reverse your depth and use a floating-point depth buffer.⁴ The sketch after this list gives a feel for the fall-off.)
  • You don’t want to output linear z as a vertex output. Linear z can’t be linearly interpolated in screen space, resulting in incorrect depth. 1/z is linear in screen space; z is not.
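Here is a rough sketch of that precision fall-off (the near/far values and the assumption of a 24-bit normalised depth buffer are mine, picked purely for illustration):

```python
n, f = 0.1, 1000.0        # example near/far planes, picked for illustration
coef = n * f / (f - n)    # the n*f/(f-n) coefficient from our matrix
step = 1.0 / (2**24 - 1)  # smallest step a 24-bit normalised depth buffer can store

# depth(z) = f/(f-n) - coef/z, so d(depth)/dz = coef/z**2, and the smallest
# change in view-space z that still changes the stored depth value is roughly:
def resolvable_dz(z):
    return step * z * z / coef

for z in (0.1, 1.0, 10.0, 100.0, 1000.0):
    print(f"z = {z:7.1f}  ->  smallest resolvable dz ~ {resolvable_dz(z):.3e}")

# With these numbers, two points almost 0.6 units apart can land on the same
# depth value near the far plane, while precision near the camera is tiny.
```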

The second point is the core of this blog post.

Why is 1/z linear in screen space but not z? By the end of this post I will attempt to provide an intuitive perspective (heh) on why this is true.

There were a variety of resources (listed at the end of this post) that gave me bits and pieces of understanding, and I wouldn’t have gotten there without them.

If, like me, they help you get a vague sense that 1/z is linear in screen space but you still don’t really understand why, then this post is (hopefully) for you.

If you’re looking for a mathematical proof of this property, I recommend Scratchapixel’s excellent series here.

Reaching intuition

I introduce to you the simple grid.

[Figure: EmptyGrid]
Top-down screen space view after perspective projection. x is horizontal, z is depth.

This is our frustum after perspective projection. We will only be looking at our problem in 2D from a top-down perspective with our x axis mapping to our horizontal axis and our z axis shooting into the screen.

Our first insight will come from assuming that the property we want to understand is true: 1/z is linear in screen space. We’re going to add a few landmarks to our grid.

[Figure: LandmarkGrid]

We can see that if we join these new points, we get a straight line.

[Figure: LineGrid]

We can all agree that this line is linear in x and z.

If we then take this grid and transform it into our space before perspective projection, we get our original frustum.

[Figure: PointsFrustum]
Top-down world space view before perspective projection. x is horizontal, z is depth.

If we join our points we might notice something.

[Figure: LineFrustum]

Our points are no longer linearly related! If we introduce more and more points, our line approaches a curve.

This curve looks a lot like 1/z!
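You can reproduce this curve numerically. Here is a small sketch that walks in equal steps along a straight line in our screen-space grid and maps each point back into the frustum (the near/far and left/right values are arbitrary, and the unproject helper simply inverts the x and z rows of our matrix):

```python
n, f = 1.0, 100.0   # example near/far planes
l, r = -1.0, 1.0    # example left/right extents of the near plane

def unproject(x_ndc, z_ndc):
    """Map a screen-space point back to top-down view space (x, z).

    Inverts z_ndc = f/(f-n) - n*f/((f-n)*z) and x_ndc = (2*n/(r-l)) * x/z.
    """
    z = n * f / (f - z_ndc * (f - n))
    x = x_ndc * z * (r - l) / (2 * n)
    return x, z

# Walk in equal steps along a straight line in screen space...
for i in range(6):
    t = i / 5
    x_ndc = -1.0 + 2.0 * t   # x goes linearly from -1 to 1
    z_ndc = t                # depth goes linearly from 0 (near) to 1 (far)
    x, z = unproject(x_ndc, z_ndc)
    print(f"screen ({x_ndc:+.2f}, {z_ndc:.2f})  ->  view ({x:+8.2f}, {z:7.2f})")

# ...and the equal screen-space steps come out as z ~ 1.00, 1.25, 1.66, 2.46,
# 4.81, 100.00 in view space: the 1/z-shaped curve from the figure above.
```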

Can we now show that this is a property of z only and not of x? We can check by drawing a line that varies only in x and a line that varies only in z on our grid, then transforming them back into our original frustum.

To show the reverse direction, we will draw landmarks in our frustum and transform them into our grid.

As you can see, our points are no longer connected linearly. This is correct: our z is indeed being curved. But the GPU can’t interpolate between the two end points if we keep z as it is (interpolating a curve isn’t as easy as interpolating a line!). Instead, we want to use 1/z, which is linear in screen space (as we’ve seen earlier), interpolate that, and then convert it back to z afterwards. This allows the GPU to linearly interpolate the two depth values with the same mathematics it uses to linearly interpolate the x value.

Notice what happens if the GPU takes our 1/z value, interpolates it linearly and then converts it to z after our interpolation.

The intersection remains the same, which is an essential property for our depth occlusion algorithm.

Now notice what happens if the GPU takes our z value and interpolates it linearly.

Yikes… The intersection is completely different. We would end up seeing far more of both lines than we should after our depth occlusion!

[Figure: OcclusionIncorrectWithCorrectGrid]
1/z interpolated linearly and converted back to z after interpolation (blue) vs. direct linear interpolation of z (purple).
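To make the difference concrete, here is a small numerical sketch of the same effect (the segment endpoints and the near/left/right values are arbitrary numbers I picked; project_x mirrors the x row of our matrix followed by the perspective divide):

```python
n = 1.0
l, r = -1.0, 1.0

def project_x(x, z):
    # Screen-space x after perspective division (top-down view, as in the figures).
    return (2 * n / (r - l)) * x / z

# A segment in view space (x, z), like one of the lines in the figures above.
ax, az = -1.0, 1.0
bx, bz = 1.0, 10.0
sx_a, sx_b = project_x(ax, az), project_x(bx, bz)

# Look at the screen-space midpoint between the two projected endpoints.
t = 0.5
sx = sx_a + t * (sx_b - sx_a)

# Perspective-correct: interpolate 1/z linearly, then convert back to z.
correct_z = 1.0 / ((1 - t) / az + t / bz)

# Naive: interpolate z itself linearly in screen space.
naive_z = (1 - t) * az + t * bz

# Ground truth: find the point on the view-space segment that actually projects
# to sx and read off its z (solve project_x(a + s*(b - a)) == sx for s).
k = 2 * n / (r - l)
s = (sx * az - k * ax) / (k * (bx - ax) - sx * (bz - az))
true_z = az + s * (bz - az)

print(f"true depth at the screen midpoint : {true_z:.4f}")     # ~1.8182
print(f"interpolating 1/z, then inverting : {correct_z:.4f}")  # ~1.8182 (matches)
print(f"interpolating z directly          : {naive_z:.4f}")    # 5.5000 (way off)
```

Interpolating 1/z recovers exactly the depth of the point that actually projects to that pixel, while interpolating z directly lands somewhere else entirely, which is why the purple intersection in the figure drifts.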

Conclusion

Hopefully, at this point it’s a little clearer why 1/z is linear in screen space and why this is a desirable property. I recommend you play around with these transformations and grasp why perspective projection behaves as it does. Grab a piece of paper, draw some landmarks, move them back and forth between the 2 spaces and connect the dots, it’s a lot of fun and an excellent way to gain an even deeper intuition of this property.

For a more thorough breakdown of the properties of the depth buffer and vertex output interpolation, I recommend reading the links in the Resources section. They provide excellent explanations of the mathematics and reasoning behind a lot of these properties.

I would love to hear how others reason about this themselves. It was fun to figure this out, and I hope to be able to write more of these blog posts in the future.

Footnotes

¹ For more information see https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/projection-matrix-introduction

² See https://www.shadertoy.com/view/3lSXWW to play around with the transformation in 2 dimensions.

³ See https://www.sjbaker.org/steve/omniv/love_your_z_buffer.html and play around with the calculator to gain an appreciation for how quickly precision falls off.

⁴ See https://outerra.blogspot.com/2012/11/maximizing-depth-buffer-range-and.html for an excellent overview of depth buffer precision and how to manage it.

Resources