Cranberry Lair

Walking the programming parameter space

Diary of a Path Tracer – GGX-Smith and Multiple Importance Sampling — September 13, 2020

Diary of a Path Tracer – GGX-Smith and Multiple Importance Sampling



In our last post, we looked at cranberray’s sampling strategy and it’s implementation of N-Rooks sampling. In this post, we’re going to look at cranberray’s BRDF implementations and multiple importance sampling.

Shading Pipeline

Cranberray’s shading pipeling is very simple. When loading a mesh, ranges of triangles (or submesh) are tagged to use a specific material. When intersecting the geometry, cranberray will use the triangle’s index to lookup which material it should use for said triangle. Each material has an associated shader and appropriate data object. Once we know which material to use, we call the function associated to that shader and light our pixel. The heavy lifting happens inside the shader.

The primary lighting shader in cranberray is shader_microfacet. This function uses Lambert as it’s diffuse BRDF and GGX-Smith as it’s specular BRDF. It selects and blends both using multiple importance sampling.

In this post we’ll be taking a look at the magic behind the shader starting with our Lambert BRDF!

Understanding The Lambertian BRDF

For this post, we’ll assume that there is some familiarity with the fact that the Lambert BRDF is simply the Albedo of the surface divided by \pi.

However, one bit I’d like to touch on is that a Lambertian reflector emits energy at a constant rate in every direction.

At least, that was how I understood a Lambertian reflector. This is incorrect. A Lambertian reflector emits energy at a constant rate in every direction per unit area.

The difference here is that the first definition describes radiant intensity and the second defines radiance. [3]

What that means is that the amount of energy emitted is proportional to the angle at which the surface is viewed. Wikipedia was quite insightful here [1].

If you imagine a rectangular Lambertian reflector of 1 meter on each side and imagine that it reflects 1 unit of radiant intensity.

Notice that in that case, our Lambertian reflector is reflecting L = \frac{1}{1} = 1 units of radiance.

Now, imagine that we’re looking at the reflector at an angle of 60 degrees

Notice that the perceived size of our reflector is now 0.5 meters squared. If our reflector were to emit 1 unit of radiant intensity when viewed at this angle, it would emit L = \frac{1}{0.5} = 2 units of radiance. Twice as much energy per unit area! Meaning that if we looked at it at an angle, it would emit the same amount of energy for less area in our field of view making it brighter.

As a result, you can imagine that when we’re looking to render a surface that looks uniform when viewed from all directions this is problematic.

A Lambertian reflector should reflect constant radiance. As a result, the energy reflected per direction by our material should be proportional to the change in size of our reflector when viewed at an angle.

I’ll save you the trigonometry and tell you that this factor is cos(\theta) where \theta is the zenith angle of our viewer.

As a result, you can see that to emit constant radiance, our reflector’s radiant intensity should be reduced by cos(\theta) and if we do that, our reflector would emit 0.5 units of radiant intensity and as a result would emit L = \frac{0.5}{0.5} = 1 unit of radiance.

Now, like me, you might be thinking “If we need a cosine here to make our radiant intensity proportional to the foreshortening of our area, why don’t we multiply our emitted light by the cosine factor when rendering?”

The reason is that this cosine factor is relevant when we’re trying to calculate radiance from radiant intensity and the area being shaded. However, when we’re rendering, the result of our shading math is radiance not radiant intensity. If we wanted to instead calculate the radiant intensity, we would have to integrate across our surface.

I = \int_A L dA

Taking our previous surface as an example, when we’re shading it in our renderer, we’re asking “Given this irradiance, what is it’s radiance?”. In our previous example, our surface was being shaded with an irradiance of \pi (conveniently skipping over why here) and as a result, our radiance was 1. If we were to ask what the radiant intensity was in any given direction, you would have to multiply the radiance by the projected area of our surface. At an angle of 60 degrees, this would be 0.5, making our radiant intensity I = L * A = 1 * 0.5 = 0.5.

This was a long winded way to attempt at explaining a neat and surprising (to me) property of Lambertian reflectors.

Smith-GGX and microfacet BRFDs

Aside from our Lambert BRDF for diffuse, cranberray makes use of the Smith-GGX specular formulation. This formulation is very common and well documented. I refer the interested reader to [6] for an overview of the topic.

I won’t explain microfacet BRDFs here, but I highly recommend [6] and [9] for a derivation of the specular microfacet BRDF formula, and [8] for more info about the origins of the formula.

As presented in the heading, we’re making use of the Smith Uncorrelated function for our shadowing function and GGX for our NDF. [9]

There’s nothing particularly novel about this portion of the renderer. I recommend the interested reader see some of the linked references for information on implementing a similar shading model.

Importance Sampling

At this point in our rendering, we have a specular BRDF and a diffuse BRDF however we now need to decide on where to sample our ray.

The simplest method for selecting a ray is to uniformly select a point on the hemisphere centered about the normal of our surface.

This works well to get started but can converge very slowly. Especially in scenarios with sharp highlights which often happens with smooth surfaces.

If we sample only a few times, we might miss all the important details.

As a result, we want to make use of something called importance sampling. Simply put, importance sampling is the idea of trying to sample the most “important” parts of a function.

I assume the reader has some familiarity with Monte Carlo estimation method. If not, I suggested going to [15]. Their treatment of importance sampling and Monte Carlo estimators is excellent.

To sample the most important parts of our scene, we want to take samples where we expect the most contributions to our image would live. We unfortunately don’t know exactly where that is, but we can guess. And to guess, we can use an approximate function to choose what direction we want to send our ray.

Let’s imagine our basic Lambert BRDF, we know that rays that are closest to our normal will contribute the most light to our illumination. As a result, we can try to send more rays towards our normal’s direction with some falloff. [16]

To select a ray, we want to convert our PDF to a CDF and randomly select a point on our CDF using a uniform random number. [13] This allows us to select a random number for the specified distribution using a random number with a uniform distribution.

Once we’ve selected our value, we can calculate the PDF and plug it into our Monte Carlo estimator.

A potential downside of selecting an approximate function is that it might not accurately reflect the illumination in the scene, causing us to increase our variance instead of reducing it. [15]

Cranberray uses the BRDF functions as the approximate representation of the illumination in the scene. This is easy to implement and also quite effective since there tends to be quite a few rays that would effectively contribute nothing at all to the image if we sampled uniformly.

For Lambert, I use the derivation from a previous post [16].

And for GGX I use the derivation found here [13].

static cv3 hemisphere_surface_random_lambert(float r1, float r2)
	float theta = acosf(1.0f - 2.0f*r1) * 0.5f;
	float cosTheta = cosf(theta);
	float sinTheta = sinf(theta);
	float phi = cran_tao * r2;

	return (cv3) { sinTheta*cosf(phi), sinTheta*sinf(phi), cosTheta };

static float lambert_pdf(cv3 d, cv3 n)
	return cv3_dot(d, n) * cran_rpi;

static cv3 hemisphere_surface_random_ggx_h(float r1, float r2, float a)
	float cosTheta = sqrtf((1.0f-r1)*cf_rcp(r1*(a*a-1.0f)+1.0f));
	float sinTheta = sqrtf(1.0f - cosTheta*cosTheta);
	float phi = cran_tao * r2;
	return (cv3) { sinTheta*cosf(phi), sinTheta*sinf(phi), cosTheta };

static float ggx_pdf(float roughness, float hdotn, float vdoth)
	float t = hdotn*hdotn*roughness*roughness - (hdotn*hdotn - 1.0f);
	float D = (roughness*roughness)*cf_rcp(t*t)*cran_rpi;
	return D*hdotn*cf_rcp(4.0f*fabsf(vdoth));

Multiple Importance Sampling and blending our BRDFs

With these importance functions, we need a way to select between either our diffuse or our specular BRDF and blending between them. This is where Multiple Importance Sampling comes in.

Multiple Importance Sampling allows us to blend our functions using weights assigned to each PDF. [14]

In cranberray, we use a simple strategy for determining if we select from either our specular BRDF or our diffuse BRDF. We use the Fresnel factor for our geometric normal and select our specular BRDF when our random number is less than the Fresnel factor and our diffuse BRDF in the other case.

float geometricFresnel = cmi_fresnel_schlick(1.0f, microfacetData.refractiveIndex, normal, viewDir);
geometricFresnel = fmaxf(geometricFresnel, microfacetData.specularTint.r * gloss); // Force how much we can reflect at a minimum
float weights[distribution_count] =
	[distribution_lambert] = 1.0f - geometricFresnel,
	[distribution_ggx] = geometricFresnel

bool reflected = random01f(&context->randomSeed) < weights[distribution_ggx];

Once we’ve selected our BRDF and sampled from it, we want to blend our results with our previous sampling results. This is where multiple importance sampling comes in. (See [17] for an excellent treatment on the topic)

Notice that if we imagine our scene as a simplified function.

Let’s assume this function has an area of 1.2 units.

Imagine that our function is a combination of 2 simpler functions.

f(X) = g(X)h(X)

Let’s imagine that we select only one of those functions as the importance function uniformly at random and we select g(x).

To get our value using importance sampling, we need to divide by our PDF and the chance of selecting that PDF as our distribution (Which is 0.5 in this example).

If we were to use this single sample as our estimate using p_g as our PDF and the values g(x) = 0.1, h(x) = 2 and p_g(x) = 0.1 we would get y = \frac{g(x)h(x)}{0.5*p_g(x)} = \frac{0.1*2}{0.5*0.1} = \frac{0.2}{0.05} = 4 which is a poor estimate of our area of 1.2.

If we were to use p_h(x) and where p_h(x) = 1 as our importance function we would get y = \frac{g(x)h(x)}{0.5*p_h(x)} = \frac{0.1*2}{0.5*1} = \frac{0.2}{0.5} = 0.4 which is a closer estimate of our area of 1.2.(Admittedly, this example is a bit contrived)

The issue presented here (and in [17]) is that because our sample location had a low likelihood of being selected (0.1) but our second function h(x) had a relatively high value, our value gets a very high value as well since 2/0.1 is quite large. It’s important to note that if we were to continue taking samples, our estimator would still converge to 1.2.

Multiple Importance Sampling suggests a method to reduce the overall impact of these spikes to reduce the variance of our estimates by using a set of appropriate weighting functions instead of simply using our PDF directly as we just did.

The one sample model presented in [14] is what we’ll be using for this example, as it’s simpler to think about.

Instead of only dividing by our PDF, we want to apply a weighing function to our calculation.

F = \frac{w_I(X)f(X)}{c_Ip_I(X)}

Where I represents the index of our selected PDF, w_I represents the weight assigned to our PDF, c_I represents the probability that we select that PDF and p_I is our PDF function.

In our previous example, we did not have w_I(X). This function does all the magic.

[14] presents the balance heuristic as an option for a weighting function which has the form

w_i(x) = \frac{c_i p_i(x)}{\sum_k c_k p_k(x)}

Notice that this is simply a weighted average of our probability density functions.

If we add this function to our single sample model we get

F = \frac{c_i p_i(x)}{\sum_k c_k p_k(x)} \frac{f(x)}{c_ip_i(x)}

F = \frac{f(X)}{\sum_k c_k p_k(x)}

If we look at our previous example using p_g(x) as our PDF, we can make use of the same values here (notice that our final formulation does not divide by p_g(x) individually anymore.

c_g = 0.5, c_h = 0.5, p_g(x) = 0.1, g(x) = 0.1, p_h(x) = 1 and h(x) = 2

y = \frac{g(x)h(x)}{0.5*p_g(x)+0.5*p_h(x)}

y = \frac{g(x)h(x)}{0.5*p_g(x)+0.5*p_h(x)}

y = \frac{0.1*2}{0.5*0.1+0.5*1}

y = \frac{0.2}{0.55}

y = 0.36

Which although is not as good as our estimate of 0.4 from earlier, it is much better than our estimate of 4 when using p_g(x)

We can see that this is unbiased by noticing that

F(x) = \frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)}

E[F(x)] = c_g p_g(x) \frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)} + c_h p_h(x) \frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)}

E[F(x)] = (c_g p_g(x) + c_h p_h(x)) \frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)}

E[F(x)] = g(x)h(x)

(I took some shortcuts here, I recommend reading [14] for proofs)

An interesting property of the balance heuristic is that our values are bounded.

Notice that since

\frac{g(x)}{c_g p_g(x) + c_h p_h(x)} <= \frac{g(x)}{c_g p_g(x)}


\frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)} <= \frac{g(x)h(x)}{c_g p_g(x)}

as well as

\frac{g(x)h(x)}{c_g p_g(x) + c_h p_h(x)} <= \frac{g(x)h(x)}{c_h p_h(x)}

Where both equations on the right were the formulas that we used originally when sampling either with p_g(x) and p_h(x) (Where c_g=0.5 and c_h = 0.5).

Meaning that our balanced sampling is bounded by the smaller of the 2 values generated by both of our PDFs. Neat! (Note that with uniform sampling, our value would simply be 0.2)

All that to say that this is our weighting code

float sum = 0.00001f;
for (uint32_t i = 0; i < distribution_count; i++)
	sum += weights[i] * PDFs[i];
weight = cf_rcp(sum);

And that’s it! That was a lot of words for a very little chunk of code.


This post is definitely a large one. But it was a lot of fun to learn enough about multiple importance sampling and the Lambertian BRDF to at least convey some sense of it in this blog post. I hope you enjoyed it! Next time we’ll be taking a look at the much simpler Obj loading code in cranberray. Until next time, happy coding!

Future Work

I’d like to make more use of a better importance sampling function for Smith-GGX as seen in [10]. I also wonder if it’s possible to only select from half vectors that reflect on the upper half of the hemisphere instead of requiring us to discard rays that would be generated in the lower half of our sphere.

I would also like to make more effective use of multiple importance sampling in the path tracer as a whole. At this moment, it is only used to combine our lambertian importance function and our GGX-Smith importance function.



















Derivation – Importance Sampling The Cosine Lobe — June 7, 2020

Derivation – Importance Sampling The Cosine Lobe


I’ve recently been diving into the world of importance sampling and I decided to share my derivation for importance sampling the cosine lobe.


When we’re shading a point in path tracing, we typically shoot a ray from our surface in a uniformity random direction contained on a hemisphere centered about our normal. This has the downside of introducing quite a bit of variance into our renders.

Imagine that we have a very bright light that only occupies a very small projected area from our shaded point.


We would be very likely to miss this light with most of our samples and our render could turn out much darker than we would expect.

This is where importance sampling comes in.

If you imagine that your illumination is a polar function


If we were to sample a random variable with a distribution that matches this function, we would be much more likely to hit the important points of our function. (Hence importance sampling)

I won’t dive deeply into this topic, as there are a variety of excellent resources detailing this topic. [1]

The essence of it however, is that you want to find a Probability Density Function (PDF) that matches the shape of your illumination function. Once you’ve define this PDF, you can sample it using the Cumulative Density Function (CDF).


Since our cosine lobe illumination will look like this:


We will use this as the basis to derive our distribution since we’re most likely to get the most light from directions arriving parallel to our normal.

Thankfully, our cosine lobe has an analytical formula that we can use as our PDF.

PDF(\omega) = C*cos(\theta) (1)

Our PDF must integrate to 1, we integrate the PDF across our hemisphere


\int_0^{2\pi}\int_0^{\frac{\pi}{2}}PDF(\omega)sin\theta d\theta d\phi

Plug in (1)

\int_0^{2\pi}\int_0^{\frac{\pi}{2}}C*cos\theta sin\theta d\theta d\phi

C*\int_0^{2\pi}\int_0^{\frac{\pi}{2}}cos\theta sin\theta d\theta d\phi

\int cos\theta sin\theta d\theta = -\frac{1}{4}cos2\theta

\int_0^{\frac{\pi}{2}}cos\theta sin\theta d\theta

-\frac{1}{4}cos\pi+ \frac{1}{4}cos0


\int_0^{\frac{\pi}{2}}cos\theta sin\theta d\theta=\frac{1}{2} (2)

Plug in (2)

C*\int_0^{2\pi}\frac{1}{2} d\phi


C*\int_0^{2\pi}\int_0^{\frac{\pi}{2}}cos\theta sin\theta d\theta d\phi=C*\pi (3)

Since our PDF has to integrate to 1,

\int_0^{2\pi}\int_0^{\frac{\pi}{2}}PDF(\omega)sin\theta d\theta d\phi = 1

Plug in (3),


C=\frac{1}{pi} (4)

Finally, plug in (4) into our PDF,

PDF(\omega) = \frac{cos(\theta)}{\pi} (5)

Now that we have our PDF, we can calculate our PDF in terms of \theta and \phi.

PDF(\theta,\phi)d\theta d\phi = PDF(\omega)d\omega

PDF(\theta,\phi)d\theta d\phi = PDF(\omega)sin\theta d\theta d\phi


PDF(\theta,\phi)=\frac{cos\theta sin\theta}{\pi} (6)

Now we integrate with respect to \phi to get PDF(\theta)

\int_0^{2\pi}\frac{cos\theta sin\theta}{\pi}d\phi = 2cos\theta sin\theta

PDF(\theta)=2cos\theta sin\theta

And then to get PDF(\phi),


\frac{cos\theta sin\theta}{2cos\theta sin\theta \pi}=\frac{1}{2\pi}


Now we want to calculate the CDF of each function,

CDF(\theta)=\int_0^\theta PDF(\theta) d\theta

CDF(\theta)=\int_0^\theta 2cos(\theta)sin(\theta) d\theta

CDF(\theta)=\int_0^\theta sin(2\theta) d\theta


CDF(\phi)=\int_0^\phi PDF(\phi) d\phi

CDF(\phi)=\int_0^\phi\frac{1}{2\pi} d\phi


Now we want to invert our CDF to sample it using our random variable y,





\frac{cos^{-1}(1-2y)}{2}=\theta (7)

For \phi,



y*2\pi = \phi (8)

Now we have our CDFs and our PDFs, we can finally calculate our direction.

In pseudo code you can simply do:



With these directions, you can now sample your scene:

\frac{SampleScene(SphericalTo3D(\theta, \phi))}{PDF(\omega)}

Plug in (5)

\frac{SampleScene(SphericalTo3D(\theta, \phi))\pi}{cos\theta}


That’s it! These formulas will sample the hemisphere where it is receiving more light defined by the cosine lobe. The results are pretty awesome.