Depth Map, Normal Map

Filed under: Notes — yasunobu13 at 11:18 pm on Wednesday, August 23, 2006

A lot of interesting graphics techniques require the use of depth maps and normal maps. A depth map stores how far each pixel is from the eye in the final rendered scene. Similarly, a normal map stores the surface’s normal vector for the corresponding pixel. One way to build them would be to draw the scene with vertex normals instead of vertex colors and copy the color and depth buffers into a texture. A more straightforward technique uses vertex and pixel shaders and gives us more control over the final maps. Also, if we use GL_EXT_framebuffer_object, we can avoid the nasty context switches that are associated with pBuffers.

The first thing we do is understand the layout of the maps. Really, we’ll only be rendering to a single texture. This has the advantage of saving space, but may cause a problem if we only have 8 bits per channel. If the normal {x, y, z} is stored in {r, g, b} and the depth stored in {a}, then we’ve gone from 32 bit floating point {x} to 8 bit {r}, or 24 bit depth to 8 bit {a}; the same goes for the other components. This loss in precision will show up as banding artifacts, and the error will carry through every computation that later uses the map. But sometimes this is unavoidable on the platform that we’re using.
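To get a feel for how much is lost, here is a small CPU-side sketch of fixed-point storage. The helper names (maxCode, quantize, stepSize) are made up for this illustration and are not part of OpenGL; the only assumption is that a channel with n bits stores round(value * (2^n - 1)).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest code an unsigned fixed-point channel with `bits` bits can hold.
double maxCode(int bits)
{
    assert(bits >= 1 && bits <= 24);
    return static_cast<double>((1u << bits) - 1u);
}

// Store a value from [0,1] the way a fixed-point color or depth channel would.
std::uint32_t quantize(double value, int bits)
{
    const double clamped = std::fmin(std::fmax(value, 0.0), 1.0);
    return static_cast<std::uint32_t>(std::lround(clamped * maxCode(bits)));
}

// Smallest difference the channel can still tell apart.
double stepSize(int bits)
{
    return 1.0 / maxCode(bits);
}
```

With 8 bits, the depths 0.400 and 0.401 land on the same code, while a 24 bit channel keeps them apart. That collapse of nearby values is what shows up as banding.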

How do we fit a normal and depth into a single pixel? Well, we know which color component will hold which value, but we then find out that the GL will clamp our texture values to [0,1]. No negative values for us. Luckily, the depth is already clamped to [0,1] by virtue of the graphics pipeline; the normal is an entirely different problem. It does have an easy solution though. Normals can be normalized (duh), which means each component will lie in [-1,1]. If we normalize, then multiply by one half and add one half, every component lands in [0,1]. We just need to undo the transformation (multiply by two and subtract one) when we reference it later on.

This gives us our GLSL vertex shader:

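varying vec3 normal;
varying float depth;

void main()
{
    gl_Position = ftransform();

    // eye-space normal, interpolated across the triangle
    normal = gl_NormalMatrix * gl_Normal;

    // the perspective divide gives NDC z in [-1,1]; remap it to [0,1] for storage
    float ndcZ = gl_Position.z / gl_Position.w;
    depth = 0.5 * ndcZ + 0.5;
}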

And our fragment shader:

varying vec3 normal;
varying float depth;

void main()
{
vec3 N = normalize(normal);
N = 0.5 * N + 0.5;

gl_FragColor = vec4(N, depth);
}
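The pack and unpack steps for one normal component can be checked on the CPU. This is only a sketch of the arithmetic, assuming an 8 bit channel that rounds to the nearest code; encodeComponent and decodeComponent are names invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map one component of a unit normal from [-1,1] to an 8 bit channel.
std::uint8_t encodeComponent(float c)
{
    assert(c >= -1.0f && c <= 1.0f);
    const float unit = 0.5f * c + 0.5f;                      // [-1,1] -> [0,1]
    return static_cast<std::uint8_t>(std::lround(unit * 255.0f));
}

// Undo the mapping when the texel is read back.
float decodeComponent(std::uint8_t texel)
{
    const float unit = static_cast<float>(texel) / 255.0f;   // [0,255] -> [0,1]
    return 2.0f * unit - 1.0f;                               // [0,1] -> [-1,1]
}
```

A round trip through both functions returns the original component to within about 1/255, which is the price of the 8 bit channel. The decoded vector is usually renormalized before it is used.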

Now, that may not be the absolute best way, but it works on nearly every platform that can use GLSL. One thing I should bring up is that the pixel’s depth value can be read directly in the fragment shader through gl_FragCoord.z. On some machines reading this value used to be very, very slow, but I’ve since seen great improvement. This means we can remove all references to the varying variable depth from both shaders.

Depending on your system, you may have access to floating point textures through GL_ARB_texture_float. There are two very nice things about that extension. First, floating point textures are not clamped by the GL. This means that messing around with the normals can be avoided. Second, we are no longer limited to 8 bits per channel. We now have access to 16 or 32 bits per channel, which will reduce or remove banding issues completely.

Now what do our shaders look like? The vertex shader:

varying vec3 normal;

void main(void) {
gl_Position = ftransform();
normal = gl_NormalMatrix * gl_Normal;
}

Fragment shader:

varying vec3 normal;

void main (void) {
gl_FragColor = vec4(normalize(normal), gl_FragCoord.z);
}

OpenGL Transforms and the Inverse Model View

Filed under: Notes, OpenGL — yasunobu13 at 9:39 am on Monday, August 14, 2006

Below, I mentioned that I stored the inverse modelview matrix in gl_TextureMatrix[0]. Why did I do this? Well, I needed to transform the vertices and vectors from eye space to world space. Unfortunately, OpenGL sets up the Model View matrix to transform from object space to camera space; the inverse of that would skip right over world space and go straight back to object space. Solution? Once the camera transform is set up, invert it and store it in the texture matrix. This way, we can transform something to camera space (which we usually do anyway), then transform it to world space by removing the camera transform. But first, a review of all the transforms OpenGL does.

Object Space to Image Space (Quick and Dirty)

First, how do we move from object space to image space in OpenGL? Well, we start off with a vertex V in object space, simple enough. Then, some transforms are applied to the object. Let’s say it gets rotated, translated, scaled, rotated again and translated once more. Each of these transforms can be described by its own matrix, but for simplicity’s sake, we’ll say that they were all multiplied into a single matrix, M. So, we have V, a vertex in object space, and now V * M, a vertex in world space.

Next, there’s the transformation from world space to camera space. In OpenGL, camera space consists of right = +X, up = +Y, and forward = -Z. This can be done using the command gluLookAt with the camera’s location, point of focus and up vector. We’ll call this matrix C. Moving a vertex from object space to camera space is then V * M * C, seeing the pattern?

Once we are in camera space, we use the projection matrix (specified with functions like glFrustum and gluOrtho2D) to again change the coordinate space. We’ll call this matrix P, and the sequence V * M * C * P will tell us which vertices will be clipped. Once in this space (and after the perspective divide), if the x, y or z coordinate of the vertex is outside [-1,+1], then the vertex is clipped. The [0,+1] depth range only appears afterwards, when the viewport transform maps z into window coordinates.
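The clip test can be sketched in a few lines. This assumes OpenGL's default normalized device coordinates, which span [-1,+1] on all three axes, and a positive w; Vec4, perspectiveDivide and insideViewVolume are names made up for the illustration. Real hardware clips primitives against the planes in clip space before dividing, so treat this as a per-point approximation.

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { double x, y, z, w; };

// Perspective divide: clip space -> normalized device coordinates.
Vec4 perspectiveDivide(const Vec4& clip)
{
    assert(clip.w > 0.0);  // points with w <= 0 are behind the eye
    return { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w, 1.0 };
}

// True when the point lies inside the default OpenGL view volume.
bool insideViewVolume(const Vec4& clip)
{
    const Vec4 ndc = perspectiveDivide(clip);
    return std::fabs(ndc.x) <= 1.0
        && std::fabs(ndc.y) <= 1.0
        && std::fabs(ndc.z) <= 1.0;
}
```

Note that a clip-space x of 2 is outside when w is 1 but inside when w is 4, which is why the divide has to be taken into account.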

So, what do we have? Taking a vertex V in object space, multiplying it by M moves it to world space, multiplying that by C moves it to camera space, and multiplying that by P moves it to projected space. V * M * C * P.

Model View Matrix: Object Space to Camera Space

The matrix created to move a vertex to world space is defined by the various transforms applied to that vertex in OpenGL through functions like glTranslate, glRotate and glScale. Stacking up the transforms creates a single matrix to change to world space. It’s also worth noting that the final matrix has an inverse, and it is very easy to create if you know the transforms that were used to build it. Let’s say you translate something by 10, rotate it by 90 degrees and scale it by 0.5. This results in a 4×4 matrix where the original transforms have been muddled together to do the entire series of transforms at once. The inverse of this matrix has to undo each of those transforms in reverse order. So the inverse can be constructed by scaling by 2.0, rotating by -90 degrees and translating by -10. Easy as pie.
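Here is a quick numerical check of that claim. It is a sketch under a few assumptions of mine: the row-vector convention used in this post (V * M), a translation along X, and a rotation about Z. The Mat4 alias and the helper names are invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Row-vector convention (V * M): the translation sits in the last row.
Mat4 translateX(double t) { Mat4 m = identity(); m[3][0] = t; return m; }

Mat4 rotateZ(double degrees)
{
    const double a = degrees * 3.14159265358979323846 / 180.0;
    Mat4 m = identity();
    m[0][0] = std::cos(a);  m[0][1] = std::sin(a);
    m[1][0] = -std::sin(a); m[1][1] = std::cos(a);
    return m;
}

Mat4 scaleUniform(double s)
{
    assert(s != 0.0);  // a zero scale has no inverse
    Mat4 m = identity();
    m[0][0] = m[1][1] = m[2][2] = s;
    return m;
}

// Translate by 10, rotate by 90, scale by 0.5, applied left to right.
Mat4 forwardTransform()
{
    return multiply(multiply(translateX(10.0), rotateZ(90.0)), scaleUniform(0.5));
}

// Undo the steps in reverse order with the opposite parameters.
Mat4 inverseTransform()
{
    return multiply(multiply(scaleUniform(2.0), rotateZ(-90.0)), translateX(-10.0));
}

bool nearlyIdentity(const Mat4& m, double eps = 1e-9)
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (std::fabs(m[i][j] - (i == j ? 1.0 : 0.0)) > eps) return false;
    return true;
}
```

Multiplying forwardTransform() by inverseTransform() in either order should give the identity up to rounding.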

The camera space transform can be easily specified by the function gluLookAt. The gluLookAt documentation describes how the matrix is created for OpenGL. Here’s a sample implementation using GLSL-like C++ types.

mat4 makeGluLookAt(vec3 eye, vec3 center, vec3 up)
{
	// forward pointing vector
	vec3 f(center - eye);
	f.normalize();
	up.normalize();

	// right pointing vector
	vec3 s(f.cross(up));

	// orthonormal up vector
	vec3 u(s.cross(f));

	s.normalize();
	u.normalize();

	// construct orthonormal orientation transfom
	mat4 Orient(
		s.x, s.y, s.z, 0.0f,
		u.x, u.y, u.z, 0.0f,
		-f.x, -f.y, -f.z, 0.0f,
		0.0f, 0.0f, 0.0f, 1.0f
	);

	// translate the new coordinate system to the origin
	mat4 Translate(
		1.0f, 0.0f, 0.0f, -eye.x,
		0.0f, 1.0f, 0.0f, -eye.y,
		0.0f, 0.0f, 1.0f, -eye.z,
		0.0f, 0.0f, 0.0f, 1.0f
	);

	return Orient * Translate;
}

If you pay close attention to the Orient matrix, you’ll see that the rows represent the right, up and -forward that will correspond to +X, +Y and +Z (remember that the OpenGL camera looks down -Z). This is a standard change of orientation matrix; this will effectively rotate the world around so that the +X, +Y, and +Z axes line up with the right, up and -forward vectors of the camera. The Translate matrix shifts everything by the negative of the camera’s position, so the camera’s location ends up at the origin.
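One way to convince yourself is to skip the matrices and apply the same steps directly: subtract the eye, then take dot products with right, up and -forward. The sketch below does that with a throwaway Vec3 type; all of the names are hypothetical stand-ins, not the types used above.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

Vec3 normalize(const Vec3& v)
{
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    return { v.x / len, v.y / len, v.z / len };
}

struct CameraBasis { Vec3 s, u, f; };  // right, up, forward

CameraBasis makeBasis(const Vec3& eye, const Vec3& center, const Vec3& up)
{
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, up));
    const Vec3 u = cross(s, f);  // already unit length
    return { s, u, f };
}

// Translate by -eye, then project onto right, up and -forward.
Vec3 worldToCamera(const Vec3& p, const Vec3& eye, const CameraBasis& b)
{
    const Vec3 d = sub(p, eye);
    return { dot(d, b.s), dot(d, b.u), -dot(d, b.f) };
}
```

The eye itself should map to the origin, and the point of focus should land on the -Z axis at its distance from the eye.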

One sticky part to note about all of this is that we have been working with row-wise matrices that are intended to be post multiplied. That is, we start with the vertex, then multiply by the transforms, then multiply by the camera matrix. But OpenGL uses pre-multiplied matrices to achieve the same effect (that’s why you clear the model view matrix, then specify the camera, then specify the transforms and finally the vertex last). In order to end up with the same matrices in the end, OpenGL needs to use transposed matrices, or column-wise matrices. This means that our function needs to be reworked. One important point to remember: the transpose of multiplied matrices is the same as multiplying the transposes of the matrices in reverse order, i.e. (A * B)^T = B^T * A^T. Here is the correct OpenGL friendly code.

mat4 makeGluLookAt(vec3 eye, vec3 center, vec3 up)
{
	vec3 f(center - eye);
	f.normalize();
	up.normalize();

	vec3 s(f.cross(up));
	vec3 u(s.cross(f));

	s.normalize();
	u.normalize();

	mat4 Orient(
		s.x, u.x, -f.x, 0.0f,
		s.y, u.y, -f.y, 0.0f,
		s.z, u.z, -f.z, 0.0f,
		0.0f, 0.0f, 0.0f, 1.0f
	);

	mat4 Translate(
		1.0f, 0.0f, 0.0f, 0.0f,
		0.0f, 1.0f, 0.0f, 0.0f,
		0.0f, 0.0f, 1.0f, 0.0f,
		-eye.x, -eye.y, -eye.z, 1.0f
	);

	return Translate * Orient;
}
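The transpose rule behind that rewrite, (A * B)^T = B^T * A^T, is easy to verify numerically. A minimal sketch follows; the Mat4 alias and helpers are my own and stand in for whatever matrix type you use.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 transpose(const Mat4& m)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[j][i] = m[i][j];
    return r;
}
```

Take a translation-like matrix A and a non-uniform scale B. Transposing the product A * B matches B^T * A^T exactly, while A^T * B^T gives a different matrix, which is why the multiplication order flips from Orient * Translate to Translate * Orient.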

It is important to realize that OpenGL’s Model View matrix is actually M * C, so if we want to get the position of the vertex in world space, we have to use the modelview matrix to go to camera space, then multiply by C^-1 (the inverse of the camera matrix). This gives us V * M * C * C^-1 = V * gl_ModelViewMatrix * C^-1 = V * M. And world space is exactly what we need to do shadow mapping, refraction, reflection and a plethora of other things. In fixed-function shadow mapping with the GL_ARB_shadow extension, OpenGL took care of this for us: with glTexGen in eye-linear mode, the eye planes are multiplied by the inverse of the current modelview matrix when they are specified, so we never had to fuss with inverses. But now we’ll need to do an inverse ourselves.

Inverse gluLookAt

Remember that the camera matrix is given to us by gluLookAt, and above, we have the exact implementation of how the matrix is created. If you know your matrices, you may notice some properties that the camera matrix adheres to when being constructed this way. Normally, for a generic matrix, computing the inverse is a slow and painful process. You really don’t want to invert a lot of matrices if your program has to run in real time. Luckily, the camera matrix isn’t just a generic matrix; it has a specific construction with easily invertible properties.

If you paid attention to the construction of the Orient matrix, you would have seen that the columns represent an orthonormal coordinate system. That is, the vectors s, u and -f are all at 90 degree angles to each other and their lengths are equal to 1. The property that this gives us is that the dot product of any two of those vectors equals 0 when the vectors are different. When the vectors are the same, the dot product is the length of the vector squared, or 1.

Now, imagine a matrix multiply as a set of dot products, each between a row of the first matrix and a column of the second matrix. What we want to do is construct the inverse of Orient such that Orient * Orient^-1 equals the identity matrix (0’s everywhere with 1’s on the diagonal), see where I’m going? As it turns out, the inverse of a matrix that represents an orthonormal coordinate system is exactly its transpose.

mat4 Orient(
	s.x, u.x, -f.x, 0.0f,
	s.y, u.y, -f.y, 0.0f,
	s.z, u.z, -f.z, 0.0f,
	0.0f, 0.0f, 0.0f, 1.0f
);

mat4 OrientInverse(
	s.x, s.y, s.z, 0.0f,
	u.x, u.y, u.z, 0.0f,
	-f.x, -f.y, -f.z, 0.0f,
	0.0f, 0.0f, 0.0f, 1.0f
);
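A small check of the transpose-as-inverse property, using only the 3×3 rotation part. This is a sketch with names I made up (Mat3, orientFromBasis, nearlyIdentity); the basis vectors are passed in directly instead of being derived from a camera.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
using Mat3 = std::array<std::array<double, 3>, 3>;

// Columns are s, u and -f, matching the upper-left 3x3 of Orient.
Mat3 orientFromBasis(const Vec3& s, const Vec3& u, const Vec3& f)
{
    return {{ {{ s.x, u.x, -f.x }},
              {{ s.y, u.y, -f.y }},
              {{ s.z, u.z, -f.z }} }};
}

Mat3 transpose(const Mat3& m)
{
    Mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[j][i] = m[i][j];
    return r;
}

Mat3 multiply(const Mat3& a, const Mat3& b)
{
    Mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

bool nearlyIdentity(const Mat3& m, double eps = 1e-9)
{
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            if (std::fabs(m[i][j] - (i == j ? 1.0 : 0.0)) > eps) return false;
    return true;
}
```

With an orthonormal basis the product of the matrix and its transpose is the identity. Stretch one of the vectors and the shortcut stops working, so it only applies when the basis really is orthonormal.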

But the camera matrix also contains a translation that gets multiplied with Orient. We know the inverse of a translation is just a negative translation, so that’s easy. So, if C = Translate * Orient, then C^-1 = (Translate * Orient)^-1 = Orient^-1 * Translate^-1. Hey, we know all those pieces! Let’s put it into OpenGL friendly code.

mat4 makeGluLookAtInverse(vec3 eye, vec3 center, vec3 up)
{
	vec3 f(center - eye);
	f.normalize();
	up.normalize();

	vec3 s(f.cross(up));
	vec3 u(s.cross(f));

	s.normalize();
	u.normalize();

	mat4 OrientInverse(
		s.x, s.y, s.z, 0.0f,
		u.x, u.y, u.z, 0.0f,
		-f.x, -f.y, -f.z, 0.0f,
		0.0f, 0.0f, 0.0f, 1.0f
	);

	mat4 TranslateInverse(
		1.0f, 0.0f, 0.0f, 0.0f,
		0.0f, 1.0f, 0.0f, 0.0f,
		0.0f, 0.0f, 1.0f, 0.0f,
		eye.x, eye.y, eye.z, 1.0f
	);

	return OrientInverse * TranslateInverse;
}

That is probably the most difficult part of doing shadow mapping and other real time techniques (once you get past all of the theory, that is). And look, it’s all wrapped up in a tiny function! Store that into a texture matrix and your shaders have quick and easy access to world space.
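If you want to test the function before trusting it in a shader, multiply the camera matrix by its claimed inverse and look for the identity. The sketch below rebuilds both matrices in the same layout as the code above, using stand-in types (Vec3, Mat4) and helper names of my own.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
using Mat4 = std::array<std::array<double, 4>, 4>;

Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

Vec3 normalize(const Vec3& v)
{
    const double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);
    return { v.x / len, v.y / len, v.z / len };
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Same layout as the OpenGL friendly makeGluLookAt: Translate * Orient.
Mat4 lookAt(const Vec3& eye, const Vec3& center, const Vec3& up)
{
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, up));
    const Vec3 u = cross(s, f);
    const Mat4 orient = {{ {{ s.x, u.x, -f.x, 0.0 }}, {{ s.y, u.y, -f.y, 0.0 }},
                           {{ s.z, u.z, -f.z, 0.0 }}, {{ 0.0, 0.0, 0.0, 1.0 }} }};
    const Mat4 translate = {{ {{ 1.0, 0.0, 0.0, 0.0 }}, {{ 0.0, 1.0, 0.0, 0.0 }},
                              {{ 0.0, 0.0, 1.0, 0.0 }}, {{ -eye.x, -eye.y, -eye.z, 1.0 }} }};
    return multiply(translate, orient);
}

// Same layout as makeGluLookAtInverse: OrientInverse * TranslateInverse.
Mat4 lookAtInverse(const Vec3& eye, const Vec3& center, const Vec3& up)
{
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, up));
    const Vec3 u = cross(s, f);
    const Mat4 orientInv = {{ {{ s.x, s.y, s.z, 0.0 }}, {{ u.x, u.y, u.z, 0.0 }},
                              {{ -f.x, -f.y, -f.z, 0.0 }}, {{ 0.0, 0.0, 0.0, 1.0 }} }};
    const Mat4 translateInv = {{ {{ 1.0, 0.0, 0.0, 0.0 }}, {{ 0.0, 1.0, 0.0, 0.0 }},
                                 {{ 0.0, 0.0, 1.0, 0.0 }}, {{ eye.x, eye.y, eye.z, 1.0 }} }};
    return multiply(orientInv, translateInv);
}

bool nearlyIdentity(const Mat4& m, double eps = 1e-9)
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (std::fabs(m[i][j] - (i == j ? 1.0 : 0.0)) > eps) return false;
    return true;
}
```

For any eye, center and up that are not degenerate (up must not be parallel to the view direction), the product of the two matrices should pass nearlyIdentity.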

Refraction: Part 1

Filed under: OpenGL, Project — yasunobu13 at 8:57 pm on Thursday, August 3, 2006
One Bounce Refraction

So, that engine I was working on: I made quite a bit of progress and decided to try an actual project with it. The project is real time refraction using GLSL. The first step is very quick and easy.

One bounce refraction of an infinite environment.

First, create a skybox in order to create an infinite environment and render it to the screen. Next, enable the environment map (let’s assume it is stored as a texture cube map) and activate the GLSL program. Render the refractive object and disable everything you just enabled. The shaders I used are modified from the Orange Book.

First, the vertex shader. There are two varying vec3s: i stores the vertex position in eye space and n stores the vertex normal in eye space.

varying vec3 i;
varying vec3 n;

void main()
{
vec4 ecPosition = gl_ModelViewMatrix * gl_Vertex;

i = ecPosition.xyz / ecPosition.w;
n = gl_NormalMatrix * gl_Normal;

gl_Position = ftransform();
}

Now, the fragment shader. Using i and n we find the vector that refracts through the surface of the object. This vector is multiplied by gl_TextureMatrix[0], which holds the inverse of the camera transform. This converts the refracted vector from eye space into world space. Using that vector to index into the cube map gives us our final color.

uniform samplerCube texture;
uniform float indexOfRefraction;

varying vec3 i;
varying vec3 n;

void main()
{
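// varyings are read-only in a fragment shader, so normalize into locals
vec3 I = normalize(i);
vec3 N = normalize(n);

vec3 Refracted = refract(I, N, indexOfRefraction);
// w = 0.0 treats Refracted as a direction, so only the rotation applies
Refracted = vec3(gl_TextureMatrix[0] * vec4(Refracted, 0.0));

vec3 refractColor = vec3(textureCube(texture, Refracted));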

gl_FragColor = vec4(refractColor, 1.0);
}
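For reference, GLSL's built-in refract can be mirrored on the CPU, which is handy for sanity-checking an index of refraction before it goes into a uniform. The sketch below follows the formula given in the GLSL specification; the Vec3 type is a stand-in of mine, and both input vectors are assumed to be normalized.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// CPU mirror of GLSL refract(I, N, eta). I and N must be unit length.
// eta is the ratio of the indices of refraction. Returns the zero vector
// on total internal reflection, as the GLSL built-in does.
Vec3 refract(const Vec3& I, const Vec3& N, double eta)
{
    assert(std::fabs(dot(I, I) - 1.0) < 1e-6);
    assert(std::fabs(dot(N, N) - 1.0) < 1e-6);

    const double cosi = dot(N, I);
    const double k = 1.0 - eta * eta * (1.0 - cosi * cosi);
    if (k < 0.0) return { 0.0, 0.0, 0.0 };

    const double scale = eta * cosi + std::sqrt(k);
    return { eta * I.x - scale * N.x,
             eta * I.y - scale * N.y,
             eta * I.z - scale * N.z };
}
```

A ray that hits the surface head on passes straight through regardless of eta, and a ratio above 1 at a grazing angle gives the zero vector, which is the case a shader would normally replace with a reflection.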

If you look at the Orange Book example, you see a few extra features to make the refracting object look more realistic. The first is the Fresnel effect. This is when you view the refracting object at such a glancing angle that you actually see a reflection instead. Next is dispersion, or chromatic aberration. We boil it down to supplying a slightly different index of refraction for each color channel.

The vertex shader stays the same, but the fragment shader changes slightly. We find a different refraction vector for each color channel, and also a reflection vector. Then we look up the colors and mix them together based on the Fresnel factor.

uniform samplerCube texture;
uniform vec4 indexOfRefraction; // {R, G, B, Fresnel}

varying vec3 i;
varying vec3 n;

const float FresnelPower = 5.0;

void main()
{
// varyings are read-only in a fragment shader, so normalize into locals
vec3 I = normalize(i);
vec3 N = normalize(n);

float Ratio = indexOfRefraction.a + (1.0 - indexOfRefraction.a) * pow((1.0 - dot(-I, N)), FresnelPower);

// w = 0.0 below: these are directions, so only the rotation should apply
vec3 RefractR = refract(I, N, indexOfRefraction.r);
RefractR = vec3(gl_TextureMatrix[0] * vec4(RefractR, 0.0));

vec3 RefractG = refract(I, N, indexOfRefraction.g);
RefractG = vec3(gl_TextureMatrix[0] * vec4(RefractG, 0.0));

vec3 RefractB = refract(I, N, indexOfRefraction.b);
RefractB = vec3(gl_TextureMatrix[0] * vec4(RefractB, 0.0));

vec3 Reflect = reflect(I, N);
Reflect = vec3(gl_TextureMatrix[0] * vec4(Reflect, 0.0));

vec3 refractColor, reflectColor;

refractColor.r = vec3(textureCube(texture, RefractR)).r;
refractColor.g = vec3(textureCube(texture, RefractG)).g;
refractColor.b = vec3(textureCube(texture, RefractB)).b;

reflectColor = vec3(textureCube(texture, Reflect));

vec3 color = mix(refractColor, reflectColor, Ratio);

gl_FragColor = vec4(color, 1.0);
}
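The Ratio line is the part worth playing with outside the shader. Below is a CPU sketch of the same expression, with the base reflectance passed in the way indexOfRefraction.a is in the shader; fresnelRatio and mix are names I picked for the example (mix mirrors the GLSL built-in for scalars).

```cpp
#include <cassert>
#include <cmath>

// Same expression as Ratio in the fragment shader.
// f0 is the reflectance when looking straight at the surface,
// cosTheta is dot(-I, N) for unit vectors, so it lies in [0,1]
// for surfaces facing the viewer.
double fresnelRatio(double f0, double cosTheta, double power = 5.0)
{
    assert(cosTheta >= 0.0 && cosTheta <= 1.0);
    return f0 + (1.0 - f0) * std::pow(1.0 - cosTheta, power);
}

// Scalar version of GLSL mix(a, b, t).
double mix(double a, double b, double t)
{
    return a * (1.0 - t) + b * t;
}
```

Looking straight on (cosTheta = 1) gives back f0, so the surface is mostly refractive. At a grazing angle (cosTheta = 0) the ratio reaches 1 and the reflection takes over completely.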