Camera doesn’t exist in the 3D world. It’s just our perception. We fake that perception by creating a matrix.
What?
In this post, I will derive the gluLookAt function, which is used to create a view matrix that positions and orients a camera in 3D space. This is one of the most important transformations in 3D graphics.
Understanding the gluLookAt Function
The gluLookAt function takes three parameters:
eye
: The position of the camera (where we’re looking from)target
: The point we want to look atup
: The up vector that defines the camera’s orientation
Our goal is to create a matrix that transforms world coordinates into camera coordinates, where:
- The camera is at the origin
- The camera is looking down the negative z-axis
- The up direction is along the positive y-axis
Step 1: Creating the Camera Basis Vectors
First, we need to create three orthogonal vectors that define our camera’s coordinate system:
-
Forward Vector (z-axis): Points from the camera to the target \(\vec{f} = \text{normalize}(\text{target} - \text{eye})\)
-
Right Vector (x-axis): Perpendicular to both forward and up vectors \(\vec{r} = \text{normalize}(\vec{f} \times \text{up})\)
-
Up Vector (y-axis): Perpendicular to both forward and right vectors \(\vec{u} = \vec{r} \times \vec{f}\)
Why do we need to create a new up vector? Even though we’re given an up vector as input, we need to create a new one because:
- The input up vector might not be perfectly perpendicular to the forward vector
- We need an orthonormal basis (three mutually perpendicular unit vectors) for our camera coordinate system
- The new up vector ensures our camera’s coordinate system is properly oriented and doesn’t have any skewing or shearing
Why don’t we recompute the forward vector? The forward and up vectors serve different purposes:
- The forward vector is sacred: It defines what the camera is looking at. We compute it directly from
target - eye
because this is the fundamental purpose of the LookAt function. - The up vector is flexible: It’s just a hint for the camera’s tilt. We adjust it to be orthogonal to forward because its exact direction isn’t as critical as maintaining the correct view direction.
This is why we:
- Keep the forward vector as is (from target to eye)
- Use the given up vector only to compute the right vector
- Recompute the final up vector to ensure orthogonality
Step 2: Building the View Matrix
The view matrix needs to:
- Rotate the world so that the camera’s basis vectors align with the world axes
- Translate the world so that the camera is at the origin
The Camera Illusion In 3D graphics, there is no actual camera moving around the scene. Instead, we create the illusion of a camera by moving the entire world in the opposite direction. This is why:
- We multiply the view matrix first in the transformation pipeline
- We move the world by
-eye
instead of moving the camera byeye
- We rotate the world to align with our desired camera orientation
The rotation part of the matrix is constructed using the camera’s basis vectors:
\[R = \begin{bmatrix} r_x & r_y & r_z & 0 \\ u_x & u_y & u_z & 0 \\ -f_x & -f_y & -f_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]The translation part moves the camera to the origin:
\[T = \begin{bmatrix} 1 & 0 & 0 & -eye_x \\ 0 & 1 & 0 & -eye_y \\ 0 & 0 & 1 & -eye_z \\ 0 & 0 & 0 & 1 \end{bmatrix}\]The final view matrix is the product of these two matrices:
\[\text{View} = R \times T\]Step 3: Combining Everything
Putting it all together, the LookAt function creates a matrix that:
- Translates the world by
-eye
to move the camera to the origin - Rotates the world using the camera’s basis vectors to align the camera with the world axes
The final matrix looks like this:
\[\text{View} = \begin{bmatrix} r_x & r_y & r_z & -\vec{r} \cdot \text{eye} \\ u_x & u_y & u_z & -\vec{u} \cdot \text{eye} \\ -f_x & -f_y & -f_z & \vec{f} \cdot \text{eye} \\ 0 & 0 & 0 & 1 \end{bmatrix}\]Why This Works
This matrix transforms any point in world space to camera space by:
- Moving the camera to the origin
- Aligning the camera’s view direction with the negative z-axis
- Aligning the camera’s up direction with the positive y-axis
When we multiply this matrix with any point in world space, we get its coordinates relative to the camera, which is exactly what we need for rendering.
The Transformation Pipeline The view matrix is typically the first transformation applied in the rendering pipeline:
- View Matrix: Moves and rotates the entire world to create the camera illusion
- Projection Matrix: Projects 3D coordinates onto a 2D plane
- Viewport Transform: Maps the projected coordinates to screen space
This order is crucial because:
- The view matrix affects everything in the scene
- Subsequent transformations build upon this camera-space coordinate system
- It’s more efficient to transform the world once than to transform every object individually
Implementation Notes
In practice, the LookAt function is often implemented as:
mat4 lookAt(vec3 eye, vec3 target, vec3 up) {
vec3 f = normalize(target - eye);
vec3 r = normalize(cross(f, up));
vec3 u = cross(r, f);
mat4 view = {
r.x, r.y, r.z, -dot(r, eye),
u.x, u.y, u.z, -dot(u, eye),
-f.x, -f.y, -f.z, dot(f, eye),
0, 0, 0, 1
};
return view;
}
This implementation efficiently combines the rotation and translation into a single matrix, which is what makes the LookAt function so powerful and widely used in 3D graphics.