Raycasting
A mouse position is not just a single point in the scene. Many 3D positions project onto the same pixel, and the mouse could be over any of them. To figure out which object the mouse is over, we must turn the mouse position not into a single 3D position but into a ray or line segment that passes through all of those positions. Then we must walk along this ray and determine which objects it intersects. This technique is called raycasting.
Explore the idea of shooting a ray through the mouse into the scene by clicking on this renderer:
Each time you click, a cylinder is added to the scene. It starts at the position on the near clipping plane that projects to the mouse cursor. It ends at the position on the far clipping plane that projects to the mouse cursor. Rotate the scene to see the complete cylinders.
The mouse listener in this renderer walks the mouse position backward through the transformation pipeline by inverting the transformations. It first turns the mouse's pixel space coordinates into normalized device coordinates:
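function onMouseUp(event: MouseEvent) {
  // Flip y so that it grows upward from the bottom of the canvas.
  const mousePixel = new Vector2(
    event.clientX,
    canvas.height - event.clientY
  );
  // Map the pixel coordinates into [-1, 1] and place the mouse on the near plane.
  const mouseNormalized = new Vector4(
    mousePixel.x / canvas.width * 2 - 1,
    mousePixel.y / canvas.height * 2 - 1,
    -1,
    1,
  );
  // ...
}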
The mouse doesn't have a z-coordinate of its own. This code sets its z-coordinate to -1, which projects the mouse onto the near clipping plane of the viewing frustum. The homogeneous coordinate is set to 1 in order to treat the mouse coordinates as a position rather than a vector.
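The mapping from pixel space to normalized space can be tried on its own, away from the renderer. The following is a minimal sketch; the helper name pixelToNormalized and the plain tuple type are hypothetical stand-ins for the renderer's Vector2 and Vector4 classes, and the 800×600 canvas size is just an example.

```typescript
// Hypothetical helper for illustration; not part of the renderer's source.
type Vec4 = [number, number, number, number];

function pixelToNormalized(
  pixelX: number,
  pixelY: number, // measured downward from the top edge, as mouse events report it
  width: number,
  height: number,
  z: number = -1, // -1 lands on the near plane, 1 on the far plane
): Vec4 {
  // Flip y so that it grows upward, as it does in normalized space.
  const flippedY = height - pixelY;
  return [
    pixelX / width * 2 - 1,
    flippedY / height * 2 - 1,
    z,
    1, // a homogeneous coordinate of 1 marks this as a position
  ];
}

// The middle of an 800x600 canvas maps to the middle of normalized space.
const center = pixelToNormalized(400, 300, 800, 600);
// The top-left pixel corner maps to x = -1, y = 1.
const topLeft = pixelToNormalized(0, 0, 800, 600);
console.log(center, topLeft);
```

Passing 1 for z instead of the default gives the matching position on the far clipping plane.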
Next the normalized position is carried back into eye space and then into world space by the inverse matrices eyeFromClip and worldFromEye:
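function onMouseUp(event: MouseEvent) {
  // ...
  // Undo the projection, then divide by w to get a proper eye space position.
  let mouseEye = eyeFromClip.multiplyVector(mouseNormalized);
  mouseEye = mouseEye.scalarDivide(mouseEye.w);
  // Undo the camera transformation to reach world space.
  let mouseWorld = worldFromEye.multiplyVector(mouseEye);
  let rayStart = mouseWorld;
  // ...
}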
Remember how we skipped over clip space and the perspective divide earlier? To compensate for that shortcut, we must divide the eye space position by its homogeneous coordinate after applying eyeFromClip.
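To see why the divide is needed, it helps to run the numbers. The sketch below assumes an OpenGL-style symmetric perspective projection, which may differ from the one this renderer uses. The renderer's matrix class is not shown here, so plain arrays stand in for it, and the helpers multiply, inversePerspective, and unproject are hypothetical names.

```typescript
type Vec4 = [number, number, number, number];
type Mat4 = number[][]; // row-major: m[row][column]

function multiply(m: Mat4, v: Vec4): Vec4 {
  const out: Vec4 = [0, 0, 0, 0];
  for (let r = 0; r < 4; ++r) {
    for (let c = 0; c < 4; ++c) {
      out[r] += m[r][c] * v[c];
    }
  }
  return out;
}

// Closed-form inverse of a symmetric OpenGL-style perspective matrix
// (an assumption; other projection conventions have different inverses).
function inversePerspective(fovDegrees: number, aspect: number, near: number, far: number): Mat4 {
  const halfHeight = Math.tan(fovDegrees * Math.PI / 360); // tangent of half the vertical field of view
  const c = -(far + near) / (far - near);
  const d = -2 * far * near / (far - near);
  return [
    [aspect * halfHeight, 0, 0, 0],
    [0, halfHeight, 0, 0],
    [0, 0, 0, -1],
    [0, 0, 1 / d, c / d],
  ];
}

// Apply the inverse projection, then divide by w to land in eye space.
function unproject(eyeFromClip: Mat4, normalized: Vec4): Vec4 {
  const h = multiply(eyeFromClip, normalized);
  return [h[0] / h[3], h[1] / h[3], h[2] / h[3], 1];
}

const eyeFromClip = inversePerspective(90, 1, 1, 100);
const undivided = multiply(eyeFromClip, [0, 0, 1, 1]); // w is about 0.01, not 1
const nearPoint = unproject(eyeFromClip, [0, 0, -1, 1]);
const farPoint = unproject(eyeFromClip, [0, 0, 1, 1]);
console.log(undivided, nearPoint, farPoint);
```

With a near distance of 1 and a far distance of 100, the undivided far position has a z-coordinate of -1 and a homogeneous coordinate near 0.01. Only after the divide does it sit on the far plane, 100 units in front of the eye along the negative z-axis.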
The ray starts at rayStart. It heads to the rayEnd position on the far clipping plane that projects to the mouse position. We find this position just as we did rayStart. The only difference is that the z-coordinate of the normalized position is 1:
function onMouseUp(event: MouseEvent) {
  // ...
  // Move the normalized position to the far plane and untransform it again.
  mouseNormalized.z = 1;
  mouseEye = eyeFromClip.multiplyVector(mouseNormalized);
  mouseEye = mouseEye.scalarDivide(mouseEye.w);
  mouseWorld = worldFromEye.multiplyVector(mouseEye);
  let rayEnd = mouseWorld;
  // add in a cylinder spanning rayStart to rayEnd
}
This renderer fits a cylinder between the rayStart and rayEnd positions. Some applications might use the two positions to figure out what object is being clicked on. The two positions are used to form a ray, and then objects in the scene are tested to see if they intersect the ray. Other applications might use the ray to direct a projectile launched by the user.
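As an illustration of the intersection test, here is a minimal sketch that checks the segment from rayStart to rayEnd against a sphere. Spheres are chosen only because their test is short, and real scenes often test bounding boxes or triangles instead. The function intersectSegmentSphere and the tuple type Vec3 are hypothetical and not part of this renderer.

```typescript
type Vec3 = [number, number, number];

const subtract = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// Returns the smallest fraction t in [0, 1] at which the segment crosses the
// sphere's surface, or null if the segment never touches the sphere.
function intersectSegmentSphere(
  rayStart: Vec3,
  rayEnd: Vec3,
  sphereCenter: Vec3,
  radius: number,
): number | null {
  const direction = subtract(rayEnd, rayStart);
  const toStart = subtract(rayStart, sphereCenter);

  // Solve |toStart + t * direction|^2 = radius^2, a quadratic in t.
  const a = dot(direction, direction);
  const b = 2 * dot(toStart, direction);
  const c = dot(toStart, toStart) - radius * radius;
  const discriminant = b * b - 4 * a * c;
  if (a === 0 || discriminant < 0) return null;

  const root = Math.sqrt(discriminant);
  const tNear = (-b - root) / (2 * a);
  const tFar = (-b + root) / (2 * a);
  if (tNear >= 0 && tNear <= 1) return tNear;
  if (tFar >= 0 && tFar <= 1) return tFar; // the segment starts inside the sphere
  return null;
}

// A segment heading down the negative z-axis hits a unit sphere centered 5 units away.
const hit = intersectSegmentSphere([0, 0, 0], [0, 0, -10], [0, 0, -5], 1);
// The same segment misses a sphere that sits off to the side.
const miss = intersectSegmentSphere([0, 0, 0], [0, 0, -10], [5, 0, -5], 1);
console.log(hit, miss);
```

To pick the object under the mouse, an application could run a test like this against every candidate object and keep the one with the smallest t, since that object is nearest to the viewer along the ray.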