Why does rendy.screen_to_world() take a 3D vector?


A screen position in an action passed to on_input() has an x and y coordinate but rendy.screen_to_world() wants a 3D vector. Why is this?

Setting different values of z will give very different results.

For context, I am mapping a cursor sprite to the 3D world using rendy.screen_to_world() with z set to 1000, which works fine. But I also want to do hit-testing, and there I have a problem converting the target sprite positions into the same coordinate space. I thought I could just convert everything into screen space, but that does not seem to work for me, and I wonder if this strange z component is the cause.

All help appreciated. Thank you in advance.

Does screen_to_world work for you? It just fails for me, strange :slight_smile:

It works if I set the screen coordinate z to 1000 (the far clipping plane) but I’m not sure why you need to do that.

The comment in rendy.lua says:

-- The screen position's z component maps to the camera frustum's z component.

I do not understand the comment :frowning:

Doesn’t it mean you have to set the screen coordinate z somewhere between the camera's near and far values?

¯_(ツ)_/¯ I genuinely do not know.

Why do you need to provide a z at all, given that the camera knows what its frustum is? And if the input is really meant to set the world z you want in the output coordinate, that is not clear from either the comment or the API.
After some tests it is clear that the output z is not the same as the input z.

I have extracted the code (and added some comments) to try to understand it more and make the API clearer…

local function screen_to_world(camera, screen_x, screen_y, input_z)
	local function is_within_viewport()
		-- true when the screen position lies inside the camera's viewport rectangle
		return camera.viewport_pixel_x <= screen_x and
		screen_x <= camera.viewport_pixel_x + camera.viewport_pixel_width and
		camera.viewport_pixel_y <= screen_y and
		screen_y <= camera.viewport_pixel_y + camera.viewport_pixel_height
	end

	-- returns nil when the position is outside the viewport
	if not is_within_viewport() then return end
	local inverse_frustum = vmath.inv(camera.frustum)
	-- convert to clip space 
	local clip_x = (screen_x - camera.viewport_pixel_x) / camera.viewport_pixel_width * 2 - 1
	local clip_y = (screen_y - camera.viewport_pixel_y) / camera.viewport_pixel_height * 2 - 1

	-- make both a near and a far position by projecting back to world space with different z's
	local near_world_position = vmath.vector4(inverse_frustum * vmath.vector4(clip_x, clip_y, -1, 1))
	local far_world_position = vmath.vector4(inverse_frustum * vmath.vector4(clip_x, clip_y, 1, 1))

	-- scale it back so w == 1
	near_world_position = near_world_position / near_world_position.w
	far_world_position = far_world_position / far_world_position.w

	-- this is a ratio for lerping
	local frustum_z = (input_z - camera.z_min) / (camera.z_max - camera.z_min)

	-- when we lerp why don't we get back to the input z?
	local world_position = vmath.lerp(frustum_z, near_world_position, far_world_position)

	return vmath.vector3(world_position.x, world_position.y, world_position.z)
end

The question remains. What is the input_z doing? Why doesn’t the lerp function make the output_z the same? Was that the intention?

So I think I understand the function now and I’ve added these comments to my version:

-- input_z should be within the clipping planes of the view frustum
-- (e.g. between 0.1 and 1000 for the common values)
-- this will get used to calculate the output z coordinate which looks inverted but that is
-- because the camera is at 1000 and looking towards 0 so an input_z of 1000 is actually close to 0 in the world
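
To check that reading, here is a minimal sketch in plain Lua that reproduces only the z part of the lerp. It assumes an unrotated camera at z = 1000 looking down its negative z axis with clipping planes at 0.1 and 1000, as described above. world_z_for is a hypothetical helper written for illustration, not part of rendy.

```lua
-- Hypothetical helper (not part of rendy): mirrors only the z part of the
-- lerp in screen_to_world, assuming an unrotated camera that looks down
-- its negative z axis.
local function world_z_for(camera_z, z_min, z_max, input_z)
	-- world z of the points on the near and far clipping planes
	local near_world_z = camera_z - z_min
	local far_world_z = camera_z - z_max

	-- same ratio as frustum_z in the extracted function
	local t = (input_z - z_min) / (z_max - z_min)

	-- same interpolation as vmath.lerp(t, near, far), for z only
	return near_world_z + t * (far_world_z - near_world_z)
end

print(world_z_for(1000, 0.1, 1000, 1000)) -- far plane: world z close to 0
print(world_z_for(1000, 0.1, 1000, 0.1))  -- near plane: world z close to 999.9
print(world_z_for(1000, 0.1, 1000, 500))  -- in between: world z close to 500
```

Under these assumptions the lerp simplifies to camera_z - input_z, so input_z behaves like a distance in front of the camera rather than a world z coordinate. That would explain why the output z never matches the input z unless the camera happens to sit at twice that distance.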