Well . . . I thought I had it working, but it turned out to be a bit more complicated. Now I do actually have it working, after some help from my brother, and here’s the example project:
Link removed, see the next post
The tricky part is the z value. In that Stack Overflow answer, he just says to get it with “glReadPixels” or “manually go from -1 to 1 ( zNear, zFar )”. What he neglects to say is that the -1 to 1 z value is not linear in distance from the camera: with a perspective projection it follows a 1/z-style curve, so most of the range gets used up close to the near plane. So you can’t just figure it out from your camera z pos and near/far settings in a simple way as you would expect. The easiest way is instead to take the points at the cursor x and y on the near and far planes (z = -1 and 1), transform those into world-space with the inverse of the view-projection matrix (dividing each result by its w component), and then lerp between those to the z of your desired world plane (presumably 0, but it can be whatever).
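For anyone who wants to see the near/far trick spelled out, here is a rough standalone sketch in plain Python. It is not the code from the example project. It assumes OpenGL-style conventions (clip = projection * view * world, NDC z from -1 to 1, screen origin at the bottom-left), and every function name in it is made up for illustration.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def invert(m):
    """Invert a 4x4 matrix with Gauss-Jordan elimination and partial pivoting."""
    n = 4
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [rv - f * cv for rv, cv in zip(a[r], a[col])]
    return [row[n:] for row in a]

def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective projection (assumed convention)."""
    f = 1.0 / math.tan(fov_y / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def translation(x, y, z):
    """Translation matrix; a camera at (cx, cy, cz) has view = translation(-cx, -cy, -cz)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def unproject(inv_view_proj, ndc):
    """Take an NDC point back to world space, including the divide by w."""
    x, y, z, w = mat_vec(inv_view_proj, [ndc[0], ndc[1], ndc[2], 1.0])
    return (x / w, y / w, z / w)

def screen_to_world(sx, sy, width, height, view, proj, plane_z=0.0):
    """World-space point under the cursor on the plane z = plane_z."""
    # Screen pixels to NDC in [-1, 1] (origin assumed at the bottom-left).
    ndc_x = 2.0 * sx / width - 1.0
    ndc_y = 2.0 * sy / height - 1.0
    inv = invert(mat_mul(proj, view))
    near = unproject(inv, (ndc_x, ndc_y, -1.0))  # cursor on the near plane
    far = unproject(inv, (ndc_x, ndc_y, 1.0))    # cursor on the far plane
    dz = far[2] - near[2]
    if abs(dz) < 1e-12:
        raise ValueError("cursor ray is parallel to the target plane")
    # Lerp along the near-to-far segment until z hits the plane.
    t = (plane_z - near[2]) / dz
    return tuple(n + t * (f - n) for n, f in zip(near, far))
```

With a camera at (0, 0, 10) looking down -z, a 90 degree vertical FOV and an 800x800 window, `screen_to_world(400, 400, 800, 800, translation(0, 0, -10), perspective(math.radians(90), 1.0, 0.1, 100.0))` should land at roughly the world origin, and the right edge of the window should land at roughly x = 10. The lerp happens in world space, which is why the non-linear depth curve never has to be dealt with directly.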
The awesome part is, with this you can move and rotate the camera however you want and change the window size however you want, and it will still give you the correct world-space point under your screen-space cursor with no extra fuss. The example includes both the screen-to-world and the world-to-screen transforms, plus camera zoom and pan.
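The world-to-screen direction is the simpler of the two, since it needs no inverse. Below is a minimal Python sketch of it, again assuming OpenGL-style conventions (column vectors, NDC from -1 to 1, screen origin at the bottom-left). The function name and the `view_proj` parameter (projection * view, already multiplied together) are hypothetical and not taken from the example project.

```python
def world_to_screen(world, view_proj, width, height):
    """Project a world-space (x, y, z) point to pixel coordinates.

    view_proj is projection * view as a list of four rows (assumed layout).
    """
    point = (world[0], world[1], world[2], 1.0)
    clip = [sum(row[i] * point[i] for i in range(4)) for row in view_proj]
    if clip[3] == 0.0:
        raise ValueError("point lies on the camera plane (w == 0)")
    # Perspective divide takes clip space to NDC in [-1, 1].
    ndc_x = clip[0] / clip[3]
    ndc_y = clip[1] / clip[3]
    # NDC to pixels, using whatever the window size is right now.
    return ((ndc_x + 1.0) * 0.5 * width, (ndc_y + 1.0) * 0.5 * height)
```

Because the window size is passed in on every call, resizing the window only changes the last step, provided the projection matrix is rebuilt with the new aspect ratio.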
The only thing I didn’t get working is the world-to-screen transformation with changing window size and the GUI “Adjust Reference” mode enabled (“Per Node”). I’m not sure what’s going on there. If you set “Adjust Reference” to “Disabled” or just don’t resize the window, it works fine.