This post is a request for better documentation of the mouse position data in input events. Even after reading the manual, the API reference and many examples, I still have questions about how to do real things correctly. So I think the manual should be rewritten on this subject, to cover real use cases precisely. I also think almost all of the defold-examples are misleading on this subject.
I don’t want to be perceived as just whining about it: I wrote this post after a lot of reading, code study, and experiments. Now I need somebody on the Defold team to say definitively how I should do things, and hopefully put that in the manual for everybody.
I also see many, many questions about this on the forum, so I thought it was time to resolve it once and for all.
So I read the pages Device input in Defold and Mouse and touch input in Defold again and again. Then I had to search the API to find the page API reference (Game object) for game objects and the page API reference (GUI) for GUI.
My first complaint is that the manual has almost no direct links to the API reference. I mean links in context that point to the right function, not just “see go namespace”. For example, to find the definition of the “action” properties, there is only one link, in the input overview, about three quarters of the way down the page, where it discusses the on_input function.
There is nothing on the other pages about specific input events (or in the first table where Actions are mentioned). So if you miss that first link, you ask yourself where the hell the documentation of the action data is. Then you must search for “action” in the API and try to find the right page. This is not user friendly, especially for newcomers.
Now about action data for mouse/touch events, here are the reasons why I think the manual and the examples are misleading:
When in a game object context
The manual and all the examples use action.x and action.y instead of action.screen_x and action.screen_y plus a screen_to_world conversion. I know the manual briefly mentions screen_to_world conversion, but only as an optional thing for some complex cases. But:
- From my experience, I don’t know when we would use action.x and action.y in this context, because you almost always have a camera and need to take screen/window scaling into account, so we should always use screen_x and screen_y. Even without an explicit camera, there is scaling to take into account.
- So these examples ONLY work because the game area is not scaled and there is no camera, that is, in an environment controlled to make them work. On many real player devices, coding like that won’t give a good result.
- So the manual teaches a misleading practice, and people run into problems as soon as they try to code a real game. They then need to find answers somewhere other than the manual, or have a tough time finding information through trial and error.
- I don’t understand what values action.x and action.y hold in this context. What is the use case for these properties in a game object? Maybe I am missing something; please enlighten me.
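To make the question concrete, here is a minimal sketch of the conversion I would expect to need for a plain 2D orthographic camera. It is not the engine’s camera.screen_to_world(); the helper name screen_to_world_2d and its parameters (window size, camera position, zoom) are my own assumptions for illustration, and it ignores rotation and perspective.

```lua
-- Hypothetical helper, NOT part of the Defold API.
-- Assumptions: 2D orthographic view, the camera is centered on
-- (cam_x, cam_y), zoom is uniform (window pixels per world unit),
-- no rotation, and screen coordinates have their origin at the
-- bottom-left of the window.
local function screen_to_world_2d(screen_x, screen_y, window_w, window_h, cam_x, cam_y, zoom)
    -- Offset from the window center, converted from pixels to world units.
    local world_x = cam_x + (screen_x - window_w / 2) / zoom
    local world_y = cam_y + (screen_y - window_h / 2) / zoom
    return world_x, world_y
end

-- The center of a 1280x720 window maps to the camera position.
local wx, wy = screen_to_world_2d(640, 360, 1280, 720, 100, 50, 2)
print(wx, wy)
```

In a script this would be fed with action.screen_x and action.screen_y, which is exactly the practice the examples do not show.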
Having said that, I also note that for multi-touch, the entries of the touch table don’t have screen_x and screen_y fields, nor screen_dx and screen_dy (at least not in the API documentation).
So I don’t know how to make multi-touch work with camera.screen_to_world(). Or should we reimplement the conversion just for multi-touch? (I didn’t try it; this is based on the documentation only.)
I will add that the relation between action.x and action.screen_x is not stated explicitly:
- What is the formula connecting these numbers? Knowing it would let us understand what is happening.
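For illustration, here is the relation I would guess at, written as plain Lua. It assumes that action.x and action.y are expressed in the display size set in game.project, while action.screen_x and action.screen_y are expressed in actual window pixels. That is an assumption that would need to be confirmed by the Defold team, and the helper names are hypothetical.

```lua
-- Hypothetical helpers, NOT part of the Defold API.
-- Assumption (unconfirmed): action.x/action.y use the project display
-- size from game.project, and action.screen_x/action.screen_y use the
-- actual window size in pixels, so the two differ only by a scale factor.
local function screen_to_project(screen_x, screen_y, window_w, window_h, project_w, project_h)
    return screen_x * project_w / window_w, screen_y * project_h / window_h
end

-- Inverse mapping, e.g. for data that only exposes x and y.
local function project_to_screen(x, y, window_w, window_h, project_w, project_h)
    return x * window_w / project_w, y * window_h / project_h
end

-- Project display set to 960x640, window resized to 1920x1280.
local px, py = screen_to_project(960, 640, 1920, 1280, 960, 640)
local sx, sy = project_to_screen(px, py, 1920, 1280, 960, 640)
print(px, py, sx, sy)
```

If this guess is right, the two pairs are equal only while the window has exactly the project display size, which would explain why the examples work in a controlled setup.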
When in a GUI context
The manual should also cover mouse events in this context: since GUI lives in screen space by definition, I think it deserves its own explanation.
- I am not sure whether, in this context, we can safely use action.x or whether we should always use action.screen_x too, for example when calling gui.pick_node(..). All the examples use action.x. The API talks about “the x-coordinate”, but which x-coordinate?
- Maybe in GUI we always have action.x == action.screen_x?
- I think the action data is the same in the gui.on_input() function and in the go.on_input() function, but is it?
- If yes, then action.x ~= action.screen_x even in GUI, so should we always use action.screen_x in GUI? Will that work with pick_node? Or must we use action.x in this context?
The next part is a bonus:
When in a shader/material/mesh context
I know this does not seem directly related to mouse events, but it is related to the question “which coordinate system are we in?”. I find the manual is not really precise about World versus Local versus Screen coordinates in vertex and fragment shaders.
I have a runtime mesh using a material, but I still don’t understand exactly when to use World vertex coordinates versus Local vertex coordinates.
And should we send the game object’s get_position or get_world_position to it when passing the position as a uniform?
I also found out that go.get_world_position() doesn’t return the actual world position, but “the world position of last frame”, and that you can’t set_world_position()…? I am a bit lost.
Should my mesh receive coordinates relative to my game object?
I made it work by trial and error, but I don’t understand exactly what made it work. I am a bit lost on these subjects, so I don’t feel “empowered”, and it is frustrating.
The information is also scattered across many components (material, mesh, model, GO, etc.), so it is difficult to connect the dots for an actual use case.
Maybe more use cases could be explained in the manual?
The examples explain rather little and always focus on ONE particular point. A good tutorial for a complex but common use case would add a lot to the manual.
I mean, there are a lot of examples given by the community, and I could do what I wanted by studying their code, thanks to all these people. But these don’t “explain” things: why do we do that? Where is it explained in the manual or the API? How could we do it otherwise? They just show things we can do.
For example, a fairly complex end-to-end use case would be drawing a mesh with the mouse by creating a vertex at each touch point, and also detecting when the mouse hovers over a vertex of the mesh, with highlighting computed in the fragment shader.
Final note
This was a long post… I am complaining, but that’s because I love this engine. Right now I am a bit frustrated, not because I can’t do what I want in the end, but because I am losing time on trial and error when the information is not in the docs or is only “implicit”.
So I have to make many small projects just to be sure things behave the way I “think” they behave based on the docs and experiments, and then I find out that it is not completely right.
And the more I know, the more I find that some examples are misleading, but that makes me doubt my own knowledge, because these are the official examples.
So here I am.

