Detailed manual for input "action" touch data: aren't all the examples bad ones?

Mathias · March 15, 2026, 3:40pm

This post is a request for better documentation of input event data for mouse position. Even after reading the manual, the API reference, and many examples, I still have questions about how to do practical things correctly. So I think the manual should be rewritten on this topic to cover real use cases precisely. I think almost every defold-example is misleading on this subject.

I don’t want to come across as just whining, because I wrote this post after a lot of reading, code study, and experiments. Now I need somebody on the Defold team to say definitively how I should do things, and hopefully put that in the Manual for everybody.
I also see many, many questions about this on the forum, so I thought it was time to resolve it once and for all.

So I read the pages Device input in Defold and Mouse and touch input in Defold again and again, then I had to search the API to find the page API reference (Game object) for game objects and the page API reference (GUI) for GUI.

My first complaint is that the manual has almost no direct links to the API reference. I mean links in context that point to the right function, not just “see go namespace”. For example, the only link to the definition of the “action” properties is in the input overview, about three quarters of the way down the page, where it talks about the on_input function.
There is nothing on the other pages about specific input events (or in the first table where Actions are mentioned). So if you miss that first link, you ask yourself where the hell the doc for action data is, and then you must search for “action” in the API and try to find the right page. This is not user friendly, especially for newcomers.

Now, about action data for mouse/touch events, here is why I think the manual and the examples are misleading:

When in a game object context

The manual and every example use action.x and action.y instead of action.screen_x and action.screen_y plus a screen_to_world conversion. I know the manual briefly mentions the screen_to_world conversion, but only as an optional thing for some complex cases… but

in my experience, I don’t know when we would use action.x and action.y in this context, because you almost always have a camera and need to take screen/window scaling into account… so we should always use screen_x and screen_y. Even if you don’t have an explicit camera, there is scaling to take into account.
So these examples ONLY work because the game area is not scaled and there is no camera: a controlled environment set up to make them work. On many real player devices, coding like that won’t give a good result.
So the manual teaches a misleading practice, and people run into problems as soon as they try to code a real game and need to find answers somewhere other than the Manual, or have a tough time finding information through trial and error.
I don’t understand what values action.x and action.y hold in this context, and what the use case is for these properties in a game object. Maybe I am missing something, please enlighten me.

Having said that, I also want to point out that for multi-touch, the touch table doesn’t have screen_x and screen_y fields, nor screen_dx and screen_dy (at least not in the API documentation).
I don’t know how to make it work with camera.screen_to_world()… or should we recode it just for multi-touch? (I didn’t try it; this is just based on the documentation.)
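For illustration, here is a minimal Lua sketch of how per-finger screen coordinates could be collected before being passed to a screen-to-world conversion. It relies on an assumption: a later reply in this thread states that screen_x is set on every touch entry even though the API reference does not list it, and the sketch assumes the same for screen_y. The helper name touch_screen_points is hypothetical, and entries without those fields are skipped rather than guessed.

```lua
-- Hypothetical helper: collects one screen-space point per finger.
-- ASSUMPTION: each entry of action.touch carries screen_x/screen_y
-- (reported in this thread, not listed in the API reference).
local function touch_screen_points(action)
    local points = {}
    -- action.touch is only present for multi-touch input; treat nil as empty.
    for _, touch in ipairs(action.touch or {}) do
        if touch.screen_x and touch.screen_y then
            points[#points + 1] = {
                id = touch.id,       -- finger identifier
                x = touch.screen_x,  -- window-pixel coordinates
                y = touch.screen_y,
            }
        end
    end
    return points
end
```

Each returned point could then be handed to whatever screen-to-world conversion the project uses, the same way action.screen_x and action.screen_y would be for a single pointer.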

I will add that the relation between action.x and action.screen_x is never stated explicitly:

what is the formula connecting these numbers? Knowing it would let us understand what is happening.

When in a GUI context

The manual should also talk about mouse events in this context. Since GUI is in screen space by definition, I think it deserves a mention.

I am not sure whether, in this context, we can safely use action.x or whether we should always use action.screen_x too, for example when using gui.pick_node(..). Every example uses action.x … The API talks about “the x-coordinate”, but which x-coordinate?

Maybe in GUI we always have action.x == action.screen_x ?
- I think the action data is the same in the gui.on_input() function and in the go.on_input() function, but is it?
- If yes, then action.x ~= action.screen_x even in GUI, so should we always use action.screen_x in GUI? Will it work with pick_node? Or must we use action.x in this context?

The next part is a bonus:

When in a shader/material/mesh context

I know this does not seem directly related to mouse events, but it is related to the “which coordinate system are we in?” question. I found the manual is not really precise about World versus Local versus Screen coordinates in vertex and fragment shaders.

I have a runtime mesh using a material, but I still don’t understand exactly when to use World (vertex) coordinates versus Local (vertex) coordinates.
And should we send the GO’s get_position or get_world_position to it if using uniform data?

And I found out that go.get_world_position() doesn’t return the actual world position, but “the world position of last frame”, and you can’t set_world_position()…? I am a bit lost.

Should my mesh receive coordinates relative to my game object?

I made it work by trial and error, but I don’t understand exactly what made it work… I am a bit lost on these subjects, so I don’t feel “empowered” and it is frustrating.
The information is also scattered across the many components (material, mesh, model, GO etc.), so it is difficult to connect the dots for an actual use case.

Maybe more use cases could be explained in the manual?

The examples are a bit sparse in their explanations and always focus on ONE particular point. A good tutorial for some complex but common use case would add a lot to the manual…
I mean, there are a lot of examples from the community, and I could do what I wanted by studying their code, thanks to all these people. But these don’t “explain” things: why do we do that? Where is it explained in the manual/API? How could we do it otherwise? They just show things we can do.

For example, a fairly complex end use case would be to draw a mesh with the mouse by creating a vertex at each touch point, and also detecting when the mouse hovers over a vertex of the mesh, with highlighting done in the fragment shader.

Final note

This was a long post… I am complaining, but that’s because I love this engine. Right now I am a bit frustrated, not because I can’t do what I want in the end, but because I am losing time to trial and error because the information is not in the docs or is only “implicit”.
So I have to make many small projects just to be sure things behave the way I “think” they behave based on the docs and experiments… and then I find out it is not completely right.

And the more I know, the more I find some examples misleading, but that makes me doubt my own knowledge, because these are the official examples.
So here I am.

Mathias · March 15, 2026, 3:45pm

Just to illustrate the kind of problem people solve by trial and error, for common use cases that I think should be in the Manual:

Mathias · March 15, 2026, 3:48pm

Finally, I will add that I am willing to contribute to the documentation (the Manual or the API docs) if somebody explains the process to me, please.
I am an RTFM person; I like to write things down and solve problems once and for all.

Halfstar · March 15, 2026, 4:32pm

For gui.pick_node(), using action.x is the correct choice; action.screen_x would not work for that in some cases. action.x is not in world space but in screen space, and it has the same origin as action.screen_x. The difference is that action.screen_x uses the scaling of the display, not the internal scaling of the game. You could simply convert between the two if you know the screen scaling; it is just provided for convenience. Did you test any example from the manual and find that action.screen_x works better? Also, screen_x is provided for every finger in multi-touch; it is just not mentioned in the documentation, so yes, that would be a good change for the API reference.
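To make the conversion concrete, here is a minimal Lua sketch. It assumes that action.x and action.y are simply action.screen_x and action.screen_y rescaled from the actual window size to the logical size set in game.project, which is worth verifying in your own project. The helper name screen_to_logical, the "touch" input binding, and the "button" node id are made up for illustration.

```lua
-- ASSUMPTION: action.x/action.y equal action.screen_x/action.screen_y
-- rescaled from window pixels to the logical game.project resolution.
-- In a Defold script the window size could come from window.get_size()
-- and the logical size from sys.get_config_int("display.width") and
-- sys.get_config_int("display.height").
local function screen_to_logical(screen_x, screen_y, window_w, window_h, logical_w, logical_h)
    return screen_x * logical_w / window_w, screen_y * logical_h / window_h
end

-- GUI script callback: pick_node is given action.x/action.y, as advised above.
-- "touch" and "button" are hypothetical names for the binding and the node.
function on_input(self, action_id, action)
    if action_id == hash("touch") and action.pressed then
        local node = gui.get_node("button")
        if gui.pick_node(node, action.x, action.y) then
            print("button hit")
        end
    end
end
```

With a 1920x1080 window and a 960x540 project setting, a pointer at window pixel (960, 540) would map to (480, 270) under this assumption.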

Pawel · March 15, 2026, 4:38pm

@Mathias thank you so much for your feedback!

Our manuals are in the /doc repository:

There is a README guide on how to contribute. We mostly receive translation contributions, which is why it mostly covers that use case, but of course we would love to see improvements to the actual documents, even in the form of raised issues. The manuals, which are plain markdown files in the repository and are then pulled onto the Defold website, are stored in: doc/docs/en/manuals at master · defold/doc · GitHub

You can simply fork the repository and contribute PRs, which we will review.
Since I joined the Defold team, my main task has been to ease onboarding for users, and I’m striving to do so, so any help with this would be appreciated!

Another good repository for contributions is the one with examples:

In it, we store separate Defold projects with markdown descriptions and accompanying scripts, from which we build an HTML5 live demo and pull it into the website: Defold examples

I really want to unify all the examples one by one and add more. We have received some nice contributions there, e.g. from Artsiom @aglitchman. There is also a list of issues in the repository, where you can request an example you would like to see, or pick one and add it yourself: GitHub · Where software is built

The process is the same: fork the repository and add the new project with an example.md starting with a header:

---
tags: physics
title: Fixed timestep interpolation
brief: This example shows how to smooth physics motion in fixed update mode by interpolating a visual sprite while keeping the physics body fixed-step.
author: Defold Foundation
scripts: interpolation.script
thumbnail: thumbnail.png
---

Add an optional thumbnail.png image (ideally a screenshot from the actual example), then make a PR for review.

You can also look at the closed pull requests to see how earlier PRs were created: Pull requests · defold/examples · GitHub

Perhaps I should update the READMEs of those repos with these notes on contributing