Defold's ray casting is extremely inefficient (DEF-3230)


#1

The cost of creating a single ray cast request/response is much higher than the cost of an actual physics geometric query.
This is because ray casting in Defold is done by sending and receiving messages. Messaging is in itself a very costly system, and the cost doubles with raycasts: one raycast results in two messages (a request and a response).
Look at this:

local post = msg.post
local ray_cast = physics.ray_cast
local random = math.random
local sin = math.sin
local cos = math.cos
local pi = math.pi

local from = vmath.vector3(100, 568, 0)
local to = vmath.vector3(300, 568, 0)
local groups = { hash("test") }

math.randomseed(os.time())

function update(self, dt)
   for _ = 1, 400 do
      --post("#", "test")
      ray_cast(from, to, groups)
      -- do
      --    local x = random()
      --    local y = random()
      --    local z = random()
      --    self.d = x + y + z
      --    self.a = sin(random() * pi)
      --    self.b = cos(random() * pi)
      --    self.c = (self.a + self.b) * dt
      -- end
   end
end

function on_message(self, message_id, message)
end

The solution is to make raycasts a direct function call: no messages, no callbacks, only a synchronous, blocking function that returns the result of the raycast.
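A minimal sketch of the call shape proposed here, written in plain Lua so it runs outside the engine. Everything in it is an assumption for illustration: raycast_sync, the world table of axis-aligned boxes standing in for the physics world, and the fraction field on the result are hypothetical names, not Defold API.

```lua
-- Hypothetical stand-in for the physics world: axis-aligned boxes.
local world = {
   { id = "ground", min_x = 0, min_y = 0, max_x = 1000, max_y = 32 },
   { id = "crate", min_x = 90, min_y = 32, max_x = 110, max_y = 52 },
}

-- Clip the parametric interval [t_min, t_max] of a ray against one slab.
-- Returns the narrowed interval, or nil if the ray misses the slab.
local function clip(origin, delta, lo, hi, t_min, t_max)
   if delta == 0 then
      if origin < lo or origin > hi then return nil end
      return t_min, t_max
   end
   local t1, t2 = (lo - origin) / delta, (hi - origin) / delta
   if t1 > t2 then t1, t2 = t2, t1 end
   if t1 > t_min then t_min = t1 end
   if t2 < t_max then t_max = t2 end
   if t_min > t_max then return nil end
   return t_min, t_max
end

-- Hypothetical synchronous ray cast: blocks, then returns the closest hit
-- as a table, or nil when nothing is hit. No messages, no callbacks.
local function raycast_sync(from, to)
   local dx, dy = to.x - from.x, to.y - from.y
   local closest = nil
   for _, box in ipairs(world) do
      local t_min, t_max = clip(from.x, dx, box.min_x, box.max_x, 0, 1)
      if t_min then
         t_min, t_max = clip(from.y, dy, box.min_y, box.max_y, t_min, t_max)
      end
      if t_min and (closest == nil or t_min < closest.fraction) then
         closest = { id = box.id, fraction = t_min }
      end
   end
   return closest
end

-- The result is available on the very next line of the caller.
local hit = raycast_sync({ x = 100, y = 100 }, { x = 100, y = 0 })
if hit then
   print(hit.id, hit.fraction)
end
```

The point is only the shape of the call: the caller gets a table or nil back immediately, so there is no request id to track and no response handler to write.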

BTW, the code inside the do…end block is the performance equivalent of a single msg.post() call. Think about it the next time you want to post a message.


#2

Interesting, but posting 800 messages in a frame is a bit excessive. I get your point, though of course there is a cost for the engine to transition to/from Lua.

If there is a use case where you want to cast a bunch of rays at once, it might be interesting to add batch-style versions.

Anyway, I will keep it in mind for an exploration day to see if there is something we can do performance-wise.


#3

We have DEF-3230 for adding synchronous raycasts.
This will help in many situations.

Mind you, calling Lua<->C 800 times in one frame will never be cheap.


#4

But this is not me sending all those 800 messages. I just call physics.ray_cast 400 times per frame. The engine does the rest.

I use raycasting as the main collision prevention technique. Please don’t mention physics-based solutions: they do not work. Defold does not offer any other collision detection system, so raycasting is the only option if you want to keep integration with the editor.
It is a good option. Almost every Unity 2D platformer asset uses raycasts this way.
300…400 raycasts per frame is a pretty normal situation in any action-intensive platformer game.

Batching can probably help. It would result in one batch (~8-14 raycasts) per moving entity per frame. Batching large blasts (with many debris pieces) may also be possible.
Sending raycast responses in one packet (not one by one) would also help. Right now almost any code for response handling begins with this:

function raycast_response(self, message)
	self.responses[message.request_id] = message
	if message.request_id < self.total_ray_count then
		return
	end
	-- all responses are now collected, let's sort them out
	...
end

Another good addition would be the ability to redirect responses to another script component. This may help (sometimes) with logic and code readability.
I do this now by using __dm_script_instance__:

__dm_script_instance__ = get_context(self.edge_checker_context)
ray_cast(ray_start, ray_end, GROUND, 1)
__dm_script_instance__ = self

But all of this looks like a kludge (https://en.wikipedia.org/wiki/Kludge).
The only real solution is synchronous raycasts, as @Mathias_Westerdahl says. Synchronous raycasts will in turn allow very nice code-level optimizations that will result in 40%…50% fewer raycasts. Right now we are forced to cast all rays up front.

Thanks for looking into this.


#5

Yes, but even if the raycasts were synchronous, you would have to transition from Lua to the engine and back to Lua for each raycast, which will always be expensive.

So most likely there will have to be a combination of techniques and a thorough technical design.


#6

We weren’t objecting to the number of raycasts per se, but the number of Lua calls. It will still be beneficial to batch raycasts. The function we’ll implement should support making multiple queries in one call.

And, again, we’ve already decided to do this; we just need to fit it into the rest of the schedule.
But, personally, I think improving our physics APIs is very important (e.g. we’re working on physics scaling and physics joints).
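A rough sketch of what "multiple queries in one call" could look like from the script side, in plain Lua. All names here (raycast_one, raycast_batch, engine_query, the ray fields) are hypothetical, and the "engine" is a Lua stub where the ground is the line y = 0 and all rays are vertical. The counter only illustrates how many times the Lua/engine boundary would be crossed; it does not measure real cost.

```lua
-- Counts crossings of the Lua <-> engine boundary in this sketch.
local boundary_crossings = 0

-- Hypothetical engine-side query: the ground is at y = 0 and rays point
-- straight down. Returns a hit table, or false for a miss.
local function engine_query(ray)
   if ray.from_y > 0 and ray.to_y <= 0 then
      return { fraction = ray.from_y / (ray.from_y - ray.to_y) }
   end
   return false
end

-- Hypothetical one-ray call: one boundary crossing per ray.
local function raycast_one(ray)
   boundary_crossings = boundary_crossings + 1
   return engine_query(ray)
end

-- Hypothetical batch call: one boundary crossing for the whole list.
-- results[i] belongs to rays[i]; a miss is stored as false, not nil,
-- so the results table has no holes.
local function raycast_batch(rays)
   boundary_crossings = boundary_crossings + 1
   local results = {}
   for i, ray in ipairs(rays) do
      results[i] = engine_query(ray)
   end
   return results
end

-- Ten downward rays for one character standing 50 units above the ground.
local rays = {}
for i = 1, 10 do
   rays[i] = { x = i * 4, from_y = 50, to_y = -10 }
end

local results = raycast_batch(rays)
print(#results, boundary_crossings)
```

Returning the results in the same order as the input rays also removes the need to collect responses by request id in the script.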


#7

Keep in mind that batching is useful only if you decide to stay with asynchronous raycasts.
With synchronous raycasts, batching is never needed (or not possible), because each subsequent raycast (its length and start point) is based on information acquired from the previous raycast(s), and sometimes it can be skipped altogether.
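To illustrate the kind of dependency meant here, a small plain-Lua sketch. The world (flat ground at y = 0 for 0 <= x <= 200, a wall at x = 150) and the functions cast_down, cast_right and move_right are all hypothetical stand-ins, not Defold API. Each later ray uses the result of an earlier one, and some rays are never cast.

```lua
-- Hypothetical world for this sketch: flat ground at y = 0 for
-- 0 <= x <= 200, and a wall at x = 150.
local GROUND_MIN_X, GROUND_MAX_X, WALL_X = 0, 200, 150
local rays_cast = 0

-- Hypothetical synchronous casts; each returns the hit distance or nil.
local function cast_down(x, y, length)
   rays_cast = rays_cast + 1
   if x >= GROUND_MIN_X and x <= GROUND_MAX_X and y >= 0 and y <= length then
      return y
   end
   return nil
end

local function cast_right(x, length)
   rays_cast = rays_cast + 1
   if x <= WALL_X and WALL_X - x <= length then
      return WALL_X - x
   end
   return nil
end

-- Move a character right by `step`, casting only the rays that are needed.
local function move_right(x, y, step)
   -- Ray 1: is there ground under the feet? If not, skip the other rays.
   if not cast_down(x, y, 1) then
      return x, "airborne"
   end
   -- Ray 2: the wall check decides how far the character can go.
   local wall_distance = cast_right(x, step)
   local new_x = x + (wall_distance or step)
   -- Ray 3: the edge check starts where ray 2 says the character ends up.
   if not cast_down(new_x, y, 1) then
      return x, "edge"
   end
   return new_x, wall_distance and "wall" or "moved"
end

print(move_right(140, 0, 20))
print(move_right(100, 50, 20))
print(rays_cast)
```

With asynchronous casts, all three rays would have to be requested up front, before the start point of the third one is known.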


#8

Not always true, in my experience. For instance, I’ve personally used 4-10 ray casts for a character. In that case I’d rather batch those than call from Lua into the engine 10 times.


#9

Yes, scripting engines have their own quirks.