A few performance-related questions

Hey guys, apologies in advance if these have been answered before. I did search the forum, and some of these topics have been discussed briefly in the past, but I wasn’t able to find the answers I was looking for.

1. Using go.get() instead of go.get_position()

If I understand correctly, the following has some performance implications since it needs to allocate a new vector3 object.

pos = go.get_position(".")

So do I understand correctly that the following would be faster?

pos_x = go.get(".", "position.x")
pos_y = go.get(".", "position.y")

Ditto for using go.set(), instead of go.set_position().

2. Storing non-property objects on self?

Suppose I want to store data on a script that doesn’t qualify as a Defold property (e.g. lists, strings, tables).
So far what I’ve been doing is something like the following:

go.property("health", 100)

function init(self)
    self.some_list = { ... }
    self.some_string = "hello world"
end

Sure, I can’t access these custom fields using go.set() or go.get(), but I can access them from the other callbacks in the script, and that’s fine for me.

Is this something that would hinder performance in the long-term?
If yes, how would you recommend to do it instead?

3. Accessing properties through self or through the API

Performance-wise, is there any difference between the following two approaches?

-- through self
health = self.health
self.health = health + 10

-- through the API
health = go.get('#', 'health')
go.set('#', 'health', health + 10)

(assume for simplicity that I have pre-hashed the # and health strings)
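
For readers unfamiliar with the term, pre-hashing here might look roughly like the sketch below. It assumes the script declares go.property("health", 100) as in question 2; the field names health_id and self_url are made up for illustration and are not part of the Defold API.

```lua
-- Minimal sketch of pre-hashing in a Defold script.
-- health_id and self_url are hypothetical field names chosen for this example.
function init(self)
    -- Hash the property name and resolve the URL once,
    -- instead of passing string literals on every call.
    self.health_id = hash("health")
    self.self_url = msg.url("#")
end

function update(self, dt)
    -- Same get/set pair as above, but using the stored hash and URL.
    local health = go.get(self.self_url, self.health_id)
    go.set(self.self_url, self.health_id, health + 10)
end
```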


You can quickly performance test this:

go.property("health", 0)

function init(self)
	local iterations = 1000000

	local t = socket.gettime()
	for i=1,iterations do
		local x = go.get_position().x
	end
	print("go.get_position().x took ", (socket.gettime() - t))
	
	local t = socket.gettime()
	for i=1,iterations do
		local x = go.get(".", "position.x")
	end
	print("go.get() position.x took ", (socket.gettime() - t))

	local t = socket.gettime()
	for i=1,iterations do
		local health = self.health
		self.health = health + 10
	end
	print("self.health took ", (socket.gettime() - t))

	local t = socket.gettime()
	for i=1,iterations do
		local health = go.get('#', 'health')
		go.set('#', 'health', health + 10)
	end
	print("go.get() and set() health took ", (socket.gettime() - t))
end
DEBUG:SCRIPT: go.get_position().x took 	0.13211512565613
DEBUG:SCRIPT: go.get() position.x took 	0.38671708106995
DEBUG:SCRIPT: self.health took 	0.055711030960083
DEBUG:SCRIPT: go.get() and set() health took 	0.89836716651917

I guess I was mostly hoping for a discussion on the why. It would be good to know the reasons behind performance differences, and to better understand why they occur from the engine’s perspective.
I can’t always measure every single instance where I decide to use position.x instead of get_position(), but it would still be nice to know which of the two is better for a given situation based on an understanding of what the engine would do…


Interestingly, with your measuring method, pre-hashing the strings ".", "#", "health" and "position.x" somehow made the final times slower than the variant without pre-hashing! Even though the general advice is that storing pre-hashed ids and urls is good practice, right?

But a problem with making measurements like this, by timing back-to-back for-loops, is that they are very prone to interference from the OS and, most importantly, from the state of the cache. If you access a thing once it may take some time, but if you keep accessing the same thing repeatedly with no other code in between, those subsequent accesses will be orders of magnitude faster.

So I tried to reduce the influence of the cache by making fewer measurements at a time, but averaged over multiple update() loops. These were my results:

DEBUG:SCRIPT: === WITH PREHASHING ===
DEBUG:SCRIPT: go.get_position().x took 	2.2255428253658e-07
DEBUG:SCRIPT: go.get() position.x took 	6.4795357840402e-07
DEBUG:SCRIPT: self.health took 	1.0122067083127e-07
DEBUG:SCRIPT: go.get() and set() health took 	1.5873379177517e-06
DEBUG:SCRIPT: === NO PREHASHING ===
DEBUG:SCRIPT: go.get_position().x took 	0.28761606367808
DEBUG:SCRIPT: go.get() position.x took 	0.26440494274967
DEBUG:SCRIPT: self.health took 	0
DEBUG:SCRIPT: go.get() and set() health took 	2.1192762586806
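
The per-frame averaging described above could be sketched roughly as follows. This is not the exact script that produced these numbers: the batch size, frame count and helper names (new_timer, sample, average) are assumptions made for the example, and only two of the four measurements are shown.

```lua
-- Use socket.gettime() inside Defold; fall back to os.clock() elsewhere.
local now = (socket and socket.gettime) or os.clock

local SAMPLES_PER_FRAME = 100 -- assumed batch size per update()
local FRAMES = 600            -- assumed number of frames to average over

-- Accumulator for one measured operation.
local function new_timer(name, fn)
    return { name = name, fn = fn, total = 0, count = 0 }
end

-- Run a small batch and add the elapsed time to the running total.
local function sample(timer, n)
    local t = now()
    for i = 1, n do
        timer.fn()
    end
    timer.total = timer.total + (now() - t)
    timer.count = timer.count + n
end

-- Average time per call in seconds.
local function average(timer)
    if timer.count == 0 then return 0 end
    return timer.total / timer.count
end

function init(self)
    self.frames = 0
    self.timers = {
        new_timer("go.get_position().x", function()
            local x = go.get_position().x
        end),
        new_timer("go.get() position.x", function()
            local x = go.get(".", "position.x")
        end),
    }
end

function update(self, dt)
    if self.frames >= FRAMES then return end
    self.frames = self.frames + 1
    for _, timer in ipairs(self.timers) do
        sample(timer, SAMPLES_PER_FRAME)
    end
    -- Print the averages once, after the last measured frame.
    if self.frames == FRAMES then
        for _, timer in ipairs(self.timers) do
            print(timer.name .. " took ", average(timer))
        end
    end
end
```

Note that wrapping each operation in a closure adds a function call to every sample, so the absolute numbers are inflated by the same small amount for every timer; only the relative differences are meaningful.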

Based on this, the answer to question 3 seems obvious: accessing the property directly through self is much faster, even with the ids pre-hashed.

However, the get_position() behavior seems a bit peculiar. I was expecting it to be slower than go.get().
Can the engine detect when a vector3 object has been discarded by the user and then re-use it seamlessly? Or is it because even by reading the position.x property the engine is still forced to allocate a vector3 object?

For question 2 I guess I should try to measure what would happen if I moved all the extra data off of scripts and put them into some global module with lookup tables?

Well, not so surprising. go.get() is a very generic function to get any property of any component:

The go.get_position() function on the other hand is tailored for a specific use case: getting the position of a game object, and as such it will be faster:


Oh wow I see… Thanks a lot :slight_smile:

And what about self being many times faster than go.get / go.set?

That’s basically only a metatable __index and __newindex lookup:
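
As a plain-Lua illustration of what such a lookup involves, here is a small sketch. It is not the engine's actual implementation: new_instance and the props table are hypothetical stand-ins for whatever storage the engine keeps behind self.

```lua
-- Illustration only: a table whose declared properties live in separate
-- storage and are reached through __index / __newindex metamethods.
-- new_instance is a hypothetical helper, not a Defold function.
function new_instance(defaults)
    local props = {}
    for k, v in pairs(defaults) do props[k] = v end
    local reads, writes = 0, 0

    local instance = setmetatable({}, {
        -- Called when a key is missing from the table itself.
        __index = function(_, key)
            reads = reads + 1
            return props[key]
        end,
        -- Called when assigning to a key missing from the table itself.
        __newindex = function(t, key, value)
            if props[key] ~= nil then
                writes = writes + 1
                props[key] = value     -- declared property: update the storage
            else
                rawset(t, key, value)  -- anything else: plain table field
            end
        end,
    })

    -- Second return value reports how often each metamethod ran.
    return instance, function() return reads, writes end
end

local obj = new_instance({ health = 100 })
obj.health = obj.health + 10     -- one __index call, one __newindex call
obj.some_string = "hello world"  -- not a declared property, stored with rawset
```

Either way, the access stays inside the Lua VM plus one metamethod call, with no URL resolution or generic property lookup involved.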

Hey,
If you’re worried about allocating a lot of new vector3 objects, you can write or use a simple native extension (NE) that writes the position/scale/etc. into an existing vector.

Like this: https://github.com/indiesoftby/defold-scene3d/blob/main/scene3d/src/extension.cpp#L52

And Math Library without allocations: https://github.com/thejustinwalsh/defold-xmath
