A few performance-related questions

Haath · April 6, 2023, 8:15am

Hey guys, apologies in advance if these have been answered before. I did search the forum, and some of these topics have been discussed briefly in the past, but I wasn’t able to find the answers I was looking for.

1. Using `go.get()` instead of `go.get_position()`

If I understand correctly, the following has some performance implications since it needs to allocate a new vector3 object.

pos = go.get_position(".")

So do I understand correctly that the following would be faster?

pos_x = go.get(".", "position.x")
pos_y = go.get(".", "position.y")

Ditto for using go.set(), instead of go.set_position().

2. Storing non-property objects on `self`?

Suppose I want to store data on a script that doesn’t qualify as a Defold property (e.g. lists, strings, tables).
So far what I’ve been doing is something like the following:

go.property("health", 100)

function init(self)
    self.some_list = { ... }
    self.some_string = "hello world"
end

Sure, I can’t access these custom properties using go.set() or go.get(), but I can access them from the other callbacks in the script, and that’s fine for me.

Is this something that would hinder performance in the long term?
If yes, how would you recommend doing it instead?

3. Accessing properties through `self` or through the API

Performance-wise, is there any difference between the following two approaches?

health = self.health
self.health = health + 10

health = go.get('#', 'health')
go.set('#', 'health', health + 10)

(assume for simplicity that I have pre-hashed the # and health strings)
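
For reference, the pre-hashed variant of the second approach might look something like the sketch below. It assumes a script with a `health` property and only runs inside the Defold engine; the field names `self.url` and `self.health_id` are made up for illustration.

```lua
go.property("health", 100)

function init(self)
    -- Resolve the URL and hash the property name once, not on every call.
    self.url = msg.url("#")          -- hypothetical field name
    self.health_id = hash("health")  -- hypothetical field name
end

function update(self, dt)
    -- Same get/set pair as above, but passing the cached url and hash.
    local health = go.get(self.url, self.health_id)
    go.set(self.url, self.health_id, health + 10)
end
```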

britzl · April 6, 2023, 8:32am

You can quickly performance test this:

go.property("health", 0)

function init(self)
	local iterations = 1000000

	local t = socket.gettime()
	for i=1,iterations do
		local x = go.get_position().x
	end
	print("go.get_position().x took ", (socket.gettime() - t))
	
	local t = socket.gettime()
	for i=1,iterations do
		local x = go.get(".", "position.x")
	end
	print("go.get() position.x took ", (socket.gettime() - t))

	local t = socket.gettime()
	for i=1,iterations do
		local health = self.health
		self.health = health + 10
	end
	print("self.health took ", (socket.gettime() - t))

	local t = socket.gettime()
	for i=1,iterations do
		local health = go.get('#', 'health')
		go.set('#', 'health', health + 10)
	end
	print("go.get() and set() health took ", (socket.gettime() - t))
end

DEBUG:SCRIPT: go.get_position().x took 	0.13211512565613
DEBUG:SCRIPT: go.get() position.x took 	0.38671708106995
DEBUG:SCRIPT: self.health took 	0.055711030960083
DEBUG:SCRIPT: go.get() and set() health took 	0.89836716651917

Haath · April 6, 2023, 10:49am

I guess I was mostly hoping for a discussion on the why. It would be good to know the reasons behind performance differences, and to better understand why they occur from the engine’s perspective.
I can’t always measure every single instance where I decide to use position.x instead of get_position(), but it would still be nice to know which of the two is better for a given situation based on an understanding of what the engine would do…

Interestingly, with your measuring method, pre-hashing the strings `.`, `#`, `health` and `position.x` somehow made the final times slower than the variant without pre-hashing! Even though the general advice is that storing the pre-hashed ids and urls is good practice, right?

But a problem with measuring like this, by timing long back-to-back for-loops, is that the results are very prone to interference from the OS and, most importantly, from the state of the cache. If you access a thing once it may take some time, but if you keep accessing the same thing repeatedly with no other code in between, those subsequent accesses can be orders of magnitude faster.

So I tried to reduce the influence of the cache by making fewer measurements at a time, but averaged over multiple update() loops. These were my results:

DEBUG:SCRIPT: === WITH PREHASHING ===
DEBUG:SCRIPT: go.get_position().x took 	2.2255428253658e-07
DEBUG:SCRIPT: go.get() position.x took 	6.4795357840402e-07
DEBUG:SCRIPT: self.health took 	1.0122067083127e-07
DEBUG:SCRIPT: go.get() and set() health took 	1.5873379177517e-06
DEBUG:SCRIPT: === NO PREHASHING ===
DEBUG:SCRIPT: go.get_position().x took 	0.28761606367808
DEBUG:SCRIPT: go.get() position.x took 	0.26440494274967
DEBUG:SCRIPT: self.health took 	0
DEBUG:SCRIPT: go.get() and set() health took 	2.1192762586806
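
The per-frame averaging described above could be sketched roughly as follows. This is plain Lua, not the exact script that produced these numbers: `Sampler` and its methods are hypothetical names, and the clock is passed in so that `socket.gettime` can be used inside Defold while `os.clock` serves as a fallback elsewhere.

```lua
-- Accumulates timing samples over many small batches and reports the
-- mean time per call, instead of timing one long continuous loop.
local Sampler = {}
Sampler.__index = Sampler

-- clock: any function returning the current time in seconds.
function Sampler.new(clock)
    return setmetatable({ clock = clock or os.clock, total = 0, calls = 0 }, Sampler)
end

-- Time `batch` back-to-back calls of fn and add them to the running totals.
function Sampler:measure(fn, batch)
    local start = self.clock()
    for _ = 1, batch do
        fn()
    end
    self.total = self.total + (self.clock() - start)
    self.calls = self.calls + batch
end

-- Mean seconds per call over everything measured so far.
function Sampler:mean()
    if self.calls == 0 then
        return 0
    end
    return self.total / self.calls
end

-- Stand-in for a frame loop: a few calls per "frame", many frames.
local sampler = Sampler.new()
for frame = 1, 100 do
    sampler:measure(function() return math.sqrt(frame) end, 10)
end
print("mean seconds per call:", sampler:mean())
```

In a Defold script one would presumably create the sampler in init() with socket.gettime as the clock, call measure() with a small batch from update(), and print mean() in final().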

Based on this, the answer to question 3 seems obvious: accessing the property directly through self is much faster, even with the id pre-hashed.

However, the get_position() behavior seems a bit peculiar. I was expecting it to be slower than go.get().
Can the engine detect when a vector3 object has been discarded by the user and then re-use it seamlessly? Or is it because even by reading the position.x property the engine is still forced to allocate a vector3 object?

For question 2, I guess I should try to measure what would happen if I moved all the extra data off the scripts and into some global module with lookup tables?
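
A minimal sketch of what such a lookup-table module could look like, in plain Lua. The name `store` and its functions are hypothetical, and whether this is any faster than keeping the data on self is exactly what would still need measuring.

```lua
-- Per-instance data kept outside the scripts, keyed by an instance id.
-- In a real module file this table would be returned with `return store`.
local store = {}
local data = {}  -- id -> table of fields for that instance

function store.set(id, key, value)
    local entry = data[id]
    if not entry then
        entry = {}
        data[id] = entry
    end
    entry[key] = value
end

function store.get(id, key)
    local entry = data[id]
    return entry and entry[key]
end

-- Drop everything stored for an instance so the table does not grow forever.
function store.clear(id)
    data[id] = nil
end

-- Plain strings stand in for instance ids here.
store.set("enemy1", "some_string", "hello world")
print(store.get("enemy1", "some_string"))
store.clear("enemy1")
```

Inside Defold the id could be, for example, the value returned by go.get_id(), with store.clear() called from final().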

britzl · April 6, 2023, 12:51pm

Well, not so surprising. go.get() is a very generic function to get any property of any component:

github.com/defold/defold, engine/gameobject/src/gameobject/gameobject_script.cpp (dev branch):

int Script_Get(lua_State* L)
{
    ScriptInstance* i = ScriptInstance_Check(L);
    Instance* instance = i->m_Instance;
    dmMessage::URL sender;
    dmScript::GetURL(L, &sender);
    dmMessage::URL target;
    dmScript::ResolveURL(L, 1, &target, &sender);
    DM_HASH_REVERSE_MEM(hash_ctx, 256);
    if (target.m_Socket != dmGameObject::GetMessageSocket(i->m_Instance->m_Collection->m_HCollection))
    {
        return luaL_error(L, "go.get can only access instances within the same collection.");
    }
    dmhash_t property_id = 0;
    if (lua_isstring(L, 2))
    {
        property_id = dmHashString64(lua_tostring(L, 2));
    }
    else
// (excerpt truncated)

The go.get_position() function, on the other hand, is tailored for a specific use case: getting the position of a game object, and as such it will be faster:

github.com/defold/defold, engine/gameobject/src/gameobject/gameobject_script.cpp (dev branch)

Haath · April 6, 2023, 1:24pm

Oh wow I see… Thanks a lot

Pawel · April 6, 2023, 6:57pm

And what about self being many times faster than go.get / go.set?

britzl · April 6, 2023, 8:44pm

That’s basically only a metatable __index and __newindex lookup:

https://github.com/defold/defold/blob/dev/engine/gameobject/src/gameobject/gameobject_script.cpp#L244-L274

static int ScriptInstance_index(lua_State *L)
{
    ScriptInstance* i = (ScriptInstance*)lua_touserdata(L, 1);
    (void) i;
    assert(i);

    // Try to find value in instance data
    lua_rawgeti(L, LUA_REGISTRYINDEX, i->m_ScriptDataReference);
    lua_pushvalue(L, 2);
    lua_gettable(L, -2);
    return 1;
}

static int ScriptInstance_newindex(lua_State *L)
{
    int top = lua_gettop(L);

    ScriptInstance* i = (ScriptInstance*)lua_touserdata(L, 1);
    (void) i;
    assert(i);
// (excerpt truncated)
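
As a rough plain-Lua analogy (not the engine's code): in the engine, self is a userdata whose metamethods are the C functions above, whereas here an empty proxy table stands in for it and forwards every read and write to a backing table. The names `new_instance`, `self_like` and `data` are made up for this sketch.

```lua
-- The backing table plays the role of the script instance data table.
local function new_instance()
    local data = {}
    local proxy = {}  -- stays empty, so every access goes through the metamethods
    setmetatable(proxy, {
        __index = function(_, key)
            return data[key]
        end,
        __newindex = function(_, key, value)
            data[key] = value
        end,
    })
    return proxy, data
end

local self_like, data = new_instance()
self_like.health = 100                    -- goes through __newindex
self_like.health = self_like.health + 10  -- __index, then __newindex
print(self_like.health)                   --> 110
print(rawget(self_like, "health"))        --> nil, the value lives in `data`
```

So a read or write on self costs about one metamethod call plus one table access, with no URL resolution or property lookup involved.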

Dragosha · April 7, 2023, 5:48am

Hey,
if you’re worried about allocating a lot of new vector3 objects, you can write or use a simple native extension (NE) that writes the position/scale/etc. into an existing vector.

Like this: https://github.com/indiesoftby/defold-scene3d/blob/main/scene3d/src/extension.cpp#L52

And a math library without allocations: https://github.com/thejustinwalsh/defold-xmath
