To hash or not to hash (SOLVED)

Much of what happens within Defold is driven via its asynchronous message loop. Some of the messages can be quite frequent, e.g. trigger_response, animation_done etc. Given that message dispatch requires the incoming message_id to be compared against a hashed string value, e.g. hash("trigger_response"), would there be any mileage in pre-calculating the hashes rather than doing hash(…) each time?

Once again, this question follows on from a previous one and my penchant for avoiding globals, orphaned constants, repeated calculations etc in my code.

As @britzl has indicated in his response to this question, Defold likes to know as much as possible about the game in advance in order to optimize its compilation. If everything is done in that same vein, it is not inconceivable that the Defold compiler detects message_id == "string" comparisons and silently replaces them with message_id == pre_calculated_hash anyway, in which case my own efforts in that direction would be superfluous?

We currently don't do anything special with those message_id == hash("foo") cases.
And I don't think we'll do it in the foreseeable future, to be honest.

In other words

  1. Yes, you do need to compare message_id with a hash
  2. If you calculate the hash in place at the time of each comparison you are not going to do much better than if you were comparing with a pre-calculated hash

?

Message id is a hash(), yes. So you need to compare it to a hash.

What I meant is that there's no magic there. As you'd expect, hash("foo") is a function call.

hash() is really fast; however, it is even better to pre-hash. You can pre-hash all the values you would like to use in a module and then require it wherever needed, or make it globally accessible. Take a look here: :wink:

I'm using such a module everywhere. It also has the advantage that you change each value in only one module, and all those values are listed in one convenient place. I require the module as a global and use m.ENABLE, m.START, m.ATTACK, etc. everywhere. :wink:
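
A minimal sketch of what such a module could look like. The path main/hashes.lua, the table name M, the message strings and the handle function are all assumptions for illustration, and the fallback hash stub is there only so the snippet also runs outside Defold.

```lua
-- Assumption: outside Defold there is no built-in hash(), so this stand-in
-- (which just returns the string) keeps the sketch runnable in plain Lua.
local hash = hash or function(s) return s end

-- Would live in its own file, e.g. main/hashes.lua (hypothetical path),
-- and end with `return M`.
local M = {
	ENABLE = hash("enable"),
	START = hash("start"),
	ATTACK = hash("attack"),
}

-- In a script, this line would instead read: local m = require("main.hashes")
local m = M

-- Hypothetical handler: maps an incoming message_id to an action name.
local function handle(message_id)
	if message_id == m.ENABLE then
		return "enable"
	elseif message_id == m.START then
		return "start"
	elseif message_id == m.ATTACK then
		return "attack"
	end
	return nil -- not a message this script cares about
end
```

Each string is hashed once, when the module is first loaded, and renaming a message means editing a single line in the module.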

Pre-hashing of commonly used strings is recommended, and storing them in a Lua module as @Pawel suggests will result in an improvement, but with the trade-off that you need to do an index lookup in the module:

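-- Pre-hashed value stored in a table (e.g. a shared Lua module)
local t = {
	FOOBAR = hash("foobar")
}

-- Pre-hashed value stored in a local variable
local FOOBAR = hash("foobar")

if message_id == hash("foobar") then	-- hash() is called on every comparison
	print("fast")
elseif message_id == t.FOOBAR then	-- one table lookup per comparison
	print("faster")
elseif message_id == FOOBAR then	-- direct access to a local
	print("fastest")
end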

It's all a bit theoretical, as you have to perform many comparisons for this to be a real problem. But getting used to good practices never hurts!
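
For anyone who wants to measure the three variants themselves, here is a rough timing sketch. The iteration count N and the helper time_it are made up for illustration, and the fallback hash stub exists only so the snippet runs outside Defold, where its numbers say nothing about the real hash(). Results will vary by platform and Lua runtime, so treat them as a rough indication only.

```lua
-- Assumption: outside Defold there is no built-in hash(); this stand-in
-- only keeps the sketch runnable and makes the timings meaningless there.
local hash = hash or function(s) return s end

local N = 1000000 -- arbitrary iteration count
local t = { FOOBAR = hash("foobar") }
local FOOBAR = hash("foobar")
local message_id = hash("foobar")

-- Runs fn N times and returns the elapsed CPU time in seconds.
local function time_it(fn)
	local start = os.clock()
	for _ = 1, N do
		fn()
	end
	return os.clock() - start
end

local inline = time_it(function() return message_id == hash("foobar") end)
local in_table = time_it(function() return message_id == t.FOOBAR end)
local in_local = time_it(function() return message_id == FOOBAR end)

print(("hash() each time: %.4f s"):format(inline))
print(("table lookup:     %.4f s"):format(in_table))
print(("local variable:   %.4f s"):format(in_local))
```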

Thanks. Premature optimization is the root of all evil said someone much wiser than I. My intent here is not so much to optimize as to avoid code carelessly sprinkled with a multitude of local constants with all the problems that engenders.

It also makes it a whole lot easier to make global alterations to the code. A case in point: acting on @britzl's suggestion, I dropped trapping the collision_response event and switched to trigger_response instead. Given that my comparisons were being done against a pre-defined string in a Lua module, the switch involved about 0.5s of work.

Doing such things in the vain pursuit of shaving off a few nanoseconds here or there is, needless to say, an utterly futile exercise.

I've tested this. It can make a huge difference if you have a lot of messages. For most projects it's not enough to matter, but if you get a lot of messages being bounced around every frame, it's noticeable. Keep in mind that there's no way to unsubscribe from messages, so even if you don't care about a message, your script's on_message function will still be called and your hashes still evaluated. Collision messages are the main culprit here, since you can get multiple messages per frame for each object. Try making a "ball pool" with physics objects and a script on each ball... you'll want to pre-hash.
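
As a sketch of what that looks like in a script: the constant is hashed once when the script loads, so the messages the script doesn't care about cost a single comparison and no hash() call. The self.triggers counter is hypothetical bookkeeping, and the fallback hash stub is there only so the snippet runs outside Defold.

```lua
-- Assumption: outside Defold there is no built-in hash(), so this stand-in
-- (which just returns the string) keeps the sketch runnable in plain Lua.
local hash = hash or function(s) return s end

-- Hashed once when the script is loaded, not once per incoming message.
local TRIGGER_RESPONSE = hash("trigger_response")

-- Called for every message sent to the script, wanted or not.
function on_message(self, message_id, message, sender)
	if message_id == TRIGGER_RESPONSE then
		-- Hypothetical bookkeeping: count the triggers seen so far.
		self.triggers = (self.triggers or 0) + 1
	end
	-- Any other message_id falls through without a hash() call.
end
```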

I would call it "good practice" more than "premature optimization", since it's 100% sure that it is faster to pre-hash them. Having them as local variables rather than looking them up in a table is also definitely faster, though it's a pretty tiny difference (much less than calling hash()). Especially on mobile and web, every bit counts.

I say: Don't bother much about it when prototyping, but do it with all your scripts before you release your game.


Isn't the whole point of local variables that they can only be used in the current scope, so they can't cause problems?


"Premature organization is the root of all evil."
-Ross

Thank you for sharing the results of your testing. Very useful.

I was referring to constant as opposed to variable values. Think of strings such as collision_response or game dependent constant values such as, say, MAX_SPEED. So much easier to make one global change than to have string/numeric values scattered across half a dozen .script files.

I prefer to think in terms of early organization as opposed to premature organization, which in my case at least is preceded by a phase of organized chaos whilst I thrash about for an idea I find convincing. I doubt that my authority and stature, here or anywhere else for that matter, give me the privilege to start coining aphorisms, but I will have a go at it anyway:

Early organization lays the foundations for easy future reorganization :blush: