Looking for advice on how to structure my project and optimize performance

Unfortunately, it doesn’t change anything for the collection loading time (black screen)…

I ran some tests yesterday and this morning:
1/ “Reskinned” the whole project with 16x16 tiles instead of 32x32 tiles (so the tilesource image size is reduced as well, just like the character atlas etc.)
2/ Reduced the tilemap size (until something really tiny… not satisfying in terms of gameplay but had to give it a try)
3/ Moved all potentially “heavy” stuff from init() function to update() function (1st frame)
4/ Made sure all my custom resources are loaded way before the black screen disappears (cf the debug texts on the screenshot above => everything is already set when the black screen appears)

It just didn’t make a difference… The collection still takes 6-7 sec to load (web/html5 at least… it’s almost instant on Defold), while there is almost NOTHING in it… except the tilemap and a few scripts.

Everything else is supposed to be created/spawned once the collection is loaded.

And I think it works as expected because, once the black screen is gone, you can see for like 1-2 frames the tilemap being created and the characters being spawned.

I see 2-3 remaining things that could be the source of the problem:

1/ The tilesource “dimensions” (number of tiles)
I have a feeling about this one…

=> When I switched from 32x32 tiles to 16x16 tiles, the tilesource image was smaller BUT the tilesource had the same number of “tiles”… this may explain why the collection didn’t load faster.
=> A few weeks ago, I removed elements from the tilesource image, made it smaller, and reduced the tilesource dimensions in the process… and the collection loading time was reduced by about 3-4s… At the time I thought it came from the tilesource image size, but in the end it may come from the tilesource itself (the number of “squares”)

So I’ll try with a smaller tilesource and… we’ll see. This is my last hope. :crossed_fingers:

2/ Monarch :sweat_smile: (I hope not)

3/ Defold (I hope even less :scream:)

Ooook, so I reduced my tilesource image (and thus my tilesource)

From this

to this
image

And the “black screen” duration also decreased from 6-7s to ~2s… I could probably reduce it even more… and more and more (until there remains almost nothing in the tilesource image)… but my main objective is reached => understand why it took soooo long to load a collection that looked empty.


1/ Did you know that the tilesource size had such an impact on the collection “black screen” loading? Isn’t it a major issue for Defold?

Or maybe tilemaps are just not supposed to be used as I did? (this is why I was asking if there were best practices related to tilesource sizes, formats, dimensions (ex: power of 2) etc.)

==> if so, I think it would be helpful for new users (or just people who never used a tilemap before) to know what tilemaps should and shouldn’t be used for. And not only the “how”.


2/ ALL my game is built around tilemaps for now (for pathfinding purposes at first, but I found other applications over time), but it looks like I can’t afford to have “large” tilesources (I didn’t even know that my tilesource was large), so…

What do you think I should do?

  • Option 1: spawn game objects based on tilemap data, instead of using tilemaps to represent everything in the game? I already do it to create my upgrade buttons, the bot formations in PvP battles and various other things… so I suppose it would be possible for the buildings etc.

  • Option 2: have several tilemaps in the same collection, but with smaller tilesources? (ex: 1 tilemap for the ground, another for the buildings, the 2nd one being 1 pixel higher than the 1st one)


What do you think?
Would you recommend a mix of game objects and tilemaps for optimization purposes?

Disclaimer: I actually have no idea if this would work, it’s just a theory. Since you have a test project set up, perhaps you could see if it works.

How about creating a dummy game object that references all the tilemaps you use in the game. Spawn this in your main menu collection using a factory set to load dynamically, but just hide the game object away (out of z range, for example). I’d load the factory after a small delay, so that the game doesn’t take a long time to launch.

I would think this should create a reference, which would then mean the tilemaps would not need to be loaded again when switching collections. Am I right, or am I fundamentally misunderstanding something here?

The drawback of course is that your game’s memory usage would go up.


This is really strange. I use pretty big tilesource images (also pixel art, but larger than the one you showed in the last post) along with many other things (atlases, audio), and loading times are really fast, from the blink of an eye to about 2 seconds. Maybe you could also measure your init functions and print the results? Maybe there are some bottlenecks. You can call os.clock() at the beginning and end of any init() function and inspect the times :wink:
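To make the os.clock() suggestion concrete, here is a minimal sketch. The `timed` helper and the `build_grid` stand-in are hypothetical names used only for illustration; they are not taken from the project discussed here. Note that os.clock() reports CPU time in seconds, which is usually good enough for spotting a heavy init().

```lua
-- Hypothetical helper: runs fn(...) once, prints how long it took,
-- and returns the elapsed CPU seconds plus fn's first result.
local function timed(label, fn, ...)
    local start = os.clock()
    local result = fn(...)
    local elapsed = os.clock() - start
    print(string.format("%s took %.4f s", label, elapsed))
    return elapsed, result
end

-- Stand-in for the work a script does in init(); replace with the real code.
local function build_grid(width, height)
    local grid = {}
    for y = 1, height do
        local row = {}
        for x = 1, width do
            row[x] = 0
        end
        grid[y] = row
    end
    return grid
end

local elapsed, grid = timed("build_grid", build_grid, 64, 64)
```

Wrapping individual functions this way, rather than the whole init(), makes it easier to see which block is responsible for most of the time.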


Thank you for your answers!

To be honest, I’ve been struggling with various performance issues for a while, and since the main ones are more or less fixed… I need to work on more fun stuff in the short term :grin: (like Spine animations or asynchronous multiplayer)


@Alex_8BitSkull:
When I get back to the “performance optimization” topic, I’ll give your theory a try :slight_smile: (but maybe @britzl and @Mathias_Westerdahl already know if it could work?)


@Pawel:
Does your game load instantly on html5/mobile? (I’m talking about the collection loading (black screen), not the initial loading that occurs in html5 builds, with the “Defold” logo)

How big is your tilesource image?

Do you use Monarch to load screens?

Anyway, I’ve added os.clock() in all my script init() (but at first sight I don’t see anything extravagant), thanks for the tips, these functions will be helpful to keep an eye on any potentially heavy function / code.

Btw, if you prepare an example, we can perhaps take a look at it.

Also, what computer specs are you using when running this HTML5 game?
What browser?

Have you tried bundling the game, and uploading it to something like itch.io just for testing?


@britzl @Mathias_Westerdahl

Before sharing anything and making you waste time, I wanted to run some tests by monitoring script loading times with os.clock etc.

Until now I was just printing them in the console, so I never really realized there was such a huuuuuuuuuge difference between Defold and HTML5:

Defold
image

HTML5
image

Looks like something’s really wrong (almost weird) here… I’ll have to investigate deeper in my tilemaps script, isolate functions/blocks, since some of them are probably heavier than they should be :sweat_smile: . . .

Note: the bigger the tilesource / tilesource image, the longer the collection loading time :thinking:




Just a bit of context (maybe it’ll help you identify something)

scene_controller (update)
This is where I dynamically load the tilemaps data (.lua files), with sys.load_resource etc.

tilemaps (init) - aka the “problematic” script
This is where I turn the .lua data into readable tables, where I build my data tables, pathfinding grids etc… Well, everything related to tilemaps and pathfinding. I use lua modules to share some of these tables/grids.




I would have 2 questions:

1/ Is it normal to have such a difference between “local” (Defold) and HTML5 in terms of collection loading times? (the scene_controller update is even faster on HTML5… while the tilemaps script seems to take an eternity)

2/ What is an acceptable value for this collection loading time? I set the “warning threshold” to 0.2s but I really don’t know.

Hard to say. Running Defold in a browser will be slower. No question about it.

This can be really time consuming if not done correctly. Pay attention to your code and make sure to not create and dispose of many short lived Lua tables for instance.
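As an illustration of the point about short-lived tables, here is a small sketch. The function names and the grid layout (1 = walkable) are assumptions made for the example, not code from the project in question. The first version builds five new tables on every call; the second reuses one offsets table created once.

```lua
-- Allocating version: creates five new tables per call, which become
-- garbage as soon as the caller is done with them.
local function neighbours_alloc(x, y)
    return { {x + 1, y}, {x - 1, y}, {x, y + 1}, {x, y - 1} }
end

-- Reusing version: the offsets table is created once, outside the function.
local OFFSETS = { {1, 0}, {-1, 0}, {0, 1}, {0, -1} }

-- Counts the walkable (== 1) neighbours of cell (x, y) without allocating.
local function count_walkable_neighbours(grid, x, y)
    local count = 0
    for i = 1, #OFFSETS do
        local off = OFFSETS[i]
        local row = grid[y + off[2]]
        if row and row[x + off[1]] == 1 then
            count = count + 1
        end
    end
    return count
end

local grid = {
    {1, 1, 1},
    {1, 0, 1},
    {1, 1, 1},
}
local centre = count_walkable_neighbours(grid, 2, 2)
local corner = count_walkable_neighbours(grid, 1, 1)
```

Called once, the two approaches cost about the same; the difference shows up when the function runs for every cell of every layer.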

Also remember that in HTML5 builds we use the slower standard Lua 5.1 and not the much faster LuaJIT

Thank you for your answer!

Agree, some parts of my code are not above suspicion… cf below :slight_smile:


This morning I finally isolated the small code portion that was problematic (in the tilemaps script), and really… this is just ONE (old) function but with tons of loops :slight_smile:

I created this function months ago and used it to convert the raw tilemap data into “easy-to-use” two-dimensional tables (one per layer). Super useful, but probably not elegant/optimized.

(I have a question about all these loops at the end but maybe you’ll already see obvious bad practices at this point?)



Anyway, this morning I’ve been proceeding with easy adjustments and changed the loading times from 7.5s to 1.8s (html5)

1/ I replaced the remaining tablelength() calls (from the time I was a complete beginner with Defold / Lua… :see_no_evil:) with the # operator (#lua_tileset_table replaced tablelength(lua_tileset_table))
=> huge gain (like -50% loading times… wow)

2/ I removed some layers from my tilemap (keeping the essential ones only)… to reduce the number of loops.

From
image To image

This second adjustment is like a sacrifice, I must say :disappointed_relieved: but for now, more layers also mean longer loading times, so…

At this point, even though I’d prefer to have instant collection loading times :grin: the result is acceptable.
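For readers wondering why swapping tablelength() for # made such a difference, here is a sketch. The body of tablelength() below is an assumption (the original helper is not shown in this thread); it is the common pairs-based version, which walks every entry on each call.

```lua
-- Assumed shape of a typical tablelength() helper: visits every entry,
-- so each call costs time proportional to the table size.
local function tablelength(t)
    local count = 0
    for _ in pairs(t) do
        count = count + 1
    end
    return count
end

local tiles = {}
for i = 1, 1000 do
    tiles[i] = i
end

local n_slow = tablelength(tiles) -- walks all 1000 entries
local n_fast = #tiles             -- built-in length operator
```

One caveat: # is only reliable for array-like tables with no holes (keys 1..n). For tables with string keys or gaps, a counting helper is still needed, ideally called once and stored in a local.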



The 4 loops seem to be the core problem, so maybe I’ll have to rethink the way I read and “reformat” the tilemap data… I don’t know.

But I’m wondering… Is there a way to optimize the loops themselves? (in terms of pure code)

I have 4 “for” inside each other but I don’t know if there is a more elegant/optimized way than just a brutal “for for for for”…

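It is good practice to cache a table length in a local when you need it more than once. Instead of this:

for i=1,#t do

you can write this:

local num = #t
for i=1,num do

Note that a numeric for loop evaluates its limit only once, before the loop starts, so for a single loop the difference is small. The saving becomes real when the length would otherwise be recomputed many times, for example in a while condition, at the top of an inner loop that is entered thousands of times, or through a helper function like tablelength().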

You also want to avoid indexing a table over and over. This is bad:

table_laters_final[i].data[j][k].tile_id ...
table_laters_final[i].data[j][k].walkable ...

Better to prepare a variable like this:

local tile = table_laters_final[i].data[j][k]
tile.tile_id ...
tile.walkable ...

There are probably more issues that I’m not thinking of at the moment. Check out this excellent thread by @dlannan :


This will give a very significant performance boost!

I’m also curious to learn why you are calling tonumber() on every value? What is in that lua_tileset_table? Why can’t you convert to numbers once and outside of that inner loop?
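A minimal sketch of what converting once could look like, assuming the raw values arrive as strings (the `raw` table below is made-up sample data, not the real lua_tileset_table):

```lua
-- Hypothetical raw data: tile ids stored as strings.
local raw = { "3", "17", "0", "42" }

-- Convert once, up front, into a table of numbers.
local ids = {}
for i = 1, #raw do
    ids[i] = tonumber(raw[i])
end

-- Later loops work on the numbers directly, with no tonumber() calls.
local sum = 0
for pass = 1, 3 do
    for i = 1, #ids do
        sum = sum + ids[i]
    end
end
```

The conversion cost is paid once per value instead of once per value per pass through the inner loop.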


Made a little test:

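    -- SETUP: build a 200 x 200 x 200 nested table of random numbers
    local num = 200
    local t = {}
    for i=1,num do
        t[i] = {}
        for j=1,num do
            t[i][j] = {}
            for k=1,num do
                t[i][j][k] = math.random()
            end
        end
    end

    -- SLOW: index from the root table on every access
    -- (socket.gettime() returns the current time in seconds)
    local slow = socket.gettime()
    for i=1,num do
        for j=1,num do
            for k=1,num do
                t[i][j][k] = t[i][j][k] + 1
            end
        end
    end
    slow = socket.gettime() - slow

    -- FAST: cache each nested table in a local before entering the next loop
    -- Note: tk is a local copy, so unlike the SLOW loop the incremented
    -- value is not written back to the table; this loop does slightly less work.
    local ti, tj, tk
    local fast = socket.gettime()
    for i=1,num do
        ti = t[i]
        for j=1,num do
            tj = ti[j]
            for k=1,num do
                tk = tj[k]
                tk = tk + 1
            end
        end
    end
    fast = socket.gettime() - fast

    -- Times are in seconds, rounded to 4 decimals
    print("Slow: ", math.floor(slow*10000)/10000 .. "s", "Fast: ", math.floor(fast*10000)/10000 .. "s", "Fast/Slow: ", math.floor(fast/slow*10000)/100 .. "%")
Slow: 0.968s Fast: 0.289s Fast/Slow: 29.85%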

Run it yourself here.


Wow, thank you for your answers! Very interesting and helpful.

Looks like there is a lot of room for improvement in my code, not only in this function.

This one alone reduces the collection loading time by 10-15% :ok_hand:

No particular reason… just because it worked that way. I was already very happy the whole thing worked as intended (except the “perf optimization” part which was not a topic at this time).

I’ll get it out of the inner loop!

Wow this one is insane :open_mouth:

I just tried to apply this logic to my table structures, and… haven’t succeeded yet :no_mouth: But I will give it another try as soon as I can!

Anyway, thank you again for these tips and best practices!


Well, if you can get rid of it in the loop it will be 9 fewer function calls each iteration. If the input is already a number it will be relatively fast, but if the inputs are indeed strings that you convert to numbers each iteration, removing the calls can mean a significant performance improvement!


I’m not sure what your script does, but I’d look into doing most of this offline, when building content.

Produce a data file that you can read as-is and that requires no processing at runtime.
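One possible shape for this, as a rough sketch only: a small build-time script does the expensive conversion and writes the result out as plain Lua source, so the game just loads a ready-made table. The `serialize_grid` function and the sample grid are hypothetical, and how the text gets written to and read from disk depends on your build pipeline.

```lua
-- Build-time step (hypothetical): turn a processed 2D grid of numbers
-- into Lua source text that evaluates to the same table.
local function serialize_grid(grid)
    local rows = {}
    for y = 1, #grid do
        rows[y] = "{" .. table.concat(grid[y], ",") .. "}"
    end
    return "return {" .. table.concat(rows, ",") .. "}"
end

local grid = { {1, 0, 1}, {0, 0, 1} }
local source = serialize_grid(grid)

-- Runtime step: compile the text and call it to get the table back.
-- loadstring exists in Lua 5.1/LuaJIT; newer Lua versions use load.
local load_chunk = loadstring or load
local loaded = load_chunk(source)()
```

With this split, the nested conversion loops run once on the development machine instead of on every collection load in the browser.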
