Draw calls and Defold


#1

I am by no means an expert in this field, but sometimes I think about performance. Here are some thoughts about draw calls in Defold.

Draw Calls

What is a draw call?

In a one liner: “A command that tells the GPU to render a certain set of vertices as triangles with a certain state (shaders, blend state and so on). A command like this is often more than one object.”

A batch is basically a set of multiple commands that share the same state.

If you are curious about a more in-depth explanation of rendering, I recommend Simon Trümpler’s videos and write-up “Render Hell” here: http://simonschreibt.de/gat/renderhell/

Collections and Game Objects

In world space, rendering is based on Z-order, and objects will batch together if they share the same state. These things will change the state of your objects:

• Component types (sprite, spine, particle)
• Different materials
• Different Atlas/textures (or absence thereof)
• Tinting (because for sprites the tint is passed to the shader as a constant, not as a vertex color)

There are also some other things that will or can break our batches if we are not careful. If you have multiple game objects on the same depth but with different component types, Defold will group on component type and also try to arrange them so that it works optimally in terms of Z order, but this could potentially break batching. It is also good practice not to use the same Z depth for several objects, because you can get z-fighting.

Components will not batch across collection proxies. That means that components that share the same state (textures and such) but come from different proxies will not be batched. However, components in collections added “from file” and collections spawned from collection factories will batch with each other if they share the same state. This includes gui: if you assemble your gui with collection proxies, it is important to note that they will not batch with each other. If you use this approach, use collection factories instead of proxies.

By default you can control the Z order between -1 and 1, with extremely low precision. To avoid working with such small numbers, it is recommended to copy the default.render_script and increase the projection depth.

Tinting breaks batching, but components that have the exact same tinting will batch.
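
As a rough mental model of the rules in this section, the Python sketch below counts draw calls for a list of components drawn back to front. It is not how Defold is implemented: `Component` and `count_draw_calls` are made-up names, the "world" field stands for which proxy-loaded collection a component lives in, and the model ignores the regrouping Defold does for components on the same depth.

```python
from collections import namedtuple

# Hypothetical description of a component; this is not a Defold type.
Component = namedtuple("Component", "z world type material texture tint")

def count_draw_calls(components):
    """Count state changes when drawing back to front (simplified model)."""
    calls = 0
    previous_state = None
    for c in sorted(components, key=lambda c: c.z):
        state = (c.world, c.type, c.material, c.texture, c.tint)
        if state != previous_state:
            # The state differs from the previous component: a new batch starts.
            calls += 1
            previous_state = state
    return calls

white = (1, 1, 1, 1)
red = (1, 0, 0, 1)
scene = [
    Component(0.1, "main", "sprite", "sprite", "atlas_a", white),
    Component(0.2, "main", "sprite", "sprite", "atlas_a", white),  # batches
    Component(0.3, "main", "sprite", "sprite", "atlas_a", red),    # tint differs
    Component(0.4, "main", "sprite", "sprite", "atlas_a", red),    # same tint, batches
    Component(0.5, "proxy", "sprite", "sprite", "atlas_a", red),   # other proxy world
]
print(count_draw_calls(scene))  # 3
```

In this model, interleaving two atlases along Z (a, b, a, b) would cost four draw calls, while grouping them (a, a, b, b) would cost two.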

GUI

If you want more information, read the Defold documentation. In short, a batch breaks on a change of node type, blend mode, texture, font or stencil (clipping mode) settings. Objects in the GUI will only be batched together if they come in sequence, meaning that they need to be next to each other in the node list (outliner).
What this means is that if you have a node list with only three nodes, “Box - Text - Box”, the text node will break the batch even though both box nodes use the same texture and blend mode. We can fix this by using layers.
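
The “Box - Text - Box” case can be sketched in a few lines of Python. This is only an illustration of the "neighbours batch, anything else breaks" rule, not Defold code; the tuple layout and the name `count_gui_draw_calls` are assumptions made for the example.

```python
from itertools import groupby

# Each node is (node_type, texture_or_font, blend_mode, clipping).
# The field layout and names are illustrative only.
box = ("box", "ui_atlas", "alpha", None)
text = ("text", "main_font", "alpha", None)

nodes = [box, text, box]

def count_gui_draw_calls(node_list):
    """Nodes only batch with their direct neighbours in render order."""
    # groupby() starts a new group every time the state changes.
    return sum(1 for _ in groupby(node_list))

print(count_gui_draw_calls(nodes))             # 3
print(count_gui_draw_calls([box, box, text]))  # 2
```

The second call shows what a different render order buys: the same three nodes cost two draw calls once the boxes are neighbours.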

GUI nodes in different scenes DO NOT batch with each other. This means that if you have two different scenes and import them into a collection, the nodes of one scene will not batch with the nodes of the other. It could therefore be preferable to import a lot of scenes as templates into a parent scene to make them batch.

Box nodes (sprites) are rendered even if they do not have a texture, have alpha 0 or have size 0,0,0. They should always have a texture assigned. If you must have an empty node, then set the size to 0, 0, 0 and assign a texture to it.

As mentioned before, the properties Blend Mode and Clipping Mode will break batching, and different node types will also break batching. This means that it is safe to use colouring, slice 9 and other such properties. Even though tinting breaks batching on sprites, tinting sprites (box nodes) in gui will not break batching.

Layers

Layers are rendered in the same logical order as the node list: things are rendered top-down. The first thing will be rendered first, then the second and so on. This means that the first element will be furthest back. One of the most common things that breaks a batch is a change of atlas/texture or font. I therefore name my layers after the atlas or font they use; this way, if I use the gui scene as a template, I know that if the parent scene has similar atlases they will easily combine. Speaking of templates, when you import a template into a GUI scene, the layer entries don’t follow along. This means that you also have to add the layers to the parent scene.

Once you have started using layers, you can’t just stop. If you add an object to the scene that doesn’t have a layer, it will not behave like everything else: it will belong to the “null” layer and will be drawn before everything else. The render order is decided first by the layers and then, within each layer, by the hierarchy. This means the node list is no longer entirely what-you-see-is-what-you-get, but the trade-off is worth it.
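
The ordering rule (layer first, then position in the node list) can be modelled with a stable sort. The Python sketch below is a simplified model, not Defold's implementation: the node names and the `LAYERS` list are hypothetical, the hierarchy is flattened into a plain list, and `None` stands for the “null” layer.

```python
from itertools import groupby

# Layer list of the scene, in render order. None stands for the "null" layer.
LAYERS = [None, "ui_atlas", "main_font"]

# (node id, layer, batch state) in node list order; all names are hypothetical.
outline = [
    ("panel", "ui_atlas", ("box", "ui_atlas")),
    ("title", "main_font", ("text", "main_font")),
    ("button", "ui_atlas", ("box", "ui_atlas")),
    ("forgotten", None, ("box", "ui_atlas")),  # no layer assigned
]

def render_order(nodes):
    # sorted() is stable, so nodes within a layer keep their node list order.
    return sorted(nodes, key=lambda n: LAYERS.index(n[1]))

ordered = render_order(outline)
print([n[0] for n in ordered])  # ['forgotten', 'panel', 'button', 'title']

# Count batches: a new one starts whenever the state of neighbours differs.
draw_calls = sum(1 for _ in groupby(ordered, key=lambda n: n[2]))
print(draw_calls)  # 2
```

Without layers, the same node list would render as box, text, box, box and cost three draw calls in this model. Note how the node without a layer jumps to the very back even though it is last in the list.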

Defold has a cap of 16 layers, which means that you can use 15 layers (1 is for the “null” layer). If you use more than 16 layers you will get `ERROR:GAMESYS: The layer ‘layer_name’ could not be set for the ‘node’, result: -5.`

Layers with stencils

Stencils break batching quite severely and should be avoided as much as possible. If you have a stencil in a template where the scene normally takes 5 draw calls and you then use that template 5 times, the draw calls will be 25, as stencils prevent the templates from batching with each other. Stencil behaviour with layers is odd, so I am not sure, but these seem to be the rules.

A stencil’s layer only matters for its own children. Assigning a layer to a stencil that is rendered after its children’s layers (that is, a layer further down in the list than those of its children) will make it render after them, as expected. But in the context of all other nodes, stencils are always rendered first (in the “null” layer), regardless of the layer assigned. This means that the layer of its parent (and of all its ancestors) must be the “null” layer (i.e. no layer assigned). The layer order doesn’t seem to affect this behaviour at all.

All the other

Draw calls are important, but there are also a few other things worth mentioning. I may cover some of this in later posts.

• Only have things in the GUI scene that you need. If you have excess textures, fonts or spine scenes, they will still be imported into the scene even if they are not actually used. By being imported they will consume memory.
• Overdraw is a big issue. Overdraw often has a huge impact on your performance, so keep it to a minimum. Create an overdraw shader to keep track of it; an overdraw shader is also very handy for debugging when things go missing behind other stuff.
• Spine meshes are heavy. In general, spine models don’t have a bigger impact on your performance than sprites would, as long as you are not using meshes. Meshes are, comparatively, a massive hit on performance.
• Texture usage is important. You should always minimise the number of textures you have in memory. Read more about texture management.
• Always use compression. Compression decreases package size and can decrease memory usage. WebP is small on disk and still looks good, use it! If you are low on memory and don’t have images with alpha, use hardware compression.


Hope it was a good read, let me know if there are any mistakes or if you have questions.

Edit 05/01/18: Updated section about layers and stencils.
Edit 01/02/17: Clarified a bit how batching between collections works. Thanks @sicher
Edit 24/07/17: Removed section about particle effect behaving differently as particles now batch


#2

Brilliant write up! Thanks for sharing Mattias!


#3

Clarification: collections do not affect batching. However, if you load a collection via proxy it will create a new world (another “main” or “root” collection). That will break batching. So two components with the same texture that are in different worlds will take one drawcall each.


#4

Brilliant post! Thank you!


#5

Thank you. I’ve used Render Hell before and it’s a really nice tutorial with cool animations.

By the way, are you planning to add a virtual container to the gui? That is, a box with no content. I use them to align internal elements; currently I set alpha to 0 for them.
Another related issue with transparent pixels is slice9 textures: would it be possible to add an option “do not show center area”?


#6

@sicher I can’t find the information above in the manuals.
Maybe it would make sense to add it? I think it would be very useful.
What do you think?


#7

Yes! I’ll make a note.


#8

This definitely cleared some things up for me, thank you!


#9

As of Defold 1.2.106 this is no longer true. Instead of issuing one drawcall per emitter, we now batch on:

  • Emitter material
  • Texture (tile source)
  • Blendmode
  • Render constants

#10

@britzl wrote a nice summary in How to "merge" sprites in defold. Here is a blatant copy of it!


To clarify:

Rendering of sprites, spine models, particle fx etc

  • Rendering is based on z-order, back to front.
  • Components on different depths will be batched unless one of the following is different from the previous component:
    • Component type (sprite, spine, particle, label, model)
    • Texture
    • Material
    • Blend mode, tint etc
    • Collection proxies
    • Note: Each particle emitter will result in a draw call

Rendering of gui scenes

  • Rendering is based on the order of the nodes in the outline, depth first
  • Nodes will be batched unless one of the following is different from the previous node:
    • Node type (box, text, pie)
    • Texture
    • Blend mode
    • Font
    • Stencil settings

Layers are used to group different nodes to reduce the number of draw calls. Some images:




Example of the complexity a stencil adds

A basic scene

Same scene with a stencil


#11

Can somebody explain why?


#12

In gui the color is sent as vertex attributes as opposed to uniforms (I’m quoting Sven here).


#13

Thank you, that is what I thought. And I did look into the shaders before asking =)


#14

And why is that? You could have the possibility to set vertex attributes for sprites as well to be used for tinting (or other things) in the sprite shader as well.
Or do you use them internally for something?


#15

Good question! @sven?


#16

Could you post your overdraw shaders?


#17

If we could get the ability to set tinting / alpha with sprites in a way which didn’t break batching it would be super great.


#18

The answer to that is customizable vertex formats, which I think is on our roadmap for the second half of this year. (@britzl knows more)


#19

The ones I use are available in DefFX


#20

Agreed, sprite tinting is required to create retro lighting like in the old X-Com, for example:
[image: ufo_041]