I am by no means an expert in this field, but sometimes I think about performance. Here are some thoughts about draw calls in Defold.
Draw Calls
What is a draw call?
In a one liner: “A command that tells the GPU to render a certain set of vertices as triangles with a certain state (shaders, blend state and so on). A command like this is often more than one object.”
“Batch” is a term that basically means multiple commands that share the same state.
If you are curious about a more in-depth explanation of rendering, I recommend Simon Trümpler’s videos and write-up “Render Hell” here: http://simonschreibt.de/gat/renderhell/
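To make the idea of a batch concrete, here is a minimal Python sketch. It is only a toy model, not how Defold or a GPU driver is implemented: each command is reduced to a hypothetical (material, texture) pair, and every run of consecutive commands with the same pair counts as one draw call.

```python
from itertools import groupby


def count_draw_calls(states):
    """Count runs of consecutive equal states; each run is one batch."""
    return sum(1 for _ in groupby(states))


# Hypothetical render states, reduced to (material, texture) pairs.
commands = [
    ("sprite.material", "atlas_a"),
    ("sprite.material", "atlas_a"),
    ("sprite.material", "atlas_b"),
    ("sprite.material", "atlas_a"),
]

print(count_draw_calls(commands))
```

The single `atlas_b` command in the middle splits what could have been one batch into three.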
Collections and Game Objects
In world space, rendering is based on Z-order, and objects will batch together if they share the same state. These things will change the state of your objects:
• Component types (sprite, spine, particle)
• Different materials
• Different Atlas/textures (or absence thereof)
• Tinting (because this is done in the shader as vertex color)
There are also some other things that can break batches if we are not careful. If you have multiple game objects at the same depth but with different component types, Defold will group them by component type and try to arrange them optimally in terms of Z order, but this could potentially break batching. It is also good practice not to reuse the same Z depth, because you can get z-fighting.
Components do not batch across collection proxies: components that share the same state (textures and such) but come from different proxies will not be batched. However, components in collections added “from file” and collections spawned from collection factories will batch with each other if they share the same state. This includes GUI: if you assemble your GUI with collection proxies, note that the pieces will not batch with each other. If you use this approach, use collection factories instead of proxies.
By default you can control the Z order between -1 and 1 with extremely low precision. To avoid working with such small numbers, it is recommended to copy the default.render_script and increase the projection depth.
Tinting breaks batching, but components that have the exact same tint will batch.
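The rules in this section can be sketched as a small Python model. This is a simplification with made-up names (`Component`, `draw_calls`), not Defold's actual renderer: components are sorted by Z, and consecutive components batch only if component type, material, texture and tint all match. It ignores the grouping Defold does for components at the same depth.

```python
from itertools import groupby
from typing import NamedTuple


class Component(NamedTuple):
    z: float
    kind: str        # "sprite", "spine", "particle", ...
    material: str
    texture: str
    tint: tuple = (1.0, 1.0, 1.0, 1.0)


def state(c):
    """Everything that must match for two components to batch in this model."""
    return (c.kind, c.material, c.texture, c.tint)


def draw_calls(components):
    ordered = sorted(components, key=lambda c: c.z)  # back to front
    return sum(1 for _ in groupby(ordered, key=state))


# Two atlases interleaved in Z: every component ends up in its own batch.
interleaved = [
    Component(0.1, "sprite", "sprite.material", "atlas_a"),
    Component(0.2, "sprite", "sprite.material", "atlas_b"),
    Component(0.3, "sprite", "sprite.material", "atlas_a"),
    Component(0.4, "sprite", "sprite.material", "atlas_b"),
]

# Same components, but each atlas occupies its own Z range.
grouped = [
    Component(0.1, "sprite", "sprite.material", "atlas_a"),
    Component(0.2, "sprite", "sprite.material", "atlas_a"),
    Component(0.3, "sprite", "sprite.material", "atlas_b"),
    Component(0.4, "sprite", "sprite.material", "atlas_b"),
]

# Tinting one sprite red splits the atlas_a batch.
tinted = [grouped[0], grouped[1]._replace(tint=(1.0, 0.0, 0.0, 1.0)), grouped[2], grouped[3]]

print(draw_calls(interleaved), draw_calls(grouped), draw_calls(tinted))
```

In this model the interleaved layout costs four draw calls, the grouped layout two, and the tinted variant three.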
GUI
If you want more information, read the Defold documentation. In short, batches break on changes in node type, blend mode, texture, font or stencil (clipping mode) settings. Objects in the GUI will only be batched together if they are in sequence, meaning that they need to be next to each other in the node list (outliner).
What this means is that if you have a node list with only three nodes, “Box - Text - Box”, the text node will break the batch even though both box nodes use the same texture and blend mode. We can fix this by using layers.
GUI nodes in different scenes DO NOT batch with each other. If you have two different scenes and add them to a collection, the nodes of one scene will not batch with the nodes of the other. It can therefore be preferable to import several scenes as templates into a parent scene to make them batch.
Box nodes (sprites) are rendered even if they do not have a texture, have alpha 0 or have size 0, 0, 0. They should always have a texture assigned. If you must have an empty node, then set the size to 0, 0, 0 and assign a texture to it.
As mentioned before, of the node properties only Blend Mode and Clipping Mode will break batching; different node types will also break it. This means that it is safe to use colouring, slice 9 and other such properties. Even though tinting breaks batching on sprites, tinting in GUI will not break batching.
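The “Box - Text - Box” case can be sketched in Python. This is a toy model under my own assumptions, not Defold's implementation: nodes are dicts with hypothetical keys, the batch key is reduced to node type and texture/font (blend mode and clipping are left out), and nodes without a layer sort into a “null” layer at index 0.

```python
from itertools import groupby


def gui_draw_calls(nodes, layer_order=()):
    """Sort nodes by layer (stable, so node list order is kept within a
    layer), then count runs of nodes with the same type and resource."""
    rank = {name: i + 1 for i, name in enumerate(layer_order)}  # 0 = "null" layer
    ordered = sorted(nodes, key=lambda n: rank.get(n["layer"], 0))
    return sum(1 for _ in groupby(ordered, key=lambda n: (n["type"], n["resource"])))


# Node list order: Box - Text - Box, no layers assigned.
no_layers = [
    {"type": "box", "resource": "main_atlas", "layer": None},
    {"type": "text", "resource": "main_font", "layer": None},
    {"type": "box", "resource": "main_atlas", "layer": None},
]

# Same node list, but each node is assigned a layer named after its resource.
with_layers = [
    {"type": "box", "resource": "main_atlas", "layer": "main_atlas"},
    {"type": "text", "resource": "main_font", "layer": "main_font"},
    {"type": "box", "resource": "main_atlas", "layer": "main_atlas"},
]

print(gui_draw_calls(no_layers))
print(gui_draw_calls(with_layers, layer_order=("main_atlas", "main_font")))
```

Without layers the text node sits between the boxes and the model gives three draw calls; with layers both boxes are drawn first and the count drops to two.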
Layers
Layers are rendered in the same logical order as the node list: top-down. The first layer is rendered first, then the second and so on, which means that the first layer ends up furthest back. One of the most common things that breaks a batch is a change of atlas/texture or font. I therefore name my layers after the atlas or font they are used for; this way, if I use the GUI scene as a template, I know that nodes will easily combine if the parent scene has matching atlases. Speaking of templates, when you import a template into a GUI scene the layer entries don’t follow along. This means that you also have to add the layers to the parent scene.
Once you have started using layers, you can’t just stop. If you add an object to the scene that doesn’t have a layer, it will not behave like everything else: it will belong to the “null” layer and will be drawn before everything else. The render order is first by layer, and within each layer by hierarchy. This means the node list is no longer entirely what-you-see-is-what-you-get, but the trade-off is worth it.
Defold has a cap of 16 layers, which means that you can use 15 layers (1 is for the “null” layer). If you use more than 16 layers you will get `ERROR:GAMESYS: The layer ‘layer_name’ could not be set for the ‘node’, result: -5.`
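The “null” layer gotcha is easy to see in a small Python sketch. The node and layer names are made up, and the model only captures the ordering rule described above (layer first, then node list order), nothing else about Defold's GUI.

```python
def draw_order(nodes, layer_order):
    """Return node names in draw order: by layer first, then node list order.
    Nodes without a layer fall into the "null" layer (index 0)."""
    rank = {name: i + 1 for i, name in enumerate(layer_order)}
    ordered = sorted(nodes, key=lambda n: rank.get(n[1], 0))  # stable sort
    return [name for name, _layer in ordered]


# (node name, layer) in node list order; "popup" was added last, without a layer.
nodes = [
    ("background", "bg"),
    ("icon", "icons"),
    ("popup", None),
]

print(draw_order(nodes, layer_order=("bg", "icons")))
```

Although `popup` is last in the node list, it is drawn first and therefore ends up behind both `background` and `icon`.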
Layers with stencils
Stencils break batching quite severely and should be avoided as much as possible. If you have a stencil in a template whose scene normally takes 5 draw calls and you use that template 5 times, you will get 25 draw calls, because stencils prevent batching between scenes. The behaviour of stencils with layers is odd, so I am not sure, but these seem to be the rules.
A stencil’s layer only matters for its own children: assigning the stencil a layer that is rendered after its children’s layer (i.e. a layer further down in the list) will make it render after them, as expected. But in the context of all other nodes, stencils are always rendered first (in the “null” layer) regardless of the layer assigned. This means that the layer of its parent (and of all its ancestors) must be the “null” layer (i.e. no layer assigned). The layer order doesn’t seem to affect this behaviour at all.
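The draw call arithmetic for templates with stencils can be written out as a tiny Python sketch. The function name is hypothetical, and the “batches across instances” case is a best-case assumption where every batch in one instance merges with its counterpart in the others.

```python
def total_draw_calls(calls_per_template, instances, batches_across_instances):
    """Estimate draw calls when the same template is used several times."""
    if batches_across_instances:
        # Best case (assumption): matching batches merge across all instances.
        return calls_per_template
    # With a stencil, each instance keeps its own draw calls.
    return calls_per_template * instances


print(total_draw_calls(5, 5, batches_across_instances=False))
print(total_draw_calls(5, 5, batches_across_instances=True))
```

With the numbers from the example above, the stencil version costs 25 draw calls where the best case would stay at 5.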
All the other
Draw calls are important, but there are also a few other things worth mentioning. I may cover some of these in later posts.
Only keep the things you need in the GUI scene. If you have surplus textures, fonts or spine scenes, they will still be imported into the scene even if they are not actually used, and by being imported they will consume memory.
Overdraw is a big issue. It often has a huge impact on performance, so keep it to a minimum. Create an overdraw shader to keep track of it; an overdraw shader is also very handy for debugging when things go missing behind other stuff.
Spine meshes are heavy. In general, Spine models don’t have a bigger impact on performance than sprites would, as long as you are not using meshes. Meshes are, comparatively, a massive hit on performance.
Texture usage is important. You should always minimise the number of textures you have in memory. Read more about texture management.
Always use compression. Compression decreases package size and can decrease memory usage. WebP is small on disk and still looks good, use it! If you are low on memory, use hardware compression, if your images don’t have alpha.
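To get a feel for why hardware compression matters for memory, here is a rough back-of-the-envelope sketch in Python. The numbers are my own assumptions, not measurements from Defold: an uncompressed RGBA texture at 32 bits per pixel, compared with a 4 bits-per-pixel hardware format without alpha (ETC1 is one such format), ignoring mipmaps and any padding.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Approximate GPU memory for one texture, without mipmaps."""
    return width * height * bits_per_pixel // 8


uncompressed = texture_bytes(1024, 1024, 32)  # RGBA, 8 bits per channel
compressed = texture_bytes(1024, 1024, 4)     # assumed 4 bpp hardware format

print(uncompressed // 1024, "KiB")
print(compressed // 1024, "KiB")
```

Under these assumptions a 1024x1024 atlas goes from 4096 KiB to 512 KiB. Note that WebP only helps on disk in this model; the texture is still uncompressed once it is in memory.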
Hope it was a good read, let me know if there are any mistakes or if you have questions.
Edit 05/01/18: Updated section about layers and stencils.
Edit 01/02/17: Clarified a bit how batching between collections works. Thanks @sicher
Edit 24/07/17: Removed section about particle effect behaving differently as particles now batch