I really appreciate this. I’m a huge fan of making retro-like games and utilising the GPU to re-create retro effects, so we can avoid taxing the CPU and slowing our games down for an effect that was actually very efficient on the original hardware. I believe this is exactly the task the GPU is meant for in this case - pretending to be the graphics chip in the old video game systems.
I’m also a fan of that article you linked to; it seems like the most efficient way to recreate this effect.
And the fact that you even included a Python script to convert the images and create the palettes really makes this a complete “paletted workflow” for Defold.
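For anyone reading along who wonders what that conversion step boils down to, here is a minimal sketch of the idea. To be clear, this is not the script from the post; `to_indexed` is a hypothetical name, and I'm assuming the image is already available as a flat list of RGB tuples rather than loaded from a file.

```python
def to_indexed(pixels):
    """Split a flat list of RGB tuples into (palette, indices).

    Hypothetical helper: colours are added to the palette in the order
    they are first seen, and each pixel is replaced by its palette slot.
    """
    palette = []   # unique colours, in first-seen order
    lookup = {}    # colour -> palette slot
    indices = []   # one palette slot per pixel
    for color in pixels:
        if color not in lookup:
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices


pixels = [(0, 0, 0), (255, 0, 0), (0, 0, 0), (255, 255, 255)]
palette, indices = to_indexed(pixels)
# Looking each index up in the palette gives back the original pixels.
restored = [palette[i] for i in indices]
```

Swapping in a different palette of the same length and doing the same lookup is the palette swap; the shader does that lookup per pixel on the GPU.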
Yes, but breaking draw call batching between identical sprites that use a different palette index is a very worthwhile trade-off for the time and effort it saves, IMO.
I once went deeper into this (using Kha, and with the help of the friendly team there, especially its creator Rob), and went the whole way of making sure each quad can be batched even if it uses a different palette index, by including the palette index in the vertex data. You end up including this value in all 4 verts, but it’s the most hardware-friendly way to do it. However, this requires you to go so low-level that it circumvents the normal material workflow usually set up in game engines. E.g. in Unity you’d have to make your own vertex buffer format. Getting this deep into the weeds means you miss out on using a built-in feature of the game engine to save you time.
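To make the per-vertex idea concrete, here is a rough sketch of what the vertex data could look like. The layout (position, UV, then the palette index stored as a float) and the names `VERTEX_FORMAT`, `quad_vertices` and `pack_batch` are assumptions for illustration, not what Kha or any particular engine actually uses.

```python
import struct

# Hypothetical vertex layout: x, y, u, v, palette_index as five 32-bit floats.
VERTEX_FORMAT = "<5f"
VERTEX_SIZE = struct.calcsize(VERTEX_FORMAT)  # 20 bytes per vertex


def quad_vertices(x, y, w, h, palette_index):
    """Return the four vertices of a sprite quad.

    The same palette index is repeated on every corner, which is the
    redundancy mentioned above.
    """
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    return [
        (x + cx * w, y + cy * h, cx, cy, float(palette_index))
        for cx, cy in corners
    ]


def pack_batch(sprites):
    """Pack many quads into one buffer so they could share a single draw call."""
    buf = bytearray()
    for x, y, w, h, palette_index in sprites:
        for vertex in quad_vertices(x, y, w, h, palette_index):
            buf += struct.pack(VERTEX_FORMAT, *vertex)
    return bytes(buf)


# Two sprites with different palettes end up in the same buffer.
batch = pack_batch([(0, 0, 16, 16, 0), (16, 0, 16, 16, 3)])
```

Because the palette index travels with each vertex instead of being a per-material constant, the two sprites no longer need separate draw calls. The cost is exactly the custom vertex format described above.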
And I believe for not much benefit too. It would be nice to think that all sprites that use the same texture are being batched even if they have different palettes, but what kind of performance are we talking about saving here? It’s a 2D game. Even if you have hundreds, or even thousands, of sprites and only the ones with the same palette can be batched, that performance hit could not be anything of note; the game will already be making many more draw calls than that just to draw all the other kinds of things in it.
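If you want to put rough numbers on it, here is a small back-of-the-envelope sketch. It assumes a simplified rule where a batch breaks whenever the batching key changes between consecutive sprites in draw order; real engines have more going on, and `count_draw_calls` is just a hypothetical helper.

```python
from itertools import groupby


def count_draw_calls(sprites, key):
    """Count batches, assuming consecutive sprites with an equal key share one."""
    return sum(1 for _ in groupby(sprites, key=key))


# 1000 sprites on one atlas, cycling through 4 palettes in draw order.
sprites = [("atlas", i % 4) for i in range(1000)]

# Palette is part of the batch key and draw order is unsorted: worst case.
unsorted_calls = count_draw_calls(sprites, key=lambda s: s)

# Same key, but the sprites happen to be drawn grouped by palette.
grouped_calls = count_draw_calls(sorted(sprites, key=lambda s: s[1]), key=lambda s: s)

# Palette carried per vertex, so only the texture breaks batches.
per_vertex_calls = count_draw_calls(sprites, key=lambda s: s[0])
```

Under these assumptions the three counts come out as 1000, 4 and 1, so how much batching you actually lose depends heavily on how the sprites are ordered when they are drawn.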
With this shader technique, we have gone as far as we need to; we’ve saved ourselves and the computer practically all of the work that matters. We have used shaders to achieve the result, and avoided a more cumbersome approach that would require making multiple copies of the same texture plus a system to manage them, which could potentially become an actual RAM hog on something like the Nintendo Switch.
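As a rough illustration of the memory side, here is a quick estimate. The assumptions are mine: uncompressed textures, 4 bytes per pixel for RGBA, 1 byte per pixel for the index texture, and 16 colours per palette. The function names are hypothetical, and the actual formats used by the technique may differ.

```python
def rgba_copies_bytes(width, height, palette_count):
    """Memory for one full RGBA copy of the texture per palette."""
    return width * height * 4 * palette_count


def paletted_bytes(width, height, palette_count, colors_per_palette=16):
    """Memory for one 8-bit index texture plus a small RGBA palette table."""
    index_texture = width * height  # one byte per pixel
    palette_table = palette_count * colors_per_palette * 4
    return index_texture + palette_table


# A 1024x1024 atlas with 8 palette variants.
copies = rgba_copies_bytes(1024, 1024, 8)   # 33554432 bytes (32 MiB)
paletted = paletted_bytes(1024, 1024, 8)    # 1049088 bytes (about 1 MiB)
```

Each extra palette costs 64 bytes in the paletted setup versus another 4 MiB for a full copy, which is why the multiple-copies approach gets expensive quickly.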