i am experimenting with greedy meshing in a voxel engine built on defold.
one issue i noticed: in closed spaces where lighting is uniform, greedy meshing generates very large faces, as expected, because all blocks share the same light value and texture. however, when neighbouring blocks have different light or ambient occlusion values, their faces are not merged even when the texture is the same, so the surface splits into many small quads. you can see the issue here: https://www.reddit.com/r/VoxelGameDev/comments/1rks9nt/my_greddy_mesh_divide_on_ambient_occlusion_and/
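to make the splitting concrete, here is a minimal sketch of a greedy merge over one 2d slice of faces. it is not my engine code: `greedy_quads` and the mask layout are made up for illustration, and the only assumption is that two faces merge when their merge key is equal. with the key set to the texture alone the whole slice becomes one quad; once the light value is part of the key, a light gradient cuts it into one quad per column.

```python
from typing import Hashable, List, Optional, Tuple

# (x, y, width, height, key) for one merged rectangle
Quad = Tuple[int, int, int, int, Hashable]


def greedy_quads(mask: List[List[Optional[Hashable]]]) -> List[Quad]:
    """Merge cells with equal keys into rectangles (None = no face)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    used = [[False] * width for _ in range(height)]
    quads: List[Quad] = []
    for y in range(height):
        for x in range(width):
            key = mask[y][x]
            if key is None or used[y][x]:
                continue
            # Grow to the right while the key matches.
            w = 1
            while x + w < width and not used[y][x + w] and mask[y][x + w] == key:
                w += 1
            # Grow downward while the whole next row matches.
            h = 1
            while y + h < height and all(
                not used[y + h][x + i] and mask[y + h][x + i] == key
                for i in range(w)
            ):
                h += 1
            # Mark the covered cells so they are not emitted again.
            for dy in range(h):
                for dx in range(w):
                    used[y + dy][x + dx] = True
            quads.append((x, y, w, h, key))
    return quads


# Key = texture only: uniform wall, everything merges.
texture_only = [["stone"] * 4 for _ in range(4)]

# Key = (texture, light): light falls off along x, so columns cannot merge.
with_light = [[("stone", 15 - x) for x in range(4)] for _ in range(4)]

print(len(greedy_quads(texture_only)))  # 1
print(len(greedy_quads(with_light)))    # 4
```

this is why moving light and ao out of the merge key (into some buffer the shader can read) is attractive: the key shrinks back to the texture and the large quads return.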
i saw some voxel engines solve this by:
- using separate buffers for per-block lighting
- gpu quad buffers with vertex pulling
- rle-encoded block data read in the fragment shader
- or baking lighting into textures
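as a rough illustration of the texture option, here is a sketch of how one chunk's light values could be laid out in a small 2d texture so a shader can look them up by block position. everything here is an assumption on my side: the 64×64 single-channel layout, the 0..15 light range, and the names `voxel_to_texel` and `build_light_texture` are hypothetical, and uploading the bytes to the gpu is engine-specific and not shown.

```python
CHUNK = 16   # blocks per chunk axis
TEX_W = 64   # 64 * 64 = 4096 texels, one per block (assumed layout)


def voxel_to_texel(x: int, y: int, z: int) -> tuple:
    """Map a block position inside the chunk to a (u, v) texel."""
    index = x + CHUNK * (z + CHUNK * y)  # linear index, x fastest
    return index % TEX_W, index // TEX_W


def build_light_texture(light_at) -> bytearray:
    """Fill a 64x64 one-byte-per-texel image from a light lookup.

    light_at(x, y, z) is assumed to return a light level in 0..15.
    """
    pixels = bytearray(TEX_W * TEX_W)
    for y in range(CHUNK):
        for z in range(CHUNK):
            for x in range(CHUNK):
                u, v = voxel_to_texel(x, y, z)
                # Scale 0..15 to 0..255 so the shader reads 0.0..1.0.
                pixels[v * TEX_W + u] = light_at(x, y, z) * 17
    return pixels


# Example: light decreases with depth (y), fully lit at the top layer.
pixels = build_light_texture(lambda x, y, z: y)
```

the shader would repeat the same index math to turn a block coordinate into a texel coordinate, which is why the mapping is kept as a separate function.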
however defold seems to have limited gpu buffer access compared to many opengl based engines.
so i am wondering:
is there any way in defold materials to access additional gpu buffers besides vertex attributes and uniforms?
for example something similar to:
- structured buffers / ssbo
- per-instance buffers
- or custom vertex pulling (see the "Vertex Pulling" page on Voxel.Wiki)
i also considered storing per-block lighting data in uniforms. for example, a 16×16×16 chunk needs 4096 values. at 4 bytes per value that is 16 kb per chunk, which sounds reasonable at first.
however, uniforms have several limitations. the size of a uniform block is capped by the gpu and driver: the opengl spec only guarantees 16 kb (GL_MAX_UNIFORM_BLOCK_SIZE), and ~64 kb is a common limit in practice. uniforms are also not really designed for large, randomly indexed datasets: you technically can index uniform arrays, but they are intended for small constant data rather than large per-block buffers.
another issue is that if i stored lighting in uniforms per chunk, i would likely need a separate material or uniform update for each chunk. even if the size stays under the ~64 kb limit, that would increase draw calls and uniform uploads significantly once many chunks are visible.
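the size arithmetic can be sketched quickly. this is only back-of-the-envelope math: the 64 kb budget is the rough figure mentioned above (real limits vary by gpu and driver), and the 8-bit and 4-bit packings are assumptions about how compactly light could be stored, not something i have implemented.

```python
CHUNK_SIZE = 16
BLOCKS_PER_CHUNK = CHUNK_SIZE ** 3   # 4096 blocks
BUDGET_BYTES = 64 * 1024             # assumed ~64 kb uniform budget


def chunk_bytes(bits_per_value: int) -> int:
    """Bytes needed to store one light value per block in a chunk."""
    return BLOCKS_PER_CHUNK * bits_per_value // 8


def chunks_in_budget(bits_per_value: int) -> int:
    """How many whole chunks fit in the assumed uniform budget."""
    return BUDGET_BYTES // chunk_bytes(bits_per_value)


for bits in (32, 8, 4):
    print(bits, chunk_bytes(bits), chunks_in_budget(bits))
# 32 16384 4
# 8 4096 16
# 4 2048 32
```

so even with tight packing, only a few dozen chunks of light data fit in that budget at once, which is far below a typical view distance.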
another idea i considered is gpu instancing: rendering cubes as instances and passing per-cube data through an instance buffer (see "Instancing" on LearnOpenGL). this would allow rendering many cubes in one draw call without greedy meshing. however, it would also increase the vertex count significantly, since every cube would be drawn with all of its faces instead of being folded into merged quads.
so i am trying to understand whether defold offers anything closer to gpu storage buffers, such as ssbos, structured buffers, or vertex-pulling-style data access, or whether the realistic approach in defold is simply to keep lighting baked into vertex attributes and limit greedy quad sizes.
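to put rough numbers on the vertex cost, here is a small comparison for a completely solid 16×16×16 chunk. the assumptions are mine: 24 vertices per instanced cube (4 per face, indexed), 4 vertices per quad, and the best case for greedy meshing where each side of the chunk collapses into a single quad. the function names are hypothetical.

```python
def instanced_cube_vertices(n_cubes: int, verts_per_cube: int = 24) -> int:
    """Vertices processed when every cube is drawn as a full instance."""
    return n_cubes * verts_per_cube


def exposed_faces_solid_box(sx: int, sy: int, sz: int) -> int:
    """Unit faces on the surface of a solid sx*sy*sz box of blocks."""
    return 2 * (sx * sy + sy * sz + sx * sz)


def quad_vertices(n_quads: int) -> int:
    """Vertices for n indexed quads (4 per quad)."""
    return n_quads * 4


n = 16
print(instanced_cube_vertices(n ** 3))                   # 98304
print(quad_vertices(exposed_faces_solid_box(n, n, n)))   # 6144 (face culling only)
print(quad_vertices(6))                                  # 24 (one greedy quad per side)
```

real chunks are nowhere near this extreme, but the ordering stays the same: instancing whole cubes processes far more vertices than culled faces, which in turn process more than merged quads.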