[SOLVED] Optimizing frame stability / Avoiding garbage collector-related hitches

cornedbeer · May 18, 2022, 3:50pm

Yeah, my initial go at tackling this problem was just to run a step every frame, but I wasn’t really able to find a good balance of step size to stay on top of memory accumulation, while ensuring heavy frames were never too taxed by it. Additionally I really wanted to find a solution that isn’t so framerate-agnostic, because it’s frustrating seeing frame-dips caused by garbage collection surrounded by frames with several ms to spare, knowing there’s a lot of performance that could be squeezed out of them.

After some more experimentation the past day, I found a solution that seems to be working pretty well! I was on the right track before, but I think I may have been running into some approximation errors via os.cock() that meant I couldn’t find a balance of a tight enough garbage collection limit to keep the framerate from dipping below 60fps on otherwise busier frames, while keeping it broad enough to stay on top of the memory accumulation of these heavy frames. Additionally, running this sort of loop in a render script seems very unstable after I did more testing with it; I would seemingly randomly get crashes, I think when the loop made the render script’s update take too long, perhaps triggering some sort of failsafe.

To counteract these issues, I moved the garbage collector looping to my main loader script, which always runs after other updates, with the exception of the render script (The render script is fairly stable, and obviously needs to do the same thing every frame, so I feel comfortable just reserving a constant number of ms for it) and, most importantly, structured my code so that it would cycle between heavy processing frames (for now basically just the enemy pathfinding), and garbage collection frames.

The basic flow of it is:
enemy needs path → sends a request to the main controller where it is put on a queue

on controller update, if not shared.pathfinding_sent → send a message to the front of the queue, removing that element and setting the shared.pathfinding_sent flag

enemy receives the response to find a path and is given the sole token to process pathfinding → sets shared.block_collection flag; starts the pathfinding process, which is split into a coroutine-esque structure to run over multiple frames if need be, and to never consume more than ~3ms of a frame

if an enemy has the token on its update loop → continue the pathfinding, unless the process is complete, in which case give up the token and unset the shared.block_collection flag

Then, in the loader script (the outer level bootstrap collection that loads the actual scenes of the game):

-- toggle between intensive processing in enemy AI, and frames reserved for garbage collection
	if not shared.block_collection then 
		
		local start_time = shared.start_time
		if start_time then
			local diff = os_clock() - start_time
			local done
			
			-- step through garbage collection cycle until it is done or the frame is approaching 60fps
			while not done and fps_target - diff > collection_limit do
				done = collectgarbage("step", 1)
				
				local check = os_clock()
				diff = check - start_time
			end
			if done then
				-- could be a more generic state, but for now pathfinding is the only thing to alternate with collection
				shared.sent_pathfinding = nil 
			end
			
		end
		-- have to stop garbage collector again after stepping it, to avoid automatic collection cycles
		collectgarbage("stop")

The start_time is set from os.clock() at the start of the level controller’s update, which is the first update of the frame. When the garbage collector is freed (an enemy has finished the pathfinding cycle),
while garbage collector hasn’t returned a finished cycle, as long as 60fps - the time elapsed so far this frame doesn’t exceed the threshold set to reserve some time for other processing every frame, take a garbage collector step. Once collector is done, unset the shared.sent_pathfinding flag, so the next enemy on the queue (if it needs one) can be sent a pathfinding request and the cycle starts again.

Doing it this way has allowed me to stay on top of memory accumulation while never dropping below 60fps in my stress test, so I feel pretty confident in the performance moving forward. Hopefully my explanation makes sense and can be of some help to people.

TL;DR
If you want to get the most out of each frame, in a game that has to be smooth and responsive while having some pretty processing-intensive features, I would recommend doing some manual memory management. Also think about ways you can structure your code where any processing that isn’t critical to run EVERY frame is split up to run over multiple frames, and staggered by frames that are reserved for garbage collection (and of course any basic stuff that has to happen every frame).

Lastly, the first advice anyone will give, but it is still relevant, is to try to minimize the amount of junk you accumulate every frame to begin with, to lighten the load this manual collection has to do. It’s a good habit to get into to avoid making a bunch of local data in loops, such as tables, vectors, functions, and to reuse these whenever possible, clearing or resetting the values rather than creating a brand new table, and never declaring a function within a loop when you can avoid it.