Data Oriented Programming

Pawel · September 23, 2024, 8:06pm

I’m really impressed by the book “Data Oriented Programming” by Yehonathan Sharvit (a lot of it is available online, e.g. What is Data Oriented Programming? | Yehonathan Sharvit, but I also recommend the full book), which gathers all the data-related approaches (which have been around since LISP) into one set of 4 rules:

Separate data from functionality (behavior)
Represent data with generic structures
Treat data as immutable
Separate data schema from data representation (so dynamic typing with optional runtime checks, e.g. with JSON Schema, but I simply use Lua tables here)

This differs from DOD (Data-Oriented Design), which also thinks about data first, but about data in memory: it focuses on cache misses and general performance. DOP, on the other hand, focuses on data as an abstraction and on designing proper architectures based on it. It’s language-agnostic and paradigm-agnostic, so it can be applied (or broken) in FP or OOP.

Rule #1 is undoubtedly good imho. I do it anyway as much as possible, because I’ve spent my whole life writing OOP and find stateful classes bug-prone as hell.

Because of this, rule #2 also feels good: you can then reuse a lot of code that works with all the data (but there is a cost to it, data validation, which is addressed by rule #4).
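
A minimal sketch of how rules #1, #2 and #4 could look in plain Lua. The names get_in, validate and player_schema are hypothetical and not taken from the book; the tiny "field name to Lua type" schema only stands in for something like JSON Schema.

```lua
-- Rule #2: data is a plain, generic Lua table (no class, no methods).
local player = { name = "hero", hp = 10, position = { x = 1, y = 2 } }

-- Rule #1: behavior lives in standalone functions that take data as input.
-- This helper is generic: it works on any nested table, not only on players.
local function get_in(data, path)
    local node = data
    for _, key in ipairs(path) do
        if type(node) ~= "table" then return nil end
        node = node[key]
    end
    return node
end

-- Rule #4: the schema is kept apart from the data and checked only on demand.
-- Hypothetical mini-schema: field name -> expected Lua type.
local player_schema = { name = "string", hp = "number", position = "table" }

local function validate(data, schema)
    for field, expected in pairs(schema) do
        if type(data[field]) ~= expected then
            return false, field .. " should be a " .. expected
        end
    end
    return true
end

print(get_in(player, { "position", "x" }))          -- 1
print(validate(player, player_schema))              -- true
print(validate({ name = "ghost" }, player_schema))  -- false plus a message
```

Because get_in knows nothing about players, the same function can read an enemy, an inventory or a level table, which is the reuse that rule #2 is after.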

Rule #3 is the most controversial, because not all languages support it natively (though more and more do), but it can be achieved even in Lua (I made a Lua module for this, I’m testing it and will open source it soon).
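
One possible way to get read-only data in Lua is a proxy table with __index and __newindex metamethods. This is only a sketch under that assumption, not the module mentioned above; freeze is a hypothetical name.

```lua
-- Hypothetical helper: wraps a table in a read-only proxy.
local function freeze(data)
    if type(data) ~= "table" then return data end
    local proxy = {}
    return setmetatable(proxy, {
        -- Reads are forwarded to the original table; nested tables are
        -- wrapped lazily, each time they are accessed.
        __index = function(_, key) return freeze(data[key]) end,
        -- Any write raises an error instead of silently changing the state.
        __newindex = function(_, key)
            error("attempt to modify frozen field '" .. tostring(key) .. "'", 2)
        end,
    })
end

local state = freeze({ player = { hp = 10 } })
print(state.player.hp) -- 10

local ok = pcall(function() state.player.hp = 0 end)
print(ok)              -- false: the write was rejected
print(state.player.hp) -- 10: the data is unchanged
```

There are known limits to this approach: in Lua 5.1 and LuaJIT, pairs, ipairs and the # operator do not see through the proxy, every nested access creates a fresh proxy (so comparing two proxies with == is not meaningful), and anyone still holding the original table can modify it.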

Beyond that, such a set of rules looks very beneficial, especially for bigger architectures and data-heavy games. You can store only one set of immutable data (think of it as a “version”) or multiple of them (and travel back in time, if you wish). You don’t change the data; you create a new version of the data and “commit” it. The name is purposeful: imho Git follows the DOP rules perfectly and is a great example of how to think about data in the DOP way. You also don’t need to make copies of ALL your data only to change one field, e.g. an updated player’s position or a single item in the inventory. You can use something like structural sharing to change only the affected data and reference the data from the previous version for the rest (those fields can in turn reference even earlier versions, etc.).
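
Structural sharing can be sketched in Lua as a "path copy": only the tables on the way to the changed field are copied, everything else is reused by reference. The helper names shallow_copy and assoc_in are hypothetical, and the sketch assumes every step of the path is either a table or missing.

```lua
-- Shallow copy: the new table reuses (shares) all child tables of the old one.
local function shallow_copy(t)
    local copy = {}
    for k, v in pairs(t) do copy[k] = v end
    return copy
end

-- Hypothetical helper: returns a new version of `data` with the value at
-- `path` replaced. Only the tables along the path are copied.
local function assoc_in(data, path, value, depth)
    depth = depth or 1
    if depth > #path then return value end
    local copy = shallow_copy(data or {})
    local key = path[depth]
    copy[key] = assoc_in(copy[key], path, value, depth + 1)
    return copy
end

local v1 = {
    player = { position = { x = 0, y = 0 }, hp = 10 },
    inventory = { "sword", "potion" },
}
local v2 = assoc_in(v1, { "player", "position", "x" }, 5)

print(v1.player.position.x)          -- 0: the old version is untouched
print(v2.player.position.x)          -- 5
print(v1.inventory == v2.inventory)  -- true: the untouched branch is shared
print(v1.player == v2.player)        -- false: tables on the path were copied
```

Here updating one coordinate copies three small tables (root, player, position) no matter how large the inventory or the rest of the state is.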

I’m testing a full DOP approach for one project (in Defold, in Lua), so I can tell more after it, but so far I’m very pleased.

Ah, and finally: ECS is one of the architectures that fits DOP perfectly, which is why I brought it up here. I would say it’s one of the DOP implementations that has been widely adopted in gamedev, especially for games heavy on clones of similar objects (aka enemies).
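
To illustrate why ECS lines up with the rules above, here is a deliberately tiny sketch: entities are generic tables, components are plain data fields, and a system is a standalone function. The name movement_system and the component names are made up for the example. ECS libraries commonly update components in place for performance; this variant returns a new list only to stay within rule #3.

```lua
-- Entities are plain tables; components are just fields holding data.
local world = {
    { position = { x = 0, y = 0 }, velocity = { x = 1, y = 0 }, hp = 3 }, -- enemy
    { position = { x = 5, y = 5 } },                                      -- static prop
}

-- A system is a function that picks entities by their components.
-- It returns a new entity list and leaves the input untouched.
local function movement_system(entities, dt)
    local result = {}
    for i, e in ipairs(entities) do
        if e.position and e.velocity then
            local moved = {}
            for k, v in pairs(e) do moved[k] = v end -- share other components
            moved.position = {
                x = e.position.x + e.velocity.x * dt,
                y = e.position.y + e.velocity.y * dt,
            }
            result[i] = moved
        else
            result[i] = e -- entity without velocity is reused as-is
        end
    end
    return result
end

local next_world = movement_system(world, 2)
print(world[1].position.x)       -- 0
print(next_world[1].position.x)  -- 2
print(next_world[2] == world[2]) -- true
```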

And Clojure is definitely the best for DOP, but since I still can’t come to terms with FP, I can’t tell whether the Defold Editor uses an approach that benefits from those rules (and can’t tell whether the graph-based implementation does).

AGulev · September 23, 2024, 8:08pm

I talked with @Pawel and moved his post about DOP into a separate topic

AGulev · September 23, 2024, 9:00pm

I use a similar approach in all my games (idk, maybe 12 games or so).

I don’t have a formalized process for it, as I kind of developed it myself over time.

The closest explanation I’ve found online that aligns with my approach is this article:
Flux Architecture in Games: Porting the Web Design Pattern to Game Development.

But I heavily rely on composition to make non-monolithic logic and view.

I don’t strictly follow immutability; it’s more like having an immutable game state on the view side. Since the logic is composed of many modules, I need to mutate the state as it passes from module to module, along with the responses each module fills in.
Another key aspect is that I split the logic state from the view state. All meaningful data—everything needed to restore the game state—is kept in the logic. The view state contains the rest. This makes it easy and effortless to save the state, create duplicates for undo or prediction, etc., since it’s a relatively small amount of data.

One more important thing to mention is that I pass configs (I use different configs for the view and for the logic) and the state of the view and logic every time I call it from the controller (the entry point into the game). This means the view and logic are stateless and operate on the passed (or injected, if you will) state and are config-less, so to speak. This approach makes it possible to change configs or state at any time or run two separate views/logics with variations in representation or operation. For example, you can simultaneously show your opponent’s move or play against a bot, where you see how it plays on a separate small field with a custom representation regulated by a view config.
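
A rough sketch of the "stateless, config-less" idea described above, with invented module and field names (logic.move, view.render, field_width and so on); the real project structure is not shown in this thread. Neither module keeps anything at module level, so the same code can drive two games with different states and view configs at once.

```lua
-- Hypothetical logic module: no module-level state, everything is passed in.
local logic = {}
function logic.move(config, state, dx)
    -- The passed state is mutated, clamped to the field from the logic config.
    state.x = math.max(0, math.min(config.field_width - 1, state.x + dx))
    state.moves = state.moves + 1
    return state
end

-- Hypothetical view module: the representation is driven by a view config.
local view = {}
function view.render(view_config, state)
    return string.rep(view_config.empty, state.x) .. view_config.player
end

local logic_config = { field_width = 5 }
local big_view  = { empty = ".", player = "@" }
local mini_view = { empty = " ", player = "o" }

-- Two independent games running on the same code.
local my_state  = { x = 0, moves = 0 }
local bot_state = { x = 3, moves = 0 }

logic.move(logic_config, my_state, 2)
logic.move(logic_config, bot_state, 10) -- clamped to the last cell

print(view.render(big_view, my_state))   -- ..@
print(view.render(mini_view, bot_state)) -- four spaces followed by "o"
```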

It’s a bit challenging to explain everything clearly. The entry barrier is not low at all. Even for me, starting a new project using this approach can be difficult, because a lot of decisions have to be made early on to correctly split logic from view. But once that’s done, the development process becomes smooth and fast:

It scales easily.
Adding new features is straightforward.
Changing the game and its rules can often be done by tweaking configurations.
Testing and balancing become simpler.
State-related operations like save/load, undo, and predictions are effortless.
Debugging is much easier.
Optimizing is easier too, since the logic is in Lua, and without Defold API calls, any slow logic can be rewritten in C++.

The final code structure is quite simple: just a series of functions called in a particular order. Logic modules respond to user input and modify the state, followed by view modules that react to the state change and pass control to the next view module by calling a callback etc.
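
The "series of functions called in a particular order" could look roughly like this. It is only a guess at the shape, with hypothetical module names; in a real game a view module would call its callback when an animation finishes rather than immediately.

```lua
-- Hypothetical logic modules: each gets the state and a response table to fill.
local function apply_input(state, response, input)
    state.score = state.score + input.points
    response.score_changed = true
end
local function check_win(state, response)
    if state.score >= 10 then
        state.finished = true
        response.won = true
    end
end
local logic_modules = { apply_input, check_win }

-- Hypothetical view modules: each reacts to the response, then calls `done`
-- to pass control to the next view module.
local log = {}
local function show_score(state, response, done)
    if response.score_changed then log[#log + 1] = "score: " .. state.score end
    done()
end
local function show_win(state, response, done)
    if response.won then log[#log + 1] = "you win" end
    done()
end
local view_modules = { show_score, show_win }

-- Controller: the single entry point that runs everything in a fixed order.
local function on_input(state, input, on_done)
    local response = {}
    for _, step in ipairs(logic_modules) do step(state, response, input) end
    local function run_view(i)
        local step = view_modules[i]
        if not step then return on_done() end
        step(state, response, function() run_view(i + 1) end)
    end
    run_view(1)
end

local state = { score = 4, finished = false }
on_input(state, { points = 6 }, function() log[#log + 1] = "done" end)
print(table.concat(log, ", ")) -- score: 10, you win, done
```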

I’m thinking about wrapping it all in some kind of generic framework and releasing it to the public, but it’s a huge effort in itself, especially considering that the approach evolves from project to project, with some ideas not being good enough to stay and getting replaced in the next iteration (the next project). As a result, it’s spread across a few projects at different stages, and it requires a lot of attention to bring everything together, experiment with it to determine the best way to generalize it, and then wrap it up into a well-structured and well-documented framework.

Pawel · September 23, 2024, 9:42pm

Pure immutability would make it impossible to write loops, which I find silly (recursion is used instead, and that’s why I hated Erlang).

Immutability is imho a kind of defensive programming. During development it might help you spot issues in your code, when you want to “hack” your program by modifying your state (of an object, data, table, any structure) in an unsuitable place. Your architecture modifies data anyway. But in DOP, instead of modifying data, you switch a reference to new, modified data: a new instance.

To solve the issue of deep copying the whole data, DOP proposes using the powerful “structural sharing”.

You might not notice it, but DOP describes… Git.

All the features introduced in Git suit DOP very well: you don’t change the data, you just make a snapshot of the changes and join it with the rest of the data that was unchanged (structural sharing), then you “commit” your changes by switching a reference to your newest data state. This is where DOP shines and really serves its purpose. It is also useful in distributed systems, microservices, and parallelisation, where relying on immutable data and system state updates can perform really well.
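
The Git analogy can be sketched as a small version store: old versions are never edited, a commit appends a new version and moves a "head" reference, and undo only moves the reference back. All names here (new_store, commit, undo, current) are hypothetical.

```lua
-- Hypothetical store: every committed version plus a reference to the current one.
local function new_store(initial)
    return { versions = { initial }, head = 1 }
end

local function current(store)
    return store.versions[store.head]
end

-- "Commit": append a new version and move the head; old versions stay as they were.
local function commit(store, next_version)
    -- Drop versions after head first, so committing after an undo starts a new line.
    for i = #store.versions, store.head + 1, -1 do store.versions[i] = nil end
    store.versions[#store.versions + 1] = next_version
    store.head = #store.versions
end

local function undo(store)
    if store.head > 1 then store.head = store.head - 1 end
end

local store = new_store({ player = { hp = 10 }, level = { name = "cave" } })

-- The new version contains only the changed branch and reuses `level`
-- from the previous version (structural sharing).
local old = current(store)
commit(store, { player = { hp = 7 }, level = old.level })

print(current(store).player.hp)          -- 7
print(current(store).level == old.level) -- true
undo(store)
print(current(store).player.hp)          -- 10
```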

Lua also taught me how great Composition over inheritance is

This is a very good approach (like the one I used in this Tic Tac Toe example). It should always be separated. I made many, many prototypes where it wasn’t. It’s simply the answer to “Why is it so hard to finish a game?” or anything else

Almost all architectures or applications, especially GUI-heavy ones, revolve around such an idea (MVC, MVVM), and I used them in my software development job, so why not use them in games too?

I remember a question (it may even have been mine) about good architecture in games. I was looking for such an answer for a long time, checking out different approaches and always receiving the answer that “it depends”, and I couldn’t just accept that. I was looking for a sweet spot, only to come to the conclusion that it really “depends”… I came full circle, but gained a lot of knowledge along the way, which is why this meme template is so relatable

BUT

when you’re making any software that is larger than Tic Tac Toe, a good architecture is a solid foundation for making your life easier. Of course, it might need refactoring along the way to keep it fresh, but when the foundation is solid, you can build a higher tower more easily.

I find this set of rules really good for programmers:

It’s indeed a tough task. Right now I’m studying @Insality’s Shooting Circles and all the very cool stuff he made, things I always wished to do but never finished (like the tiny-ecs project I started around 3 years ago, which still lives only on my local drive). The documentation and example are amazing, and I don’t think they could be done any better tbh. There is a lot put into it, and with Panthera and Detiled it looks like a perfect solution. It has a very high entry barrier, but I believe it will be beneficial to learn, because it surely can speed up development a ton!

Insality · September 24, 2024, 8:00am

Glad you like it! Thanks