Where to put data: Lua vs JSON

Using a Lua file instead of JSON may be a convenient way to store data in [[Defold]], especially for small files. However, this approach has its downsides, especially when dealing with very large files.

Let’s compare these four approaches:

  • A 9.1 MB Lua file loaded with require. In this case it’s processed as a code file (compilation, etc.)
local data = require("main.lua.data")
  • The same Lua file loaded as a custom resource using loadstring()
self.data = loadstring(sys.load_resource("/main/lua_data/lua_data.lua"))()
  • A JSON file encoded from this Lua table, which is 8.1 MB.
self.data = json.decode(sys.load_resource("/main/json/our_file.json"))
  • Defold’s own serializer: sys.serialize() and sys.deserialize(). The same blob of data serialized and saved on disk is 14.7 MB.
self.data = sys.deserialize(sys.load_resource("/main/sys/sys_data"))
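The loadstring() one-liner above fails with an unhelpful "attempt to call a nil value" if the file has a syntax error, because loadstring() returns nil plus a message in that case. Here is a minimal sketch of a more defensive version, written in plain Lua so it runs outside Defold. It assumes the data file is a chunk that returns a table; load_lua_data is a hypothetical helper name, and in Defold the source string would come from sys.load_resource() as shown above.

```lua
-- Lua 5.1/LuaJIT provide loadstring; newer Lua versions use load for strings.
local compile = loadstring or load

-- Hypothetical helper: turns Lua source text that returns a table into that table.
-- Returns nil plus an error message instead of raising on bad input.
local function load_lua_data(source)
    local chunk, compile_err = compile(source)
    if not chunk then
        return nil, "compile error: " .. tostring(compile_err)
    end
    local ok, result = pcall(chunk)
    if not ok then
        return nil, "runtime error: " .. tostring(result)
    end
    if type(result) ~= "table" then
        return nil, "data chunk did not return a table"
    end
    return result
end

-- Stand-in for the string that sys.load_resource() would return in Defold.
local source = 'return { words = { "alpha", "beta" }, version = 3 }'
local data, err = load_lua_data(source)
```

With this shape, the caller can check err and decide what to do, instead of crashing inside init().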

It’s important to understand that require adds a Lua module as a dependency for the collection, so loading occurs not when require is called in the code, but when the collection is loaded. This test is designed to take that into account. Each approach loads its own collection and performs an operation in init() (for require, this is not needed).

Caching of Lua modules

All Lua modules loaded with require are cached by Lua in package.loaded. This aspect is essential to remember if you intend to use data in Lua and require it. To unload these modules, you must manually clean them up when you unload a collection (or at any moment you consider it necessary):

package.loaded["your.package.here"] = nil
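To see the caching behaviour in isolation, here is a small plain-Lua sketch. The module name "demo.data" and the package.preload loader are stand-ins invented for the example so that it runs without extra files; in a real project the module would be an ordinary Lua file.

```lua
-- Register a stand-in data module. "demo.data" is a hypothetical module name.
local load_count = 0
package.preload["demo.data"] = function()
    load_count = load_count + 1
    return { items = { 1, 2, 3 } }
end

local first = require("demo.data")
local cached = require("demo.data")   -- served from package.loaded, loader is not run again

-- Drop the cache entry so the table can be garbage collected
-- once no other references (such as `first`) remain.
package.loaded["demo.data"] = nil

local reloaded = require("demo.data") -- loader runs again and builds a new table
print(first == cached, first == reloaded, load_count)
```

Note that clearing package.loaded only removes Lua’s own reference. Any variable that still holds the table (for example a field on self) keeps the data alive until it is cleared as well.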

Build size

Here are the file sizes in different builds:

  • LuaJIT with one architecture (Android x64 taken as an example)
  • LuaJIT with both architectures (Android x64+x32 taken as an example)
  • Plain Lua (HTML5 target always uses this one)

Compressed

| Build | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 3.08 MB | 3.21 MB | 3.57 MB | 4.33 MB |
| Android x64+x32 (LuaJIT two archs) | 3.08 MB | 3.21 MB | 7.18 MB | 4.33 MB |
| HTML5 (plain Lua) | 3.08 MB | 3.21 MB | 3.21 MB | 4.33 MB |

For plain Lua, the bundled size is the same as the Lua file in the project. For LuaJIT, the bundle contains compiled bytecode, which is built separately for each architecture, so the two-architecture build carries it twice (7.18 MB vs. 3.57 MB compressed). Keep this in mind when storing data in Lua code (requiring Lua files).

Additionally, here is information about file sizes before compression, which is necessary for a better understanding of what exactly consumes memory:

Uncompressed

| Build | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 7.77 MB | 8.67 MB | 9.85 MB | 14.05 MB |
| Android x64+x32 (LuaJIT two archs) | 7.77 MB | 8.67 MB | 19.70 MB | 14.05 MB |
| HTML5 (plain Lua) | 7.77 MB | 8.67 MB | 8.67 MB | 14.05 MB |

Loading/decoding speed

Each result is the average of 5 loads in a release bundle, with the application launched from scratch for each approach. It is obtained with the following steps:

  • Launch the app
  • Load the collection
  • Record data for the first load
  • Unload the collection
  • Collect garbage
  • Repeat the process 5 times
  • Record data for the average time
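The steps above can be sketched in plain Lua roughly as follows. This is not the code from the test project: measure_loads and build_table are hypothetical names, the workload is a stand-in for loading a collection, and os.clock() (CPU time) is used only because it is available everywhere; a wall-clock timer may be a better fit on a device.

```lua
-- Hypothetical helper: runs load_fn `runs` times and reports
-- the duration of the first run and the average over all runs.
local function measure_loads(load_fn, runs)
    local first, total = nil, 0
    for _ = 1, runs do
        collectgarbage("collect")          -- start each run from a clean heap
        local started = os.clock()
        local data = load_fn()
        local elapsed = os.clock() - started
        data = nil                         -- release the result, like unloading the collection
        first = first or elapsed
        total = total + elapsed
    end
    return first, total / runs
end

-- Stand-in workload; the real test loads a collection here.
local function build_table()
    local t = {}
    for i = 1, 100000 do
        t[i] = { id = i, name = "item" .. i }
    end
    return t
end

local first_load, average = measure_loads(build_table, 5)
print(string.format("first: %.4f  average of 5: %.4f", first_load, average))
```

Keeping the first run separate from the average matters because, as the tables show, the cold load can differ noticeably from repeated loads.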

Average 5 loads

| Platform | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 0.2649 | 0.7514 | 0.1883 | 0.2882 |
| Android x64+x32 (LuaJIT two archs) | 0.2691 | 0.7516 | 0.2298 | 0.2466 |
| HTML5 (plain Lua) | 0.3864 | 0.4557 | 0.4596 | 0.0680 |
| Mac arm64 (LuaJIT) | 0.0714 | 0.1851 | 0.0480 | 0.0567 |
| iOS arm64 (LuaJIT interpreter) | 0.2863 | 0.9674 | 0.1778 | 0.2631 |

Many games load such data only once, so the cold load (the first load only) is worth looking at separately:

The first load

| Platform | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 0.2614 | 0.6377 | 0.2659 | 0.2626 |
| Android x64+x32 (LuaJIT two archs) | 0.2668 | 0.7145 | 0.4152 | 0.2485 |
| HTML5 (plain Lua) | 0.4150 | 0.4700 | 0.5090 | 0.0730 |
| Mac arm64 (LuaJIT) | 0.0729 | 0.1895 | 0.0945 | 0.0608 |
| iOS arm64 (LuaJIT interpreter) | 0.2862 | 1.0037 | 0.2485 | 0.3078 |

* Android is Xiaomi Redmi Note 4
** iOS is iPhone 7

Lua memory

It’s important to measure both:

  • The extent of memory spikes during the parsing process
  • The amount of memory the result table occupies

These measurements are obtained after the following steps:

  • Launch the app
  • Collect garbage
  • Load the collection
  • Record the value as a memory spike (the amount of Lua memory required for the parsing process)
  • Collect garbage
  • Record the value as the memory usage of the table
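A rough plain-Lua sketch of this procedure is shown below. measure_memory and build_table are hypothetical names and the workload is a stand-in for loading a collection. One caveat: collectgarbage("count") read right after loading reports the heap size at that moment (live data plus garbage not yet collected), which approximates the spike but is not guaranteed to be the true peak, since the collector may already have run during loading.

```lua
-- Hypothetical helper: returns Lua heap growth right after loading (spike)
-- and after a full collection (retained), both in KB, plus the loaded data.
local function measure_memory(load_fn)
    collectgarbage("collect")
    local baseline = collectgarbage("count")

    local data = load_fn()
    local spike = collectgarbage("count") - baseline

    collectgarbage("collect")
    local retained = collectgarbage("count") - baseline

    -- Return the data so it stays referenced while `retained` is measured.
    return spike, retained, data
end

-- Stand-in workload; the real test loads a collection here.
local function build_table()
    local t = {}
    for i = 1, 50000 do
        t[i] = { id = i, name = "item" .. i }
    end
    return t
end

local spike_kb, retained_kb, data = measure_memory(build_table)
print(string.format("spike: %.2f MB, retained: %.2f MB",
    spike_kb / 1024, retained_kb / 1024))
```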

Spike

| Platform | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 25.21 | 44.91 | 30.77 | 31.47 |
| Android x64+x32 (LuaJIT two archs) | 25.21 | 44.91 | 30.77 | 31.47 |
| HTML5 (plain Lua) | 31.53 | 37.82 | 29.14 | 37.82 |
| iOS arm64 (LuaJIT interpreter) | 25.19 | 44.91 | 30.77 | 31.47 |

Memory

| Platform | json.decode() | loadstring(Lua) | require(Lua) | sys.deserialize() |
|---|---|---|---|---|
| Android x64 (LuaJIT one arch) | 17.41 | 17.41 | 15.84 | 17.41 |
| Android x64+x32 (LuaJIT two archs) | 17.41 | 17.41 | 15.84 | 17.41 |
| HTML5 (plain Lua) | 23.76 | 20.67 | 20.67 | 23.76 |
| iOS arm64 (LuaJIT interpreter) | 17.41 | 17.41 | 15.84 | 17.41 |

* Android is Xiaomi Redmi Note 4
** iOS is iPhone 7

Application memory

Depending on the operating system, memory allocation may appear different for the application at the OS level. Here are measurements of how it looks on different setups.
In the profiler, it looks like this in Xcode Instruments:
[screenshot: Xcode Instruments]
And in Android Studio:
[screenshot: Android Studio]

Android* x64 (LuaJIT one arch)

| Approach | After app run | Parsing spike | After parsing |
|---|---|---|---|
| json.decode() | 51 | 82.7 | 82.7 |
| loadstring(Lua) | 51 | 103.5 | 103.5 |
| require(Lua) | 51 | 102.1 | 102.1 |
| sys.deserialize() | 51 | 100.2 | 97.4 |

Android* x64+x32 (LuaJIT two archs)

| Approach | After app run | Parsing spike | After parsing |
|---|---|---|---|
| json.decode() | 51 | 83.1 | 83.1 |
| loadstring(Lua) | 51 | 104.6 | 104.6 |
| require(Lua) | 51 | 115.7 | 115.7 |
| sys.deserialize() | 51 | 102.4 | 102.4 |

iOS** arm64 (LuaJIT interpreter)

| Approach | After app run | Parsing spike | After parsing |
|---|---|---|---|
| json.decode() | 76 | 91.54 | 83.83 |
| loadstring(Lua) | 76 | 94.95 | 86.26 |
| require(Lua) | 76 | 96.65 | 96.65 |
| sys.deserialize() | 76 | 105.74 | 91.76 |

* Android is Xiaomi Redmi Note 4
** iOS is iPhone XS (the iPhone 7 had connection issues with Xcode Instruments)

The require approach occupies more disk space and, of course, requires more native memory to be loaded and parsed (see uncompressed file sizes).

Conclusion

Utilizing large chunks of data as Lua code (using require) will noticeably affect both build size and runtime memory usage. In this case, the load/parsing time is comparable to JSON speed.
You must be cautious about where you place data and how you use it, especially if it involves a huge amount of data.

The project is available here:

Please make sure you test it in a release bundle.


UPDATE:

  • All the measurements use a newly generated blob of data.
  • Fixed a bug for loadstring() measurements.
  • Added a new approach with sys.serialize().
  • All the measurements for iPhone were made for iPhone 7, except for the system memory measurement, which is still for iPhone XS (which shouldn’t be much different).

Yaa, I was recently thinking about moving all my JSON data (mainly dialogues) for Witchcrafter to Lua, and wondering whether I should somehow compare both solutions. You did me a favor with such a detailed comparison! Thank you! :heart:


I added graphs to make the data tables easier to read.


I found this out indirectly when making my word game.
Initially the word list was a gigantic Lua table that, although it loaded fine on desktop, failed miserably on mobile.
Switching the data to JSON solved the issue, but I did not investigate it further.


Here is the test project. Let me know if you find any methodology issues.


I have made an update. I think now it’s complete.


Well done! I know we discussed this, and it is to some extent outside the scope of this test, but it would be interesting to also do a test with SQLite (I think we have a community-created extension?)


I’d be interested in a comparison with protobuf.


This is a great breakdown. While working at fmad.io we were serializing many TB of stock market trade data (for Nasdaq) for overnight processing, using… LuaJIT :slight_smile: … and we generally used JSON as an output (because most of the consumers of the data used a Grafana stack of some kind).

However, there are good alternatives depending on your use case and needs. Generally, I would never recommend require as a data serialization mechanism for Lua or LuaJIT :slight_smile: Not just because of the package handling problems, but also because of the general “confusion to the developer”: IMHO it’s much clearer to keep data serialization separate from module management.

Some of the packages I’ve used with LuaJIT and Lua:

  • LuaJIT: Google’s Snappy (very easy to call with FFI) - good searches on large data sets.
  • LuaJIT: LMDB (again easy with FFI) - brilliant if you want super fast search trees.
  • Lua: binser (I kinda like this one: very agnostic, simple and stable) - see it on GitHub.
  • Lua: luabin (an old C++ binding, like me) but pretty nice to use.

There’s a lot out there, but I do like JSON, since there are many good tools for it. Sometimes that’s the key reason (not just perf :wink: ).

Oh, I should note: I’ve also used more DB-like ones, such as SQLite and Redis. These are kind of a second tier above the lower-tier, high-performance data serializers, and I recommend them for bigger, more complex use cases (like game data server management). But then you might end up just needing a client interface to Postgres or similar, which is a whole different level of discussion.
