Collection Parser?


Has anyone written a parser for Defold’s collection file format? Just checking before I start working on my own.

I tried to look at DefTree, which I thought might be doing it (albeit in Python), but it seems like the source code has been moved or made private.

Assuming the answer is “No”, any tips or advice on doing it would be appreciated.



Forgot about that - DefTree is now public again.



Assuming the answer is “No”, any tips or advice on doing it would be appreciated.

The files are in the Protobuf format, so if we could get access to the definition files it would be quite easy to read them that way. Without them, it is all about doing it the hard way.



@Jerakin Awesome, thanks!

@Mathias_Westerdahl, @sven You’re lurking so you’re getting tagged :slight_smile: Any chance we can get our hands on those definition files? I know sicher commented a while ago:

…but if it’s just a matter of sharing some .proto files when it changes (which I assume is fairly rare), that doesn’t seem too much to ask.



Haha, I’m always lurking! :smiley: But yes, I can see the benefit but also the reasoning why we haven’t shared them yet.
In my own opinion it makes sense to share these, maybe along with Defold SDK headers in some way. We should bring it up in the team once more and discuss it.



Cool, thanks! :defold:

Meanwhile I am going at it the hard way. Throwing together something hacky to parse it wasn’t actually too bad, but then I remembered that to do anything useful I would have to put it back into the original format, and I died a little inside (and am now rewriting). Haha. I guess it wasn’t really designed to be easily legible. Some parts are really clean and easy, but then you get stuff like this:


…which isn’t exactly ideal. It would be so much nicer if it was more like this:


But anyway. I’m getting there!



Oh yeah - tell me about it. That stuff was what I had to spend most of my time on. My ugly code has a lot of “do this, well except if the tag is data then do something completely different”. I think I ended up “unescaping” it first, but of course the data can also have data in it… so yeah - have fun :upside_down_face:

What are you writing it in?



Yup, that blasted data tag! Everything else is totally simple! Yeah, first I un-indent the line. Then if it’s inside a ‘data’ block I un-escape, un-quote, un-indent again, and remove the newline “\n” from the end. I probably don’t need to explicitly un-indent, but I was trying to set it up so I could just do everything in reverse to put it back… The above examples are before and after running it through my “cleaner” script. Now I’m trying to finish up the reverse process, trying to figure out the last few annoying exceptions to the rules.
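
For anyone following along, the cleaning steps described above could look roughly like this. It is only a minimal sketch, not the actual cleaner script: `clean_data_line` is a made-up name, the order of the steps is my own, and it assumes every line inside a ‘data’ block is wrapped in double quotes and ends with a literal `\n`.

```lua
-- Hypothetical helper (not from the real parser): clean one line that sits
-- inside a 'data' block, removing exactly one level of quoting and escaping.
local function clean_data_line(line)
    -- 1. Un-indent: drop leading whitespace.
    local s = line:gsub("^%s+", "")
    -- 2. Un-quote: drop the surrounding double quotes, if present.
    s = s:match('^"(.*)"$') or s
    -- 3. Remove the literal backslash-n that ends each quoted line.
    s = s:gsub("\\n$", "")
    -- 4. Un-escape one level: \" becomes " and \\ becomes \.
    s = s:gsub("\\(.)", "%1")
    -- 5. Un-indent again, now that the inner indentation is exposed.
    s = s:gsub("^%s+", "")
    return s
end

print(clean_data_line('  "  id: \\"script\\"\\n"'))
```

Because only one level of escaping is removed per call, a ‘data’ block nested inside another ‘data’ block would need its lines passed through the same function again.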

I’m doing it with Lua. Following my tradition of making tools for Defold with Defold. :slight_smile:



Well folks…it works. (barely (and with caveats))



Have you worked on this some more?

I bet it would be awesome with Editor Scripts 🔥: Alpha Release. We don’t have access to the protobufs yet, so this could be neat. :grin:

Bet the community wouldn’t mind helping out on it :slight_smile:



I agree. The new editor script system seems almost tailor-made for messing with collections like this. All the more reason they should release the protobuf specs! Or at least change how that darn “data” tag works so it’s not such a mess, hah.

Here you go:

I just cleaned it up a tiny bit, but it’s kind of “as is”. It seems to be fairly solid, but I’m sure you could break it fairly easily if you named your objects weird or something, and it doesn’t have . . . any error checking? :slight_smile:

So it pretty much just converts the collection (or game object, or component) file directly into a lua table as you would expect from the original format. It puts a few extra things in there since I was using this to make my own editor, but I think you can just ignore those. Where Defold’s files use the same tag repeatedly (like “embedded_instances”), it converts them into a sequence table. Just try opening files and pprint-ing the output to see what it’s like.
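
To illustrate the repeated-tag handling, here is a small sketch of one way to collect repeated tags into a sequence table. It is not the parser's actual code: `add_field` and the `REPEATED` set are hypothetical, and the list of tags in it is only an illustrative guess at which ones can repeat.

```lua
-- Assumption: an illustrative set of tags that may appear more than once.
local REPEATED = {
    instances = true,
    embedded_instances = true,
    components = true,
    embedded_components = true,
}

-- Hypothetical helper: store one parsed key/value pair in a table.
-- Repeated tags are appended to a sequence; everything else is a plain field.
local function add_field(tbl, key, value)
    if REPEATED[key] then
        tbl[key] = tbl[key] or {}
        table.insert(tbl[key], value)
    else
        tbl[key] = value
    end
end

local collection = {}
add_field(collection, "name", "main")
add_field(collection, "embedded_instances", { id = "player" })
add_field(collection, "embedded_instances", { id = "enemy" })

print(collection.name, #collection.embedded_instances)
```

Using a fixed set of repeatable tags means a tag that occurs only once still ends up as a one-element sequence, so code like `#data.embedded_instances` does not need a special case.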



Yes, this will happen soon. My current best guess is end of October.




I just pushed some little tweaks to my parser so it works with editor_scripts. They don’t have access to vmath, so the parser can’t convert transform data to vectors and quats, so they just stay as normal tables with x, y, z, w fields. They also don’t have socket, which I was using to check the time it took to parse, so I just took that out. And I removed or commented out all the debug print statements.
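
The vmath fallback boils down to something like the sketch below. `to_position` is a made-up name for illustration, not a function from the parser. It assumes the parsed transform is a table with x, y, z fields and that `vmath` is simply nil when the code runs inside an editor script.

```lua
-- Hypothetical helper: convert a parsed position table to a vector3 when
-- vmath is available (in the engine), otherwise return the table unchanged.
local function to_position(t)
    if vmath then
        return vmath.vector3(t.x, t.y, t.z)
    end
    return t
end

local pos = to_position({ x = 1, y = 2, z = 3 })
print(pos.x, pos.y, pos.z)
```

Either way the result can be read through `.x`, `.y` and `.z`, so code that only reads the fields does not care which one it got.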

Here’s a stupid-simple little editor_script to count the number of embedded game objects in a collection file:
(Put it in a folder with the parser module and double-check the require path.)

local M = {}

local parser = require "editor_scripts/collection_parser"

-- List of commands this editor script adds (just one here).
local commands = {
    {
        label = "Count Embedded Instances",
        locations = {"Assets", "Outline"},
        query = {
            selection = { type = "resource", cardinality = "many" }
        },
        run = function(opts)
            for i, node_id in ipairs(opts.selection) do
                local path = editor.get(node_id, "path")
                path = string.sub(path, 2) -- Cut off "/" prefix.
                local extension = string.match(path, "^.*%.(.*)$")
                if path and extension == "collection" then
                    local file, err = io.open(path, "r")
                    if not file then
                        print(err) -- Could not open the file.
                    else
                        local data = parser.decodeFile(file, path)
                        if data then
                            -- Repeated tags are parsed into a sequence table.
                            local inst = data.embedded_instances
                            local count = inst and #inst or 0
                            print("", count .. " embedded_instances in " .. path .. ".")
                        end
                    end
                end
            end
        end
    }
}

function M.get_commands()
    print("Collection Parser Test Extension Loaded.")
    return commands
end

return M