Converting URLs to strings

I need to build a table using sender URLs as keys in my on_message handler.

Since a URL is a userdata and, apparently, is not unique (i.e. the same URL string may correspond to multiple URL objects), I cannot use the sender variable directly as a key in my table.

I thought I could use tostring(), and it works for the default build. However, it is broken in the HTML5 build:

print(msg.url("main:/game#gui"))

yields

DEBUG:SCRIPT: url: [main:/game#gui]

in the default build, but

DEBUG:SCRIPT: url: [main:/(null)#(null)]

in the HTML5 one :frowning:

Any suggestions on how I can work around this?

Thanks for posting! There are a few things to address here.

First off, you can’t use the URL object as a table key (as you already found out). Many people try to do this, so we should look into it and see if we could make that happen. It’s not trivial given how Lua tables work, but there may be something we could do.
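To illustrate the table-key problem, here is a minimal sketch in plain Lua. It assumes ordinary tables as stand-ins for URL userdata (both are keyed by object identity rather than by contents); fake_url and key_of are hypothetical helpers, not Defold functions.

```lua
-- Plain tables stand in for URL userdata: Lua keys both by identity.
-- fake_url is a hypothetical helper, not part of any engine API.
local function fake_url(socket, path, fragment)
    return { socket = socket, path = path, fragment = fragment }
end

local a = fake_url("main", "/game", "gui")
local b = fake_url("main", "/game", "gui")

local t = {}
t[a] = "first"

print(t[a]) -- first (same object)
print(t[b]) -- nil   (equal contents, but a different object)

-- A string built from the fields behaves like a value key instead.
local function key_of(u)
    return u.socket .. ":" .. u.path .. "#" .. u.fragment
end

local by_value = {}
by_value[key_of(a)] = "first"
print(by_value[key_of(b)]) -- first
```

Table lookups use raw equality, so an __eq metamethod on the objects would not change the result for t[b].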

URLs are unique. What strings result in multiple objects? I’m wondering if you mean the resolving facility, where we let you specify an incomplete URL that is resolved into the most likely URL, based on the site where you made the call. So just doing msg.url("#thing") from different places correctly results in different URLs, because that string is not actually a URL, but a partial URL. The resolving facility is there for convenience, similar to how you can address anchors within an HTML page.

The URL structure contains hashes, and we have a reverse-hashing feature in dev mode. We store every hash-call and reverse-map the string it was produced from, so that we can print out the originating string. This is a debug function, so it’s disabled for release-builds, which is why it’s not working in HTML5 (and would not work on iOS nor Android, etc).

Your problem can be resolved in many different ways; the best one depends on your use case. If possible, you could key into your table by the path part alone (this would correspond to the game object id, if the URLs you are dealing with come from game objects).
Something like: local t = {[url.path] = some_value}

Hope this helps!

The following should work, right? It seems it should, since socket is just a number, path is a hash and fragment is a hash too.

function url_to_hash(url)
	return hash(table.concat({url.socket, hash_to_hex(url.path), hash_to_hex(url.fragment)}))
end

...

local id = url_to_hash(msg.url("#"))
local thing = {[id] = "hello"}

Ragnar,

thank you for your answer!

First off, you can’t use the URL object as a table key (as you already found out). Many people try to do this, so we should look into it and see if we could make that happen. It’s not trivial given how Lua tables work, but there may be something we could do.

Is there a reason why URL is not a plain string?

URLs are unique. What strings result in multiple objects? I’m wondering if you mean the resolving facility, where we let you specify an incomplete URL that is resolved into the most likely URL, based on the site where you made the call. So just doing msg.url("#thing") from different places correctly results in different URLs, because that string is not actually a URL, but a partial URL. The resolving facility is there for convenience, similar to how you can address anchors within an HTML page.

I’m implementing a kind of pub-sub mechanism and I’m using the sender field from the on_message in my pub-sub manager game object to subscribe that sender to the channel.

local subscribe = function(self, sender_url, channel_id)
    local channel = self.subscribers[channel_id]
    -- Use one string key for both the lookup and the store.
    local key = tostring(sender_url)
    if not channel[key] then
        channel[#channel + 1] = sender_url
        channel[key] = #channel
    end
end
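The subscription bookkeeping above can be sketched so that the key function is passed in, which makes it easy to swap tostring() for something that also works in release builds. This is only an illustration in plain Lua: the list/index layout, the key_of parameter and the table-based stand-in senders are assumptions, not part of the original manager.

```lua
-- Hypothetical layout: channel_id -> { list = {senders}, index = {key -> position} }
local function subscribe(subscribers, channel_id, sender, key_of)
    local channel = subscribers[channel_id]
    if not channel then
        -- Create the channel on first use.
        channel = { list = {}, index = {} }
        subscribers[channel_id] = channel
    end
    local key = key_of(sender)
    if channel.index[key] then
        return false -- already subscribed
    end
    channel.list[#channel.list + 1] = sender
    channel.index[key] = #channel.list
    return true
end

-- Stand-in key function and senders (plain tables instead of URL userdata).
local function key_of(u)
    return u.socket .. ":" .. u.path .. "#" .. u.fragment
end

local subs = {}
local first = { socket = 1, path = "/game", fragment = "gui" }
local again = { socket = 1, path = "/game", fragment = "gui" }

print(subscribe(subs, "score", first, key_of)) -- true
print(subscribe(subs, "score", again, key_of)) -- false
print(#subs.score.list)                        -- 1
```

Keeping the ordered list and the key index in separate tables avoids mixing string keys with the array part of the channel.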

The URL structure contains hashes, and we have a reverse-hashing feature in dev mode. We store every hash-call and reverse-map the string it was produced from, so that we can print out the originating string. This is a debug function, so it’s disabled for release-builds, which is why it’s not working in HTML5 (and would not work on iOS nor Android, etc).

I see. Perhaps it would be better to make this feature a bit more explicit? When the behaviour of debug and release builds differs, it often causes nasty surprises.

Your problem can be resolved in many different ways; the best one depends on your use case. If possible, you could key into your table by the path part alone (this would correspond to the game object id, if the URLs you are dealing with come from game objects).
Something like: local t = {[url.path] = some_value}

I will try this, thanks! Is it documented anywhere?

Something like: local t = {[url.path] = some_value}

It looks like I need tostring(url.socket) .. ":" .. tostring(url.path) .. "#" .. tostring(url.fragment) instead, which yields something like "<number>:hash: [<number> (unknown)]#hash: [<number> (unknown)]".

I wonder why URL’s tostring doesn’t do that, but returns "main:/(null)#(null)" instead.

Using table.concat as in my solution is faster and easier on the garbage collector than regular string concatenation. You also don’t need to call tostring on the socket; it is just a number. And while I haven’t benchmarked it, I would be surprised if hash_to_hex weren’t faster than tostring on a hash. What my solution lacks is handling for a URL without a fragment: hash_to_hex will fail because the fragment is nil. So in the end it needs to look like this:

(and the final hashing is optional if you want to use the whole string instead as key)

function url_to_hash(url)
	return hash(table.concat({url.socket, hash_to_hex(url.path), hash_to_hex(url.fragment or hash(""))}))
end
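For the variant that skips the final hashing and uses the whole string as the key, a rough sketch could look like the following. It assumes, as stated above, that socket is a number; the ":" separator, the EMPTY placeholder and the fallback for running outside the engine are my own illustrative choices.

```lua
-- Outside the engine there is no hash_to_hex global, so fall back to
-- tostring purely so this sketch can run in plain Lua (assumption).
local to_hex = hash_to_hex or function(h) return tostring(h) end

-- Placeholder used when the URL has no fragment (illustrative choice).
local EMPTY = "0000000000000000"

local function url_to_key(url)
    local fragment = url.fragment and to_hex(url.fragment) or EMPTY
    -- Assumes url.socket is a number, as described in this thread.
    return table.concat({ url.socket, to_hex(url.path), fragment }, ":")
end

-- Stand-in URL-like tables instead of real msg.url() objects.
print(url_to_key({ socket = 786443, path = "path", fragment = "frag" }))
print(url_to_key({ socket = 786443, path = "path" }))
```

The separator is there only to keep the three parts visually distinct; the fixed-width hex strings would already be unambiguous without it.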

The string going into the hash here looks like this:

msg.url() which contains the fragment gives: 786443248a28c478b88bae81d7db51da16c7fd
msg.url(".") which doesn’t have the fragment gives: 786443248a28c478b88bae0000000000000000

But I do think it is cleaner to pass around and use a hash (I assume 4 or 8 bytes) of this anyways, rather than these long strings.

Fredrik, thank you for your insight!

A couple of points:

I am not sure it is true that table.concat is faster and easier on the garbage collector here. You do create a garbage table, whereas several .. operators in a row (up to some limit which we do not hit) are compiled to a single bytecode instruction, which creates no intermediate garbage.

I find such things often counterintuitive. Benchmarks rule.

I am also not sure the processing cost and readability overhead of the extra hashing are worth it.

I looked it up, and you are right. table.concat is much, much faster when you are adding strings together in steps, like in a loop, but slower when concatenating them in one line like this. Good to know.
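For anyone who wants to measure this themselves, here is a small plain-Lua timing sketch of the two approaches for a one-line concatenation. The function names and the iteration count are arbitrary, and the numbers will vary by Lua implementation and machine, so treat it as a starting point rather than a definitive benchmark.

```lua
-- One-line concatenation with the .. operator.
local function with_operator(a, b, c)
    return a .. ":" .. b .. "#" .. c
end

-- Same result via table.concat, which allocates a temporary table per call.
local function with_concat(a, b, c)
    return table.concat({ a, ":", b, "#", c })
end

-- Time n calls of fn using CPU time from os.clock.
local function time_it(fn, n)
    local start = os.clock()
    for i = 1, n do
        fn(i, "path", "fragment")
    end
    return os.clock() - start
end

local n = 200000
print(string.format("operator:     %.3f s", time_it(with_operator, n)))
print(string.format("table.concat: %.3f s", time_it(with_concat, n)))
```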

Best thing would be if Defold simply made it so that we could use urls as indexers.
