
Provide recommendations to new implementations on which multibases should be considered baseline

Open Jorropo opened this issue 2 years ago • 4 comments

I recently started working on a new implementation and wasn't sure whether what I consider the required baseline is the consensus. After discussing it with a few people, there seems to be fair consensus on:

  • f, base16 (upper or lower, decoder case-insensitive). Hex is easy for monkeys to parse, and almost all apps already include a hex serializer and decoder anyway.
  • b, base32 (upper or lower, decoder case-insensitive). It is a good default: it is a power of 2, so it is fast-ish to encode and decode with bit twiddling, and it is case-insensitive, which fits well in the host section of the URI spec (most URI implementations canonicalize hosts to lowercase).
  • k, base36 (upper or lower, decoder case-insensitive). Like base32 it fits in the URI host section while being more compact, which lets you squeeze a few more bytes out of DNS records.
  • z, base58. TBH I don't know why people like it so much, but it's used in the wild a lot; it would be nice to add a rationale or remove it from this list.
  • u, base64url: power of 2, compact, and easy to implement while staying ASCII.
  • m, base64: power of 2; sees more use in the wild than base64url but isn't compatible with URLs.

~~We maybe also would want to include:~~

  • ~~Z, base58flickr~~
  • ~~v, base32hex~~
  • ~~h, base32z~~

~~Because they can be implemented as alphabet changes of other encodings that you should probably support anyway, which limits code size and complexity growth.~~ Edit: I have never seen them deployed in the wild. The argument still stands: people should implement them if they feel the cost is extremely low, but we don't care to recommend them.
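
To illustrate the "alphabet change" argument, here is a minimal Python sketch that derives base32hex from an existing base32 codec with nothing more than a character translation. The helper names are hypothetical, it leans on the standard library's `base64` module for plain base32, and it uses the lowercase, unpadded form (multibase prefix `v` for base32hex, not added here). It is one possible way to do it, not how any particular implementation works.

```python
import base64

B32 = "abcdefghijklmnopqrstuvwxyz234567"     # RFC 4648 base32, lowercased
B32HEX = "0123456789abcdefghijklmnopqrstuv"  # RFC 4648 base32hex, lowercased

_TO_HEX = str.maketrans(B32, B32HEX)
_FROM_HEX = str.maketrans(B32HEX, B32)


def base32_encode(data: bytes) -> str:
    """Plain base32, lowercase, padding stripped."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def base32hex_encode(data: bytes) -> str:
    """base32hex obtained purely by swapping the base32 alphabet."""
    return base32_encode(data).translate(_TO_HEX)


def base32hex_decode(text: str) -> bytes:
    """Case-insensitive base32hex decode via the plain base32 decoder."""
    low = text.lower()
    # Reject characters outside the base32hex alphabet up front; otherwise
    # they would pass through the translation untouched and could be
    # mistaken for valid plain-base32 characters.
    if any(ch not in B32HEX for ch in low):
        raise ValueError("invalid base32hex character")
    std = low.translate(_FROM_HEX)
    return base64.b32decode(std + "=" * (-len(std) % 8), casefold=True)
```

The only new code beyond the base32 codec is two translation tables and a validity check, which is the low cost the argument refers to.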


I think it would also be fair to say this is just the decoder list; for the encoder, as long as you support any of base16, base32, or base36, you are good.
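
As a rough sketch of what a decoder covering the list above could look like, here is a Python version that dispatches on the multibase prefix. The function names are hypothetical, the power-of-2 bases lean on Python's standard `base64` module, and base36/base58 use a naive big-integer decode. Treating each leading zero digit as one zero byte in base36 is an assumption carried over from how base58btc works, so check it against the implementation you need to interoperate with.

```python
import base64

BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"
BASE58BTC = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _decode_bigint(text: str, alphabet: str) -> bytes:
    """Decode a non-power-of-2 base by reading the text as one big integer."""
    num = 0
    for ch in text:
        digit = alphabet.find(ch)
        if digit < 0:
            raise ValueError(f"invalid character {ch!r}")
        num = num * len(alphabet) + digit
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # Assumption: each leading zero digit stands for one leading zero byte.
    zeros = len(text) - len(text.lstrip(alphabet[0]))
    return b"\x00" * zeros + body


def _pad(text: str, block: int) -> str:
    """Restore the '=' padding that the unpadded multibase forms omit."""
    return text + "=" * (-len(text) % block)


def multibase_decode(s: str) -> bytes:
    """Decode f/F, b/B, k/K, z, u and m multibase strings."""
    if not s:
        raise ValueError("empty multibase string")
    prefix, body = s[0], s[1:]
    if prefix in "fF":  # base16, decoded case-insensitively
        return base64.b16decode(body, casefold=True)
    if prefix in "bB":  # base32, decoded case-insensitively
        return base64.b32decode(_pad(body, 8), casefold=True)
    if prefix in "kK":  # base36, decoded case-insensitively
        return _decode_bigint(body.lower(), BASE36)
    if prefix == "z":   # base58btc, case-sensitive
        return _decode_bigint(body, BASE58BTC)
    if prefix == "u":   # base64url, no padding
        return base64.b64decode(_pad(body, 4), altchars=b"-_", validate=True)
    if prefix == "m":   # base64, no padding
        return base64.b64decode(_pad(body, 4), validate=True)
    raise ValueError(f"unsupported multibase prefix {prefix!r}")
```

For example, `multibase_decode("f796573206d616e692021")` and `multibase_decode("bpfsxgidnmfxgsibb")` both return `b"yes mani !"`. Malformed input surfaces as `ValueError` (the standard library's `binascii.Error` is a subclass of it).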

Jorropo avatar Oct 25 '23 07:10 Jorropo

Reasonable list, but it's also close to the ones listed as final in the table; maybe that's a good enough list.

base58btc is quite popular in crypto land, built for human-readability with the ambiguous characters removed; I've been surprised at the places it shows up, but mostly crypto-adjacent.

I've never seen base58flickr used in the wild or base32z or base32hex.

base64 I think I see more frequently than base64url; I'd include both in a baseline list.

rvagg avatar Oct 25 '23 08:10 rvagg

I've never seen base64 used in IPFS land because it is ambiguous in a URL and breaks the gateway API. If we consider applications outside of IPFS or multibase, I've seen base64pad used way more than any other alternative, but I don't know how padding is useful. I guess we could say all RFC 4648 bases plus k and z, but as you pointed out, I've never seen base32hex anywhere.

Jorropo avatar Oct 25 '23 08:10 Jorropo

https://ipld.io/specs/codecs/dag-json/spec/#bytes is one notable place for m

rvagg avatar Oct 27 '23 00:10 rvagg

Late to the party here, but I would add that base64url/u (not base64/m) is foundational in the OIDC world, so any multiformats system that wanted to be able to encode keys as, say, JWKs for OAuth and/or OIDC interop purposes might like to have that option? I feel like having an IPFS community-internal "baseline" that's a subset of final entries is fine, and could be published as, say, an informational IPIP, but if multiformats is trying for a move to IANA and a more general audience, base64url is generally useful!

See, for example: https://www.w3.org/TR/controller-document/#multibase-0 (permalink)

bumblefudge avatar Jun 27 '24 07:06 bumblefudge