ULIDs containing `ILOU` are parsed as weird timestamps
Hi! I'm currently writing a new Ruby library for handling ULIDs. Right now I'm testing examples from other implementations in https://github.com/kachick/ruby-ulid/issues/53.
I found some weird examples in the original repository, see https://github.com/ulid/javascript/pull/85.
So I also checked the parser of this library, since I'm using it in a Go project and it is so useful!
I used the command line tool as shown below. The version is https://github.com/oklog/ulid/tree/e7ac4de44d238ff4707cc84b9c98ae471f31e2d1
$ ulid -h
Usage: ulid [-hlqz] [-f <format>] [parameters ...]
-f, --format=<format> when parsing, show times in this format: default, rfc3339, unix, ms
-h, --help print this help text
-l, --local when parsing, show local time instead of UTC
-q, --quick when generating, use non-crypto-grade entropy
-z, --zero when generating, fix entropy to all-zeroes
$ ulid 01111111111111111111111111
Mon Dec 19 08:09:04.801 UTC 2005
$ ulid 0LLLLLLLLLLLLLLLLLLLLLLLLL # `L` is same as `1` in https://www.crockford.com/base32.html, but returned different value
Tue Aug 02 05:31:50.655 UTC 10889
$ ulid 0UUUUUUUUUUUUUUUUUUUUUUUUU # `U` is invalid in https://www.crockford.com/base32.html, but does not raise error
Tue Aug 02 05:31:50.655 UTC 10889
$ ulid 00000000000000000000000000
Thu Jan 01 00:00:00 UTC 1970
$ ulid 0OOOOOOOOOOOOOOOOOOOOOOOOO # `O` is same as `0` in https://www.crockford.com/base32.html, but returned different value
Tue Aug 02 05:31:50.655 UTC 10889
In my understanding, Crockford's base32 never produces L, I, or O in its encoded output. So I think ULID parsers can treat them as invalid values. ref: https://github.com/ulid/spec/issues/38, https://github.com/kachick/ruby-ulid/issues/57
Interesting. I think the relevant rules are
When decoding, upper and lower case letters are accepted, and i and l will be treated as 1 and o will be treated as 0. When encoding, only upper case letters are used.
Hyphens (-) can be inserted into symbol strings. This can partition a string into manageable pieces, improving readability by helping to prevent confusion. Hyphens are ignored during decoding.
I think we are not doing the bold parts.
@tsenart Think we can add those things?
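For reference, here is a minimal sketch of what those two decoding rules could look like in Go. `normalizeCrockford` is a hypothetical helper written for this comment, not something that exists in oklog/ulid, and it only uses the standard library. It drops hyphens, upper-cases the input, maps I and L to 1 and O to 0, and rejects everything else outside the alphabet (including U).

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeCrockford is a hypothetical helper, not part of oklog/ulid.
// It applies the lenient decoding rules quoted above: hyphens are dropped,
// letters are upper-cased, I and L become 1, and O becomes 0.
// U (and any other character outside the alphabet) is rejected.
func normalizeCrockford(s string) (string, error) {
	const alphabet = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
	var b strings.Builder
	for _, r := range strings.ToUpper(s) {
		switch r {
		case '-':
			continue // hyphens are ignored during decoding
		case 'I', 'L':
			r = '1'
		case 'O':
			r = '0'
		}
		if !strings.ContainsRune(alphabet, r) {
			return "", fmt.Errorf("invalid Crockford base32 character %q", r)
		}
		b.WriteRune(r)
	}
	return b.String(), nil
}

func main() {
	for _, in := range []string{
		"0LLLLLLLLLLLLLLLLLLLLLLLLL",
		"0ooooooooooooooooooooooooo",
		"01ARZ3NDEK-TSV4RRFFQ-69G5FAV",
		"0UUUUUUUUUUUUUUUUUUUUUUUUU",
	} {
		out, err := normalizeCrockford(in)
		fmt.Println(in, "->", out, err)
	}
}
```

With something like this in front of the existing parser, `0LLL...` would decode the same as `0111...`, and `0UUU...` would return an error instead of a timestamp.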
Thanks for your comment!
I think we are not doing the bold parts.
Agreed, and I think ignoring those rules is the desirable spec for actual use cases, rather than strictly following the original Crockford's base32.
So I have suggested it in https://github.com/ulid/spec/pull/57
Ah, yes, and to just make it explicit, you wrote
Especially when [implementations] accept [the] `iIlLoO` mapping, as [is suggested in the] original Crockford's base32 decoding spec, Lexicographically sortable is lost
which is a great point. Will wait for the outcome of that other PR...