ulid optimize parsing ulid

In this pull request, we have improved the performance of Parse. Instead of directly converting ULID to a [16]byte, it is converted through multiple uint64 variables. This change allows us to reduce the number of bit operations, leading to faster performance.

The result of benchmark is here:

goos: darwin
goarch: arm64
pkg: github.com/oklog/ulid/v2
               │   .old.txt   │              .new.txt               │
               │    sec/op    │   sec/op     vs base                │
Parse-10          9.158n ± 1%   6.051n ± 1%  -33.92% (p=0.000 n=10)
ParseStrict-10   14.515n ± 1%   6.106n ± 1%  -57.94% (p=0.000 n=10)
MustParse-10     10.095n ± 0%   7.076n ± 0%  -29.91% (p=0.000 n=10)
geomean           11.03n        6.394n       -42.03%

               │   .old.txt   │               .new.txt                │
               │     B/s      │     B/s       vs base                 │
Parse-10         2.644Gi ± 1%   4.002Gi ± 1%   +51.35% (p=0.000 n=10)
ParseStrict-10   1.668Gi ± 1%   3.966Gi ± 1%  +137.79% (p=0.000 n=10)
MustParse-10     2.399Gi ± 0%   3.422Gi ± 0%   +42.64% (p=0.000 n=10)
geomean          2.195Gi        3.787Gi        +72.51%

               │   .old.txt   │              .new.txt               │
               │     B/op     │    B/op     vs base                 │
Parse-10         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ParseStrict-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
MustParse-10     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                     ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

               │   .old.txt   │              .new.txt               │
               │  allocs/op   │ allocs/op   vs base                 │
Parse-10         0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ParseStrict-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
MustParse-10     0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                     ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Mar 04 '24 13:03 shogo82148

I've misunderstood something here, let me revert.

Mar 13 '24 14:03 peterbourgon

This code relies on the behavior of sign extension. Therefore, dec must be a negative number.

Mar 13 '24 23:03 shogo82148

Please look at this PR from the perspective of a future maintainer. Help them understand how this works, and why it's better than the previous approach, with documentation.

Apr 07 '24 03:04 peterbourgon

I updated the description of the pull request, and comments in the source. Can you take a look please?

Apr 07 '24 13:04 shogo82148

@peterbourgon I would even be happy to have better performance in my downstream applications, even if I do not understand all the bit juggling. Just saying.

Apr 15 '24 12:04 oderwat

Of course, but when this code is merged, it becomes my responsibility, so I need to understand it, even if you don't :)

Apr 15 '24 14:04 peterbourgon

@peterbourgon I see. Thank you for this very nice package. I hope you will find the time to understand what he did soon.

Apr 15 '24 14:04 oderwat

// We use -1 (all bits are set to 1) as sentinel value for invalid indexes.
// The reason for using -1 is that, even when cast, it does not lose the property that all bits are set to 1.
// e.g. uint64(int8(-1)) == 0xFFFFFFFFFFFFFFFF

It's simply not true that the value -1 means that "all bits are set to 1" -- exactly how a const value like -1 is represented in memory is subject to a bunch of subtle details related to types, host architecture, endian-ness, etc., which are not adequately documented or accommodated by this PR.

I appreciate the intent and effort here, but these changes are fragile and unreliable and not anything that I can commit to supporting or maintaining going forward.

Jun 09 '25 16:06 peterbourgon