optimize parsing ulid
In this pull request, we have improved the performance of Parse. Instead of directly converting ULID to a [16]byte, it is converted through multiple uint64 variables. This change allows us to reduce the number of bit operations, leading to faster performance.
The result of benchmark is here:
goos: darwin
goarch: arm64
pkg: github.com/oklog/ulid/v2
│ .old.txt │ .new.txt │
│ sec/op │ sec/op vs base │
Parse-10 9.158n ± 1% 6.051n ± 1% -33.92% (p=0.000 n=10)
ParseStrict-10 14.515n ± 1% 6.106n ± 1% -57.94% (p=0.000 n=10)
MustParse-10 10.095n ± 0% 7.076n ± 0% -29.91% (p=0.000 n=10)
geomean 11.03n 6.394n -42.03%
│ .old.txt │ .new.txt │
│ B/s │ B/s vs base │
Parse-10 2.644Gi ± 1% 4.002Gi ± 1% +51.35% (p=0.000 n=10)
ParseStrict-10 1.668Gi ± 1% 3.966Gi ± 1% +137.79% (p=0.000 n=10)
MustParse-10 2.399Gi ± 0% 3.422Gi ± 0% +42.64% (p=0.000 n=10)
geomean 2.195Gi 3.787Gi +72.51%
│ .old.txt │ .new.txt │
│ B/op │ B/op vs base │
Parse-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
ParseStrict-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
MustParse-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² +0.00% ²
¹ all samples are equal
² summaries must be >0 to compute geomean
│ .old.txt │ .new.txt │
│ allocs/op │ allocs/op vs base │
Parse-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
ParseStrict-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
MustParse-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² +0.00% ²
¹ all samples are equal
² summaries must be >0 to compute geomean
I've misunderstood something here, let me revert.
This code relies on the behavior of sign extension. Therefore, dec must be a negative number.
Please look at this PR from the perspective of a future maintainer. Help them understand how this works, and why it's better than the previous approach, with documentation.
I updated the description of the pull request, and comments in the source. Can you take a look please?
@peterbourgon I would even be happy to have better performance in my downstream applications, even if I do not understand all the bit juggling. Just saying.
Of course, but when this code is merged, it becomes my responsibility, so I need to understand it, even if you don't :)
@peterbourgon I see. Thank you for this very nice package. I hope you will find the time to understand what he did soon.
// We use -1 (all bits are set to 1) as sentinel value for invalid indexes.
// The reason for using -1 is that, even when cast, it does not lose the property that all bits are set to 1.
// e.g. uint64(int8(-1)) == 0xFFFFFFFFFFFFFFFF
It's simply not true that the value -1 means that "all bits are set to 1" -- exactly how a const value like -1 is represented in memory is subject to a bunch of subtle details related to types, host architecture, endian-ness, etc., which are not adequately documented or accommodated by this PR.
I appreciate the intent and effort here, but these changes are fragile and unreliable and not anything that I can commit to supporting or maintaining going forward.