Struct unmarshmaling and more
This is ongoing PR but i wanted to show what i'm working on right now.
There is 2 major changes here:
- Support for querying multiple keys at once
- Struct unmarshmaling, same way as
encoding/jsonworks (use feature above)
I added new API function KeyOffset which returns offsets in base data structure for all keys at once, so it can do less reads.
Here is example taken from benchmark:
r := smallFixture
offsets := jsonparser.KeyOffsets(r,
[]string{"uuid"},
[]string{"tz"},
[]string{"ua"},
[]string{"st"},
)
jsonparser.Get(r[offsets[0]:])
jsonparser.GetInt(r[offsets[1]:])
jsonparser.Get(r[offsets[2]:])
jsonparser.GetInt(r[offsets[3]:])
As you can see from UX perspective it looks way worse, but it can give significant speedup, see benchmarks below. As future improvement i wanted implement iterator like access, similar to ArrayEach which reduce number of reads even more.
Based on feature above, i implemented basic version of Unmarshaler, you can check the code and bechmarks (looks not bad, considering that it does not use code generation). As unique feature i plan add support for nested keys, so you can specify them like:
struct Person {
githubFollowers int `json:"person.github.followers_count"`
...
}
Here is initial benchmarks:
BenchmarkJsonParserLarge 100000 75014 ns/op 32 B/op 2 allocs/op
BenchmarkJsonParserLargeOffsets 100000 81327 ns/op 528 B/op 33 allocs/op
BenchmarkJsonParserMedium 500000 15577 ns/op 48 B/op 3 allocs/op
BenchmarkJsonParserMediumOffsets 1000000 7757 ns/op 80 B/op 4 allocs/op
BenchmarkJsonParserSmall 5000000 1289 ns/op 0 B/op 0 allocs/op
BenchmarkJsonParserSmallOffsets 10000000 1065 ns/op 32 B/op 1 allocs/op
BenchmarkJsonParserSmallStruct 2000000 3280 ns/op 384 B/op 11 allocs/op
KeyOffsets can be optimized much more in my view.
/cc @daboyuka @alexcesaro @ketilovre @adrianblynch @LrsK
This looks really cool. Thanks for keeping the offset-computing code separate so as to not slow down the simpler searchKeys case (this is nice for me since I'm not sure the multi-key search will be needed for my use case).
What if you made a wrapper around the core functions for a more user-friendly experience? E.g.:
lookup := NewMultiKeyLookup(data)
refs := lookup.MultiGet(
[]string{"a"},
[]string{"b"},
[]string{"c"},
)
val1 := lookup.Get(ref[0])
val2 := lookup.GetInt(ref[1])
val3 := lookup.Get(ref[2])
The refs returned by MultiGet would probably just be the offset to the value, but could be typedefed to be something meaningful (valueref or something).
The idea is to hide the need for slicing the data and calling Get() for each offset lookup inside the API.
Just thinking out loud, not sure this is actually a good idea :).
Yeah, this could be a good idea, and we can hide caching inside.
Well actually syntax above is not supported, as Go do not support variable list of returned arguments
Doh, you're right, I guess I meant refs := and then refs[0], etc. Edited.
I want to test this soon with some big data-sets and will supply the results here. Just by mere conversion from ffjson for inspections on JSON on multiple of my API's I'm seeing very reliable 9,21 (1k entries) to 9,38 (100k entries) times more speed and way lower footprint. Can't wait to try this out next! Thanks so much for this amazing contribution!
Update: Maybe there's also focus points here. These are the hotspots on my pprof:
(pprof) top5
1060ms of 1260ms total (84.13%)
Showing top 5 nodes out of 48 (cum >= 110ms)
flat flat% sum% cum cum%
470ms 37.30% 37.30% 470ms 37.30% github.com/buger/jsonparser.stringEnd
210ms 16.67% 53.97% 210ms 16.67% runtime.usleep
150ms 11.90% 65.87% 460ms 36.51% github.com/buger/jsonparser.blockEnd
140ms 11.11% 76.98% 320ms 25.40% github.com/buger/jsonparser.searchKeys
@Allendar also check this change that was merged yesterday https://github.com/buger/jsonparser/pull/45
Also, I'll be frank, this PR is not ready yet and supports only simple structs.
@buger Is work continuing on this effort?
Doh, you're right, I guess I meant
refs :=and thenrefs[0], etc. Edited.
Doh, you're right, I guess I meant
refs :=and thenrefs[0], etc. Edited.
Have you considered some variation on:
type ValueOffset int
type ValueOffsets []ValueOffset
lookup := jsonparser.ParseKeys(make([]ValueOffset, 3)[:],
[]string{"a"},
[]string{"b"},
[]string{"c"},
)
val1 := lookup.Get(0)
val2 := lookup.GetInt(1)
val3 := lookup.GetUnsafeString(2)