Simple test case is producing allocations, is this intended?
I'm not sure if I'm doing something wrong, but I was surprised this produced allocations:
BenchmarkParse
BenchmarkParse-8
300330 4085 ns/op 272 B/op 4 allocs/op
PASS
func BenchmarkEachKey(b *testing.B) {
b.ResetTimer()
for i := 0; i < b.N; i++ {
var id string
var bidId string
var bidImpId string
var bidPrice string
var bidAdm string
var cur string
var foo string
bodyBytes := stringHelper.StringAsBytes(BidResponse_Banner)
jsonparser.EachKey(bodyBytes, func(idx int, value []byte, vt jsonparser.ValueType, err error) {
switch idx {
case 0:
id = stringHelper.BytesAsString(value)
case 1:
bidId = stringHelper.BytesAsString(value)
case 2:
bidImpId = stringHelper.BytesAsString(value)
case 3:
bidPrice = stringHelper.BytesAsString(value)
case 4:
bidAdm = stringHelper.BytesAsString(value)
case 5:
cur = stringHelper.BytesAsString(value)
case 6:
foo = stringHelper.BytesAsString(value)
}
},
[][]string{
{"id"},
{"seatbid", "[0]", "bid", "[0]", "id"},
{"seatbid", "[0]", "bid", "[0]", "impid"},
{"seatbid", "[0]", "bid", "[0]", "price"},
{"seatbid", "[0]", "bid", "[0]", "adm"},
{"cur"},
{"foo"},
}...,
)
if "9f3701ef-1d8a-4b67-a601-e9a1a2fbcb7c" != id ||
"0" != bidId ||
"1" != bidImpId ||
"0.35258" != bidPrice ||
`some \"html\" code\nnext line` != bidAdm ||
"USD" != cur ||
"" != foo {
panic("ack")
}
}
}
const BidResponse_Banner = `
{
"id":"9f3701ef-1d8a-4b67-a601-e9a1a2fbcb7c",
"seatbid":[
{
"bid":[
{
"id":"0",
"adomain":[
"someurl.com",
"someotherurl.com"
],
"impid":"1",
"price":0.35258,
"adm":"some \"html\" code\nnext line",
"w":300,
"h":250
}
]
}
],
"cur":"USD"
}
`
This is happening inside the jsonparser.EachKey function:
ROUTINE ======================== github.com/buger/jsonparser.EachKey in C:\Users\rhaksi\go\pkg\mod\github.com\buger\[email protected]\parser.go
69MB 69MB (flat, cum) 97.85% of Total
. . 389: if len(p) > maxPath {
. . 390: maxPath = len(p)
. . 391: }
. . 392: }
. . 393:
19.50MB 19.50MB 394: pathsBuf := make([]string, maxPath)
. . 395:
. . 396: for i < ln {
. . 397: switch data[i] {
. . 398: case '"':
. . 399: i++
. . 400: keyBegin := i
. . 401:
. . 402: strEnd, keyEscaped := stringEnd(data[i:])
. . 403: if strEnd == -1 {
. . 404: return -1
. . 405: }
. . 406: i += strEnd
. . 407:
. . 408: keyEnd := i - 1
. . 409:
. . 410: valueOffset := nextToken(data[i:])
. . 411: if valueOffset == -1 {
. . 412: return -1
. . 413: }
. . 414:
. . 415: i += valueOffset
. . 416:
. . 417: // if string is a key, and key level match
. . 418: if data[i] == ':' {
. . 419: match := -1
. . 420: key := data[keyBegin:keyEnd]
. . 421:
. . 422: // for unescape: if there are no escape sequences, this is cheap; if there are, it is a
. . 423: // bit more expensive, but causes no allocations unless len(key) > unescapeStackBufSize
. . 424: var keyUnesc []byte
49.50MB 49.50MB 425: var stackbuf [unescapeStackBufSize]byte
. . 426: if !keyEscaped {
. . 427: keyUnesc = key
. . 428: } else if ku, err := Unescape(key, stackbuf[:]); err != nil {
. . 429: return -1
. . 430: } else {
DOH: Just noticed, a merge just went in that fixes var stackbuf [unescapeStackBufSize]byte, I have updated and the new code doesnt have this allocation happening.
Interesting 🤔
This change was part of the code to actually reduce allocations https://github.com/buger/jsonparser/pull/208
Interesting 🤔
This change was part of the code to actually reduce allocations #208
Huh, no idea why that fixed things, but you're right, your last released update fixes one of my 4 allocations, the other 3 can be addressed by:
For this line of code:
19.50MB 19.50MB 394: pathsBuf := make([]string, maxPath)
The problem is that variable sized allocations can never live on the stack, the standard go pattern for this is:
pathsBuf := make([]string, 64)[:0]
// add an item with:
pathsBuf = append(pathsBuf, foo)
In your case you could do:
pathsBuf := make([]string, 128)[:]
if maxPath > cap(pathsBuf) {
pathsBuf = make([]string, maxPath)[:]
}
pathsBuf = pathsBuf[0:maxPath]
Similarly for: pathFlags := make([]bool, len(paths)) and pIdxFlags := make([]bool, len(paths))
#223
PS: Ugh, I missed pIdxFlags, which when changed similarly via :
pIdxFlags := make([]bool, stackArraySize)[:]
if len(paths) > cap(pIdxFlags) {
pIdxFlags = make([]bool, len(paths))[:]
}
pIdxFlags = pIdxFlags[0:len(paths)]
Brings the allocation count to 0 in my test, all in giving a 10% ish speed bump.
@xsandr since you were the person who made the previous change, can you help with a review here? Thank you!
On a semi-related note, for ease of use I wrap the EachKeys function with a helper to do field bindings and costlessly marshal directly to a struct:
One side affect of the EachKey interface having a lambda/closure style is that the user provided function cant be inlined, I'd be curious how much difference it would make if rather than invoking a user lamba function per iteration it instead returned a single array of results and let the user iterate.
ie
results, err := jsonparser.EachKeys(bodyBytes, make([]jsonparser.EachKeysResult, 64)[:0], paths...)
for i:=0; i<len(results); i++) {
switch(results[i].Idx)
case 0:
id := BytesAsString(results[i].Value)
//etc
}
the current EachKey implementation could then call the above one so that external users can still use the lambda function style pattern.
I'm sad to report that I cant achieve nearly the same performance as I can with jsoniterator, although with the fixes EachKey having zero allocations is really nice. (ie jsoniterator is roughly 2X faster for my simple benchmark)
BenchmarkJsonIterator-8
1695078 2139 ns/op 832 B/op 18 allocs/op
BenchmarkEachKey-8
924060 3751 ns/op 96 B/op 3 allocs/op
See: https://github.com/Villenny/jsonparserHelper-go for my helper module. Note how parser generation is its own step prior to use, this means that you can create the parser once at program init time, so it doesnt matter if it has allocations, which means that you can build lookup maps in it to achieve O(1) mappings. I didnt go so far as to reflect the struct to populate the parser, but that would be fairly easy.
IMHO, although its more technically correct to use arrays of keys, storing paths as a single string would allow path lookups in a map trivially, and play nicer with go's escape analysis.
Simplest use case for my helper is:
type TestResult struct {
us string
i64 int64
i int
f float64
}
parser := MakeParser(make([]Parser, 32)[:0]).
Add(UnsafeStringValue(unsafe.Offsetof(TestResult{}.us)), "us").
Add(Int64Value(unsafe.Offsetof(TestResult{}.i64)), "i64").
Add(IntValue(unsafe.Offsetof(TestResult{}.i)), "i").
Add(Float64Value(unsafe.Offsetof(TestResult{}.f)), "f")
var results TestResult
err := RunParser([]byte(`{
"us": "some string",
"i64": "32"
"i": "64"
"f": "3.14"
}`), parser, unsafe.Pointer(&results))