Replace Buffer with Uint8Array
I still need to do some cleanups and replace uses of Buffer.compare, but I'm putting this PR here so you can benchmark it. Before I can complete this, #428 (minus the version bump) should be merged in, as the services code uses Buffer everywhere and expects fixed/bytes types to decode to Buffers.
I've made some tweaks to the benchmarking setup, so comparing this directly to the master branch won't work:
- I've added an
ArrayFloatbenchmark to go along withArrayDouble. - I use the
--expose-gcflag when benchmarking in order to manually trigger garbage collection between benches, hopefully making results more consistent. - I've changed the length distribution of strings to be exponentially weighted, so shorter strings are still likely but longer strings will now be occasionally generated. The previous code was only benchmarking the manual path of
Tap#writeString, since it only generated strings up to a length of 32.
I've cherry-picked those benchmarking changes into the bench-tweaks branch, which you can use to compare benchmarks.
Thanks @valadaptive! I'll try to find time to merge #428.
FYI @valadaptive - #428 is in.
Working on removing Buffer usage from types.js now. I noticed the isJsonBuffer function, which seems to check if a given object is the JSON representation of a Buffer. Under what circumstances are Avro types directly serialized to JSON and/or parsed back directly? I can't easily polyfill Uint8Array to stringify to a regular array, so I'll probably need to insert a fixup step when stringifying/parsing.
@mtth Do you intend to support the ability to roundtrip various Avro types to/from JSON? With the current representation that uses Buffers, this works mostly fine, but with Uint8Array, the JSON representation is a lot more bloated:
> JSON.stringify(Buffer.from([1, 2, 3, 4, 5]))
'{"type":"Buffer","data":[1,2,3,4,5]}'
> JSON.stringify(new Uint8Array([1, 2, 3, 4, 5]))
'{"0":1,"1":2,"2":3,"3":4,"4":5}'
I lean towards removing the coerceBuffers option entirely to discourage people from serializing Uint8Arrays to JSON.
Do you intend to support the ability to roundtrip various Avro types to/from JSON?
It's nice to have, but I'm OK dropping coerceBuffers if it adds significant complexity.
It's nice to have, but I'm OK dropping
coerceBuffersif it adds significant complexity.
I believe it works right now in terms of recognizing Buffers. However, since we now use Uint8Arrays instead of Buffers in types, they'll serialize to much larger objects in JSON (e.g. Uint8Array([1, 2, 3, 4, 5] is serialized as '{"0":1,"1":2,"2":3,"3":4,"4":5}').
I can either try to implement code to recognize JSON'd Uint8Arrays, or remove coerceBuffers entirely. Leaving it as-is would be a huge footgun, because you can no longer round-trip Avro types through JSON (it'll serialize Uint8Arrays to a JSON representation it cannot itself recognize as a Uint8Array).
Short reference to https://gist.github.com/joscha/d8603a1f0af5b0b055546c792b2a8ff6, which can be updated once this pull request lands.