Reimplement in terms of streaming instead of fixed slices
I know this is a huge PR, but in theory it solves a major pain point I've been experiencing with very large MessagePack values.
Instead of reading hundreds of megabytes into a single byte slice and decoding the entire thing at once, this now offers an API to stream the MessagePack value from storage and decode it on the fly.
Similarly, instead of encoding the entire struct into a single giant byte slice before shipping it off to storage, this allows the encoding to be done in a streaming fashion.
My application runs in an environment where it is difficult to hold multiple copies of the data in memory at the same time, so streaming should greatly alleviate memory pressure.
As it stands, this PR should be fully compatible with the previous implementation except for type extensions, whose API had to be modified to support the new mode of operation.
I actually meant to create this as a PR against my own repo since I haven't thoroughly tested it against my use case yet, but since GitHub decided to open it here, I wrote up this description to make the purpose of these changes clearer.