[idea] compression with schema

Open cometkim opened this issue 4 years ago • 5 comments

If the schema of the message is fixed, more efficient compression is possible.

Keep msgpack as the underlying encoding and build a treemap that maps each key to a shorter name. The decoder will need the same schema to restore the original message.
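A minimal sketch of the key-shortening idea, assuming the schema is nothing more than an ordered list of known keys. The names `buildKeyMaps` and `renameKeys` are hypothetical and not part of urlpack, and plain objects stand in for the treemap. The shortened object would still be handed to msgpack afterwards.

```typescript
type Message = Record<string, unknown>;
type KeyMap = Record<string, string>;

// Derive a short alias for every key from its position in the schema.
function buildKeyMaps(schema: string[]): { toShort: KeyMap; toLong: KeyMap } {
  const toShort: KeyMap = {};
  const toLong: KeyMap = {};
  schema.forEach((key, index) => {
    const alias = index.toString(36); // "0", "1", ..., "z", "10", ...
    toShort[key] = alias;
    toLong[alias] = key;
  });
  return { toShort, toLong };
}

// Rename every key through the mapping; unknown keys are rejected.
function renameKeys(message: Message, mapping: KeyMap): Message {
  const result: Message = {};
  for (const key of Object.keys(message)) {
    if (!Object.prototype.hasOwnProperty.call(mapping, key)) {
      throw new Error("Key not in schema: " + key);
    }
    result[mapping[key]] = message[key];
  }
  return result;
}

// Assumed example schema and message.
const schema = ["userId", "displayName", "isAdmin"];
const { toShort, toLong } = buildKeyMaps(schema);

const original: Message = { userId: 42, displayName: "kim", isAdmin: false };
const shortened = renameKeys(original, toShort); // keys become "0", "1", "2"
const restored = renameKeys(shortened, toLong);  // same keys as `original`
```

Both sides must build the maps from the same schema in the same key order, otherwise the aliases will not line up.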

cometkim avatar Aug 04 '21 05:08 cometkim

This is for compression, not validation, so it will probably need to integrate with external validators.

The name json (with) schema is not appropriate because it is likely to be confused with the JSON Schema spec, which contains advanced features such as references or support for dynamic messages.

Maybe a name like fixjson? Other ideas are welcome.

cometkim avatar Aug 04 '21 06:08 cometkim

@cometkim Do you think it would use the extension type from the MessagePack spec?

Or would this package's compression logic run first, with the result then encoded according to the MessagePack specification, like this?

encode(json, schema);

kyoungduck avatar Aug 04 '21 06:08 kyoungduck

The schema can be bound in constructors, so there is no need to pass it as an argument.
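One way this could look, as a sketch under assumptions: a hypothetical `SchemaCodec` class (not an existing urlpack API) takes the key list once in its constructor, so `encode` and `decode` need no schema argument. Here the codec goes one step further than short key names and drops keys entirely, emitting values in schema order. That is only one possible interpretation, and the resulting array would still go through msgpack.

```typescript
// Hypothetical codec: the schema is bound once, at construction time.
class SchemaCodec {
  private readonly keys: string[];

  constructor(keys: string[]) {
    this.keys = keys.slice(); // copy so later mutation by the caller has no effect
  }

  // Replace the object with an array of its values in schema order.
  encode(message: Record<string, unknown>): unknown[] {
    return this.keys.map((key) => message[key]);
  }

  // Rebuild the object by pairing each value with the key at the same index.
  decode(values: unknown[]): Record<string, unknown> {
    if (values.length !== this.keys.length) {
      throw new Error("Value count does not match schema");
    }
    const message: Record<string, unknown> = {};
    this.keys.forEach((key, index) => {
      message[key] = values[index];
    });
    return message;
  }
}

const codec = new SchemaCodec(["userId", "displayName", "isAdmin"]);
const packed = codec.encode({ userId: 42, displayName: "kim", isAdmin: false });
const unpacked = codec.decode(packed);
```

Keys missing from a message come out as `undefined` in this sketch, so optional fields would need an explicit convention.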

I also don't think extension support will be needed, since the schema user can make more assumptions about the message before using it. A subset would be enough.

One common rule is that efficiency and flexibility are usually trade-offs.

cometkim avatar Aug 04 '21 12:08 cometkim

Another interesting idea: beyond simply mapping key names, this can also flatten nested structures.
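A rough sketch of what flattening could mean, assuming JSON-like data and keys that contain no "." character. The helpers `flatten` and `unflatten` are hypothetical. With a fixed schema the full key paths are known in advance, so each path could then be given a short alias just like a top-level key.

```typescript
type Flat = Record<string, unknown>;

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Collapse nested objects into a single level, joining key paths with ".".
// Arrays and primitives are kept as leaf values.
function flatten(value: Record<string, unknown>, prefix: string = "", out: Flat = {}): Flat {
  for (const key of Object.keys(value)) {
    const path = prefix === "" ? key : prefix + "." + key;
    const child = value[key];
    if (isPlainObject(child)) {
      flatten(child, path, out);
    } else {
      out[path] = child;
    }
  }
  return out;
}

// Rebuild the nested structure by walking each dotted path.
function unflatten(flat: Flat): Record<string, unknown> {
  const root: Record<string, unknown> = {};
  for (const path of Object.keys(flat)) {
    const parts = path.split(".");
    const leaf = parts.pop() as string; // split always yields at least one part
    let node = root;
    for (const part of parts) {
      const next = node[part];
      if (isPlainObject(next)) {
        node = next;
      } else {
        const created: Record<string, unknown> = {};
        node[part] = created;
        node = created;
      }
    }
    node[leaf] = flat[path];
  }
  return root;
}

const nested = { user: { id: 42, name: { first: "Kim" } }, tags: ["a", "b"] };
const flat = flatten(nested);   // keys: "user.id", "user.name.first", "tags"
const rebuilt = unflatten(flat);
```

Empty nested objects produce no entries in this sketch and so would not survive the round trip; a real design would need a rule for them.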

Given that we currently have poor performance for complex structures, this could be a good breakthrough.

cometkim avatar Aug 04 '21 17:08 cometkim

Another idea: LZ-based compression for large strings (str8, str16, str32...)

cometkim avatar Oct 03 '21 16:10 cometkim