
JsonParser should provide access to underlying UTF8 stream

Open jjspiegel opened this issue 5 years ago • 0 comments

This is an enhancement idea for the next version of JSON-P.

UTF-8 is the predominant encoding for JSON text. From https://tools.ietf.org/html/rfc8259:

JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8

Currently, JsonParser only exposes JSON string values as java.lang.String (see JsonParser#getString()).

This conversion from UTF-8 to java.lang.String is typically expensive in both processing time and memory. Some consumers also convert the JSON to another format (e.g. a storage format for a database), in which case the java.lang.String value is immediately re-serialized anyway.

javax.xml.stream.XMLStreamReader provides methods like getTextLength() and getTextCharacters() that allow the consumer to avoid String materialization.
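For reference, here is a minimal sketch of how a consumer can use the existing XMLStreamReader methods to copy text through a small reusable buffer. The class name XmlTextCopy, the helper collectText, and the 8-char buffer size are made up for illustration. The StringBuilder only makes the result easy to print; a real consumer would hand each chunk to its own sink instead.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class XmlTextCopy {
    /** Copies every CHARACTERS event into a sink via a fixed-size char buffer. */
    static String collectText(String xml) {
        char[] buf = new char[8];               // reused for every chunk
        StringBuilder sink = new StringBuilder(); // stand-in for the consumer's real sink
        try {
            XMLStreamReader reader =
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.CHARACTERS) {
                    continue;
                }
                int total = reader.getTextLength();
                int copied = 0;
                while (copied < total) {
                    int want = Math.min(buf.length, total - copied);
                    // sourceStart is relative to the start of the current text event
                    int n = reader.getTextCharacters(copied, buf, 0, want);
                    if (n <= 0) {
                        break; // defensive: avoid spinning if nothing was copied
                    }
                    sink.append(buf, 0, n);
                    copied += n;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(collectText("<a>hello <b>streaming world</b></a>"));
    }
}
```

The point is that no per-event String is allocated by the reader; the consumer decides where the characters go.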

I think JsonParser could have similar methods that expose the underlying UTF-8 value. One option would be something like XMLStreamReader:

int getUTF8Length();
int getUTF8(int sourceStart, byte[] target, int targetStart, int length);

Another option would be something like:

getString(OutputStream out)

This would write the string value of the current VALUE_STRING event to the stream as UTF-8.
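To make the two options concrete, here is a rough sketch of how they might look from the consumer's side. None of this is JSON-P API: the Utf8Value interface, the SliceValue class, and the two copy helpers are hypothetical names for illustration only. The toy implementation also assumes the string value contains no escape sequences, so its bytes can be served straight from the input buffer; a real parser would have to decode escapes such as \n or \uXXXX first.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8ValueSketch {
    /** Hypothetical shape of the proposed accessors (not part of JSON-P). */
    interface Utf8Value {
        int getUTF8Length();
        int getUTF8(int sourceStart, byte[] target, int targetStart, int length);
        void getString(OutputStream out) throws IOException;
    }

    /** Toy implementation: a slice of the UTF-8 input holding an escape-free string value. */
    static final class SliceValue implements Utf8Value {
        private final byte[] doc;
        private final int valueStart;
        private final int valueLength;

        SliceValue(byte[] doc, int valueStart, int valueLength) {
            this.doc = doc;
            this.valueStart = valueStart;
            this.valueLength = valueLength;
        }

        @Override public int getUTF8Length() {
            return valueLength;
        }

        @Override public int getUTF8(int sourceStart, byte[] target, int targetStart, int length) {
            if (sourceStart < 0 || sourceStart > valueLength) {
                throw new IndexOutOfBoundsException("sourceStart: " + sourceStart);
            }
            int n = Math.min(length, valueLength - sourceStart);
            System.arraycopy(doc, valueStart + sourceStart, target, targetStart, n);
            return n; // number of bytes actually copied
        }

        @Override public void getString(OutputStream out) throws IOException {
            out.write(doc, valueStart, valueLength);
        }
    }

    /** Option 1: pull the value through a reusable byte buffer (chunkSize must be > 0). */
    static byte[] copyInChunks(Utf8Value value, int chunkSize) {
        byte[] chunk = new byte[chunkSize];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int copied = 0;
        while (copied < value.getUTF8Length()) {
            int n = value.getUTF8(copied, chunk, 0, chunk.length);
            sink.write(chunk, 0, n);
            copied += n;
        }
        return sink.toByteArray();
    }

    /** Option 2: let the value write itself to an OutputStream. */
    static byte[] copyViaStream(Utf8Value value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            value.getString(sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] doc = "{\"k\":\"h\u00e9llo\"}".getBytes(StandardCharsets.UTF_8);
        Utf8Value value = new SliceValue(doc, 6, 6); // offsets located by hand for this demo
        System.out.println(new String(copyInChunks(value, 4), StandardCharsets.UTF_8));
        System.out.println(new String(copyViaStream(value), StandardCharsets.UTF_8));
    }
}
```

In both variants the only String created is the one built in main for printing; the value itself stays as bytes from input to sink.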

jjspiegel avatar Aug 26 '20 20:08 jjspiegel