Support encoding/decoding bytes to/from hex strings

Open iliastsa opened this issue 2 months ago • 2 comments

Description

It would be great if we could decode hex strings to bytes (and, similarly, encode bytes to hex strings). For example:

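from msgspec import Struct
from msgspec.json import decode

class MyClass(Struct):
    data: bytes

def main():
    t = MyClass(data=b"abc")

    # Somehow configure the Struct field so that it is treated as a hex-encoded byte string.
    # Right now, the input is expected to be a base64-encoded byte string.
    tt = decode(b'{"data":"0x616263"}', type=MyClass)

    # Desired behavior: the hex string decodes to the same bytes.
    assert t == tt

if __name__ == "__main__":
    main()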

I know this is doable using a dec_hook/enc_hook, but at my job we decode a huge number of such fields, and the hooks take up a non-trivial share of our decoding time. Moving this logic directly into the msgspec library should provide a significant performance boost.
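
For reference, the hook-based workaround mentioned above might look roughly like the sketch below. `HexBytes` is a hypothetical wrapper class (not part of msgspec), needed because a field annotated as plain `bytes` is handled natively and never reaches `dec_hook`. The hook signatures follow msgspec's `enc_hook(obj)` and `dec_hook(type, obj)` conventions, and the `0x` prefix is assumed from the example payload; the functions themselves use only the standard library.

```python
from typing import Any


class HexBytes:
    """Hypothetical wrapper marking bytes that travel as a hex string."""

    __slots__ = ("data",)

    def __init__(self, data: bytes) -> None:
        self.data = data

    def __eq__(self, other: object) -> bool:
        return isinstance(other, HexBytes) and self.data == other.data

    def __repr__(self) -> str:
        return f"HexBytes({self.data!r})"


def enc_hook(obj: Any) -> Any:
    # Called for objects the encoder does not support natively.
    if isinstance(obj, HexBytes):
        # Assumption: emit a "0x" prefix, as in the example payload.
        return "0x" + obj.data.hex()
    raise NotImplementedError(f"Objects of type {type(obj)} are not supported")


def dec_hook(type_: type, obj: Any) -> Any:
    # Called with the annotated type and the already-decoded JSON value.
    if type_ is HexBytes:
        if not isinstance(obj, str):
            raise TypeError(f"Expected a hex string, got {type(obj).__name__}")
        # Accept the value with or without a "0x" prefix.
        digits = obj[2:] if obj.startswith(("0x", "0X")) else obj
        return HexBytes(bytes.fromhex(digits))
    raise NotImplementedError(f"Objects of type {type_} are not supported")
```

With a Struct that declares `data: HexBytes`, these would be passed as `msgspec.json.encode(obj, enc_hook=enc_hook)` and `msgspec.json.decode(buf, type=MyClass, dec_hook=dec_hook)`. Both hooks run in Python once per such field, which is the overhead described above.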

iliastsa avatar Dec 03 '25 15:12 iliastsa

I believe this should be relatively straightforward. We could add another option to Meta to toggle this, e.g.

class MyClass(Struct):
    data: Annotated[bytes, Meta(bytes_as_hex=True)]

During encode, this would encode bytes objects as hex strings, and during decode, it would turn hex strings into a bytes object.

provinzkraut avatar Dec 03 '25 16:12 provinzkraut

That would be great! Maybe a Meta(bytes_format="base64" | "hex") would be clearer.
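
To make the two proposed formats concrete, here is a minimal standard-library sketch of what each `bytes_format` value could mean for the same payload. The helper names `encode_bytes` and `decode_bytes` are hypothetical, `bytes_format` is only the option suggested in this thread (it does not exist in msgspec), and emitting a `0x` prefix for hex is an assumption taken from the example in the description.

```python
import base64


def encode_bytes(data: bytes, bytes_format: str = "base64") -> str:
    """Hypothetical helper: render bytes in the requested string format."""
    if bytes_format == "base64":
        return base64.b64encode(data).decode("ascii")
    if bytes_format == "hex":
        # Assumption: hex output carries a "0x" prefix.
        return "0x" + data.hex()
    raise ValueError(f"Unknown bytes_format: {bytes_format!r}")


def decode_bytes(text: str, bytes_format: str = "base64") -> bytes:
    """Hypothetical helper: parse a string in the requested format."""
    if bytes_format == "base64":
        return base64.b64decode(text, validate=True)
    if bytes_format == "hex":
        # Accept the value with or without a "0x" prefix.
        digits = text[2:] if text.startswith(("0x", "0X")) else text
        return bytes.fromhex(digits)
    raise ValueError(f"Unknown bytes_format: {bytes_format!r}")


print(encode_bytes(b"abc", "base64"))  # YWJj
print(encode_bytes(b"abc", "hex"))     # 0x616263
```

A single string-valued option like this would leave room for further formats later, whereas a boolean `bytes_as_hex` only distinguishes two.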

iliastsa avatar Dec 04 '25 08:12 iliastsa