BINARY types are decoded to UTF-8 on python3

Open jmacdonald-Ocient opened this issue 5 years ago • 3 comments

The _to_binary converter currently passes the output of getObject to the string constructor:

def _to_binary(rs, col):
    java_val = rs.getObject(col)
    if java_val is None:
        return
    return str(java_val)

With Python 3 this causes the binary java_val to be decoded as UTF-8, which loses data if the value contains byte sequences that are not valid UTF-8: each offending byte is replaced with the replacement character U+FFFD (ef bf bd when re-encoded as UTF-8).
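The effect can be reproduced in plain Python without jaydebeapi or a JVM. The snippet below is only an illustration of lossy UTF-8 decoding; the sample bytes are made up, and `errors="replace"` is used here as a stand-in for whatever decoding happens under `str(java_val)`.

```python
# Illustration only: plain Python, no jaydebeapi or JVM involved.
# The sample bytes are arbitrary; 0xFF and a lone 0x80 are not valid UTF-8.
raw = bytes([0xFF, 0x00, 0x80])

# Decode with replacement, as a stand-in for the lossy conversion to str.
decoded = raw.decode("utf-8", errors="replace")

# Re-encoding does not give the original bytes back.
round_tripped = decoded.encode("utf-8")

print(raw.hex())            # ff0080
print(round_tripped.hex())  # efbfbd00efbfbd
```

Both invalid bytes come back as ef bf bd, so the original value cannot be recovered from the string.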

Would you be open to a PR that disables the decoding when not PY2, or alternatively adds an option on the connection or cursor?

jmacdonald-Ocient avatar Jun 03 '20 16:06 jmacdonald-Ocient

Thanks @jmacdonald-Ocient, sure, I'm open to such a PR. Please also provide tests, ideally in test_mock or, if you can't get your head around that, at least a test in test_integration.py.

baztian avatar Jun 03 '20 18:06 baztian

Would it be sensible to return a bytearray from _to_binary?

psukys avatar Sep 14 '20 12:09 psukys
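For reference, a minimal sketch of what a non-decoding converter could look like. This is not jaydebeapi's code: `to_binary_raw`, `signed_bytes_to_bytes`, and `FakeResultSet` are hypothetical names, and the sketch assumes the value returned by getObject can be iterated as signed Java byte values (-128..127), which may not hold for every driver or bridge.

```python
def signed_bytes_to_bytes(values):
    """Map signed Java-style byte values (-128..127) to a Python bytes object."""
    return bytes(v & 0xFF for v in values)


def to_binary_raw(rs, col):
    """Hypothetical variant of _to_binary that skips text decoding."""
    java_val = rs.getObject(col)
    if java_val is None:
        return None
    # Assumption: java_val is iterable and yields signed byte values.
    return signed_bytes_to_bytes(java_val)


class FakeResultSet:
    """Stand-in for a JDBC ResultSet, for illustration only."""

    def __init__(self, row):
        self._row = row

    def getObject(self, col):
        # JDBC column indexes start at 1.
        return self._row[col - 1]


rs = FakeResultSet([[-1, 0, -128, 65], None])
raw = to_binary_raw(rs, 1)
print(raw)             # b'\xff\x00\x80A'
print(bytearray(raw))  # mutable copy, if a bytearray is preferred
```

Returning bytes keeps the value immutable and hashable; wrapping the result in bytearray is a one-line change if a mutable buffer is wanted.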