BINARY types are decoded as UTF-8 on Python 3
The _to_binary converter currently passes the output of getObject to the string constructor:
    def _to_binary(rs, col):
        java_val = rs.getObject(col)
        if java_val is None:
            return
        return str(java_val)
With Python 3, this causes the binary java_val to be decoded as UTF-8, which loses data if the value contains byte sequences that are not valid UTF-8: those bytes are replaced with the replacement character U+FFFD (efbfbd when re-encoded as UTF-8).
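To illustrate the kind of loss I mean, here is a minimal pure-Python sketch. It does not go through the driver or the JVM bridge at all; it only assumes the conversion behaves like a UTF-8 decode with replacement, and the sample bytes are made up for the example.

```python
# Hypothetical sample: two of these four bytes are not valid UTF-8 on their own.
raw = bytes([0x00, 0xFF, 0x80, 0x41])

# Decoding with replacement swaps each invalid byte for U+FFFD.
decoded = raw.decode("utf-8", errors="replace")

# Re-encoding cannot recover the original bytes.
round_tripped = decoded.encode("utf-8")

print(raw.hex())            # 00ff8041
print(round_tripped.hex())  # 00efbfbdefbfbd41
```

The original 0xFF and 0x80 are gone after the round trip, so the caller has no way to get the stored value back.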
Would you be open to a PR that disables the decoding when not PY2, or alternatively one that adds an option on the connection or cursor?
Thanks @jmacdonald-Ocient, sure, I'm open to such a PR. Please also provide tests, ideally in test_mock or, if you can't get your head around that, at least a test in test_integration.py.
Would it be sensible to return a bytearray from _to_binary?