HWI icon indicating copy to clipboard operation
HWI copied to clipboard

relax str input restriction to accept also bytes

Open fametrano opened this issue 5 years ago • 1 comments

(1) There is some confusion about base64.b64encode and base64.b64decode, mostly originating from Python2.

Anyway, in Python3 b64decode accepts both str and bytes-like and returns bytes:

Python 3.8.6 (default, Nov 24 2020, 12:38:16) 
>>> from base64 import b64encode, b64decode
>>> b64decode('test')
b'\xb5\xeb-'
>>> b64decode(b'test')
b'\xb5\xeb-'

Similarly, there is no good reason to limit the input of Psbt.deserialize to just str.

(2) Returning a str from Psbt.serialize can be streamlined removing the redundant HexToBase64 function and the binascii dependance. I also clarified that the .decode is .decode('ascii').

Last, but not least, in Python3 b64encode accepts bytes-like and returns bytes, not str:

>>> b64encode(b'\xb5\xeb-')
b'test'
>>> b64encode('\xb5\xeb-')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ferdi/.pyenv/versions/3.8.6/lib/python3.8/base64.py", line 58, in b64encode
    encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'

Consequently, b64encode(b64decode(*)) is the identity function only for bytes:

>>> b64encode(b64decode('test'))
b'test'
>>> b64encode(b64decode(b'test'))
b'test'

This makes clear that bytes are the main input for b64encode/b64decode, reinforcing the case for (1), with only b64decode gracefully accepting str and performing an .encode() on it before handling it as bytes.

On a related note, I would also suggest to reserve serialize/deserialize for the binary representation of the PSBT, using something like b64encode/b64decode for the actual Base64 encoding/decoding. Anyway, this would not be backward compatible, so I didn't do it

fametrano avatar Dec 09 '20 22:12 fametrano

On initial look, this seems to make the API a little bit more confusing because if deserialize can accept str or bytes, then I would expect that bytes is already base64 decoded. I think it would be better to have a deserialize_base64 function and then have deserialize operate on the actual bytes.

On a related note, I would also suggest to reserve serialize/deserialize for the binary representation of the PSBT, using something like b64encode/b64decode for the actual Base64 encoding/decoding. Anyway, this would not be backward compatible, so I didn't do it

Since we're already planning on being backwards incompatible, I think you should do that.

achow101 avatar Jan 19 '21 16:01 achow101