python-gammu
Can create invalid unicode strings
Many phones use surrogates to encode higher-plane Unicode characters, and these get passed through Gammu and python-gammu into the Python unicode string. The problem is that they are not allowed there, so doing something with such a string ends up in:
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 10316: surrogates not allowed
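For reference, the same class of error can be reproduced in plain Python without a phone. This is only a minimal sketch: the string is made up for illustration, and in the real case it comes from python-gammu.

```python
# A lone high surrogate (U+D800) can be stored in a Python str,
# but it cannot be encoded as UTF-8.
text = "abc\ud800"  # illustrative string, not real phone data

message = None
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    message = str(exc)
    print(message)
```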
There has to be some bug in the surrogate conversion code:
https://github.com/gammu/python-gammu/blob/86a497c623b139df3819ed22d2763ff5aec76578/gammu/src/convertors/string.c#L121-L136
Or there is another way this can slip through. I've seen this in Text as returned by DecodePDU.
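Until the conversion is fixed, a possible caller-side workaround is to round-trip the string through UTF-16, which joins valid surrogate pairs into real code points and replaces unpaired ones. This is a sketch only: `repair_surrogates` is a hypothetical helper, not part of python-gammu, and it assumes the bad characters are UTF-16 surrogates that leaked through unconverted.

```python
def repair_surrogates(text: str) -> str:
    """Hypothetical helper: make a str with leaked surrogates encodable.

    Valid high/low surrogate pairs are combined into the code point they
    represent; unpaired surrogates become U+FFFD (replacement character).
    """
    # "surrogatepass" writes surrogate code points out as raw UTF-16 units.
    raw = text.encode("utf-16-le", errors="surrogatepass")
    # Decoding pairs the units up again; "replace" handles unpaired ones.
    return raw.decode("utf-16-le", errors="replace")


# Example with a made-up value standing in for Text from DecodePDU:
fixed = repair_surrogates("smile: \ud83d\ude00")
fixed.encode("utf-8")  # no longer raises
```

This hides the symptom rather than the cause, and unpaired surrogates are lost, so it does not replace a fix in the conversion code itself.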