Can create invalid unicode strings

Open nijel opened this issue 5 years ago • 0 comments

Many phones use surrogates to encode higher-plane Unicode characters, and these get passed through Gammu and python-gammu into a Python unicode string. The problem is that they are not allowed there, so doing something with such a string ends up in:

UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 10316: surrogates not allowed
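For reference, the same error can be reproduced in plain Python without a phone or Gammu. This is only a minimal sketch: the string below is a made-up stand-in for the decoded text, and the position in the message differs from the one in the report.

```python
# Hypothetical stand-in for text that came back with a lone high surrogate.
text = "abc\ud800"

try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    # The strict UTF-8 codec refuses code points in U+D800..U+DFFF.
    error_message = str(exc)
    print(error_message)
```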

There has to be some bug in the surrogate conversion code:

https://github.com/gammu/python-gammu/blob/86a497c623b139df3819ed22d2763ff5aec76578/gammu/src/convertors/string.c#L121-L136

Or there is some other way this can slip through. I've seen this in Text as returned by DecodePDU.
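Until the conversion code is fixed, a caller-side workaround might look like the sketch below. It is not part of python-gammu, and `repair_surrogates` is a hypothetical helper name. It round-trips the string through UTF-16 so that valid surrogate pairs are merged into real code points, and any lone surrogate is replaced with U+FFFD.

```python
def repair_surrogates(text: str) -> str:
    """Merge surrogate pairs into code points; replace lone surrogates.

    Hypothetical helper, not part of python-gammu.
    """
    # "surrogatepass" lets the surrogates through as raw 16-bit code units.
    raw = text.encode("utf-16-le", "surrogatepass")
    # Decoding combines valid pairs; "replace" turns lone surrogates into U+FFFD.
    return raw.decode("utf-16-le", "replace")


# A surrogate pair for U+1F600 stored as two separate code points.
fixed = repair_surrogates("a\ud83d\ude00")
print(fixed.encode("utf-8"))
```

This only hides the symptom on the Python side. It does not explain how the surrogates get into the string in the first place.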

nijel, Mar 11 '20 10:03