
PEP 756: Hypothetical CPython Unicode changes

Open vstinner opened this issue 1 year ago • 5 comments


📚 Documentation preview 📚: https://pep-previews--3987.org.readthedocs.build/

vstinner avatar Sep 23 '24 12:09 vstinner

@serhiy-storchaka @zooba: What do you think? Is it worth it to discuss these two "hypothetical Unicode changes", UTF-8 and UTF-16?

vstinner avatar Sep 23 '24 12:09 vstinner

@serhiy-storchaka @terryjreedy: Do you recall when/where using UTF-16 was discussed?

vstinner avatar Sep 23 '24 12:09 vstinner

We ought to start by explaining that we assume that our internal representation will change in the future, and we are deliberately not constraining any changes with this API. Then we could take those hypothetical changes and use them as examples of "if we changed X, then this API would still work if it's used like ...". That shows that the API design is robust (if people use it right), and so is suitable for the limited API. We're just trying to provide an optimisation here, since there are already stable APIs that will behave properly all the time.
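To make the "if we changed X, this API would still work if it's used like ..." point concrete, here is a minimal Python sketch of the usage pattern. It is an illustration under stated assumptions, not the PEP's C API: `Format`, `export`, and `consume` are hypothetical names, and the `native` argument stands in for whatever internal representation an implementation happens to use.

```python
from enum import IntFlag


class Format(IntFlag):
    # Hypothetical format flags for illustration; not the PEP's constants.
    UTF8 = 1
    UTF16 = 2
    UCS4 = 4


_CODECS = {
    Format.UTF8: "utf-8",
    Format.UTF16: "utf-16-le",
    Format.UCS4: "utf-32-le",
}


def export(text, requested, native):
    """Hypothetical exporter: return (format, data) in one requested format.

    If the native representation is among the requested formats it is used
    directly (the cheap path); otherwise the text is converted to the first
    requested format in declaration order.
    """
    if not requested:
        raise ValueError("at least one format must be requested")
    if native & requested:
        chosen = native
    else:
        chosen = next(f for f in Format if f & requested)
    return chosen, text.encode(_CODECS[chosen])


def consume(text, native):
    """A consumer written the robust way: it handles every format it asked for."""
    fmt, data = export(text, Format.UTF8 | Format.UCS4, native)
    if fmt == Format.UTF8:
        return data.decode("utf-8")
    if fmt == Format.UCS4:
        return data.decode("utf-32-le")
    raise AssertionError("exporter returned a format that was not requested")


# The consumer keeps working whichever representation is "native".
for native in (Format.UTF8, Format.UTF16, Format.UCS4):
    assert consume("h\u00e9llo \U0001F600", native) == "h\u00e9llo \U0001F600"
```

The consumer never assumes which format comes back. It only assumes the result is one of the formats it requested, so changing the internal representation changes which branch runs, not whether the code works.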

The important thing is for us to not get derailed by discussing hypothetical futures. People love to bikeshed about stuff like that, and it's only a distraction here. We need to keep the focus on "this API design is flexible, and won't have to change even if we change its results".

zooba avatar Sep 23 '24 12:09 zooba

For 2.x, UTF-16 was more or less what was used on Windows (or UCS-2 with surrogates? I forget such details) and UTF-32 ~= UCS-4 elsewhere. I am not aware of any proposal to use XXX-16 everywhere. I would just say that new constants will be added if needed.

terryjreedy avatar Sep 24 '24 01:09 terryjreedy

Last I heard (a while back) Jython implicitly uses UTF-16 as its internal representation, because the Java string type does, and IronPython running on .NET Framework probably does too (I believe .NET Core moved to UTF-8 internally though, and that's the one that matters these days).

So I doubt CPython would ever switch to it, but I wouldn't rule out alternative implementations wanting to use it. And a limited API candidate should take into account alternative implementations.
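As a rough illustration of why the representation matters to a consumer, the snippet below counts code units for the same string in three encodings, using only Python's standard codecs. The variable names are illustrative and the sample string is an arbitrary choice.

```python
# 'a' (ASCII), 'é' (U+00E9), and an emoji outside the Basic Multilingual Plane.
text = "a\u00e9\U0001F600"

utf8_units = len(text.encode("utf-8"))            # 1-byte code units
utf16_units = len(text.encode("utf-16-le")) // 2  # 2-byte code units
ucs4_units = len(text.encode("utf-32-le")) // 4   # 4-byte code units

print(utf8_units, utf16_units, ucs4_units)  # prints: 7 4 3
```

Only the UCS-4 count matches `len(text)`. A consumer that indexes by code unit therefore needs to know which format it was given, rather than assuming one code unit per character.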

zooba avatar Sep 24 '24 13:09 zooba

I don't think that this change is still needed with https://github.com/python/peps/pull/3999.

vstinner avatar Sep 26 '24 14:09 vstinner