Should non-breaking spaces engage in titlecasing?

Open robinwhittleton opened this issue 2 years ago • 3 comments

At the moment we split words based on tabs or a normal space character. This means that words following a non-breaking space don’t properly get titlecased. Example:

>>> from titlecase import titlecase as pip_titlecase
>>> pip_titlecase('mrs. test')
'Mrs. Test'
>>> pip_titlecase('mrs.\xa0test')
'Mrs.\xa0test'

Presumably the fix is as simple as adding a non-breaking space character to https://github.com/ppannuto/python-titlecase/blob/418c57ca6c7f324ddc2813b3fc88d52e84db63bd/titlecase/__init__.py#L103 (although it’s fair to say that I haven’t tested). Is this wanted? If so I’ll put a PR together.
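
To illustrate the idea, here is a minimal sketch using only the standard library. It assumes the split at the linked line is a regex character class over tab and space; the exact pattern in the library may differ, and the variable names are only for illustration.

```python
import re

line = "mrs.\xa0test"  # \xa0 is the non-breaking space

# Assumed current behaviour: split on tab or normal space only.
current = re.split("[\t ]", line)

# Proposed behaviour: also treat the non-breaking space as a separator.
proposed = re.split("[\t \xa0]", line)

print(current)   # the whole string stays as a single "word"
print(proposed)  # the two words are separated
```

With the first pattern the whole string is treated as one word, so only its first letter would be capitalised; with the second, "test" becomes a word in its own right.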

robinwhittleton avatar Dec 26 '23 11:12 robinwhittleton

That makes sense to me; happy to take a PR.

ppannuto avatar Mar 20 '24 18:03 ppannuto

OK, first question: we presumably want to preserve the type of space used, but historically the code throws away whether the separator was a tab or a space and just joins the words with a single space: result = " ".join(tc_line). Given that I’d be changing existing behaviour if I update the code to preserve and rejoin with the original characters, would you want me to add that as a preserve_space_characters option?
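
A rough sketch of what preserving the separators could look like, using a capturing group so that re.split keeps each separator in its result. The function naive_titlecase is hypothetical, str.capitalize stands in for the real titlecasing rules, and the option name is simply the one floated above; none of this is taken from the library's code.

```python
import re

def naive_titlecase(line, preserve_space_characters=False):
    # The capturing group makes re.split return the separators too:
    # [word, sep, word, sep, ..., word]
    parts = re.split("([\t \xa0])", line)
    words = [part.capitalize() for part in parts[0::2]]  # stand-in for real rules
    if not preserve_space_characters:
        # Historical behaviour: every separator collapses to a normal space.
        return " ".join(words)
    separators = parts[1::2]
    pieces = [words[0]]
    for separator, word in zip(separators, words[1:]):
        pieces.append(separator)
        pieces.append(word)
    return "".join(pieces)

default_result = naive_titlecase("mrs.\xa0test\tcase")
preserved_result = naive_titlecase("mrs.\xa0test\tcase", preserve_space_characters=True)
```

Keeping the flag off by default would leave existing output unchanged for callers who rely on everything being rejoined with a normal space.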

robinwhittleton avatar Mar 23 '24 07:03 robinwhittleton

OK, it wasn’t much work so I ended up doing this in a PR anyway: https://github.com/ppannuto/python-titlecase/pull/97. If you’d rather that this is the default behaviour and doesn’t need a switch then it’s easy enough to remove and rebase.

robinwhittleton avatar Mar 23 '24 14:03 robinwhittleton

Closed by #97.

ppannuto avatar Apr 06 '24 05:04 ppannuto