MagicSetEditor2

Switch to UTF-8 strings internally

Open twanvl opened this issue 4 years ago • 0 comments

Currently MSE uses wxString everywhere. This makes it hard to interact with any other library.

The proposal is to switch to UTF-8 encoded std::string for all strings internally, and only convert to wxString for the UI functions.

Pros:

  • Less dependence on wxWidgets
  • We can use std::string's move constructor

Cons:

  • Lots of work for little gain

Issues:

  • wxString has a stupid implicit conversion constructor that uses the current locale instead of always utf8. Solution: make a wrapper class.
  • We have to be careful about broken strings (ending in the middle of a multi-byte UTF-8 sequence). Solution: validate input upon reading a file.
  • We need an iterator that walks over codepoints.
  • boost::regex doesn't actually work with utf8 strings, so "." would match a byte and not a code point, which is very bad. Solution: Use boost::basic_regex<UniChar>, and codepoint iterators.
  • Should we be using C++20 std::u8string?
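
To make the validation and code point iterator items above concrete, here is a minimal sketch in standard C++17. It is not MSE code: decode_utf8, is_valid_utf8 and Utf8Iterator are hypothetical names chosen for illustration. The iterator is only an input iterator; boost::basic_regex needs bidirectional iterators, so a real version would also need to step backwards.

```cpp
#include <cstddef>
#include <iterator>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: decode the code point that starts at s[pos].
// On success, advances pos past the sequence and returns the code point.
// Returns std::nullopt for truncated, overlong, surrogate or out-of-range input.
inline std::optional<char32_t> decode_utf8(std::string_view s, std::size_t& pos) {
    if (pos >= s.size()) return std::nullopt;
    const unsigned char lead = static_cast<unsigned char>(s[pos]);
    std::size_t len = 0;
    char32_t cp = 0;
    char32_t min = 0;  // smallest code point allowed for this sequence length
    if (lead < 0x80)                { len = 1; cp = lead;        min = 0; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
    else return std::nullopt;  // stray continuation byte or invalid lead byte
    if (s.size() - pos < len) return std::nullopt;  // string ends mid-sequence
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if ((c & 0xC0) != 0x80) return std::nullopt;  // not a continuation byte
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min) return std::nullopt;                        // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;    // UTF-16 surrogate
    if (cp > 0x10FFFF) return std::nullopt;                   // beyond Unicode range
    pos += len;
    return cp;
}

// Validate a whole string, e.g. right after reading a file.
inline bool is_valid_utf8(std::string_view s) {
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (!decode_utf8(s, pos)) return false;
    }
    return true;
}

// Hypothetical input iterator that walks over code points instead of bytes.
// Intended for already validated strings; an invalid byte yields U+FFFD
// and is skipped one byte at a time.
class Utf8Iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const char32_t*;
    using reference = char32_t;

    Utf8Iterator(std::string_view s, std::size_t pos) : s_(s), pos_(pos) {}

    char32_t operator*() const {
        std::size_t p = pos_;
        return decode_utf8(s_, p).value_or(char32_t{0xFFFD});
    }
    Utf8Iterator& operator++() {
        std::size_t p = pos_;
        if (decode_utf8(s_, p)) pos_ = p; else ++pos_;
        return *this;
    }
    Utf8Iterator operator++(int) { Utf8Iterator old = *this; ++*this; return old; }
    bool operator==(const Utf8Iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const Utf8Iterator& o) const { return pos_ != o.pos_; }

private:
    std::string_view s_;
    std::size_t pos_;  // byte offset of the current code point
};
```

Constructing a std::vector<char32_t> from Utf8Iterator(s, 0) to Utf8Iterator(s, s.size()) gives one element per code point, so a regex "." applied to that sequence would see whole code points rather than bytes.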

twanvl, Jul 05 '21 23:07