plugin-xml
Only treat space, `\t`, `\n`, and `\r` as whitespace
This adjusts the RegExp pattern used to trim text while formatting so that it only matches the four characters that are considered whitespace in the XML standard.
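For illustration, trimming that is limited to those four characters could look roughly like the sketch below. The names `trimXmlWhitespace`, `LEADING_XML_WHITESPACE`, and `TRAILING_XML_WHITESPACE` are hypothetical and are not taken from the plugin's source; the actual pattern in this change may be shaped differently.

```typescript
// Hypothetical sketch, not the plugin's actual code.
// XML treats only space, tab, line feed, and carriage return as whitespace.
const LEADING_XML_WHITESPACE = /^[ \t\n\r]+/;
const TRAILING_XML_WHITESPACE = /[ \t\n\r]+$/;

// Strip XML whitespace from both ends and leave every other character alone.
function trimXmlWhitespace(text: string): string {
  return text
    .replace(LEADING_XML_WHITESPACE, "")
    .replace(TRAILING_XML_WHITESPACE, "");
}

// U+00A0 (No-Break Space) and U+3000 (Ideographic Space) are not XML
// whitespace, so they survive; only the trailing line feed is removed.
const sample = "\u00A0 hello\u3000\n";
const xmlTrimmed = trimXmlWhitespace(sample);

// For comparison, the built-in trim() also removes the Unicode spaces.
const builtInTrimmed = sample.trim();
```

Here `xmlTrimmed` keeps the no-break space and the ideographic space, while `builtInTrimmed` is reduced to `"hello"`, which is the kind of content loss this change is meant to avoid.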
I added various invisible characters to the fixture to verify the fix. Please let me know whether having them there in unescaped form is acceptable, because they might trigger warnings in some code editors. (Unfortunately, I don't think there is any way to escape them such that the tests still cover the bug.) The full list of non-standard invisible characters that the fixture now contains is:
- U+00A0 No-Break Space
- U+1680 Ogham Space Mark
- U+2000 En Quad
- U+2001 Em Quad
- U+2002 En Space
- U+2003 Em Space
- U+2004 Three-Per-Em Space
- U+2005 Four-Per-Em Space
- U+2006 Six-Per-Em Space
- U+2007 Figure Space
- U+2008 Punctuation Space
- U+2009 Thin Space
- U+200A Hair Space
- U+2028 Line Separator
- U+2029 Paragraph Separator
- U+202F Narrow No-Break Space
- U+205F Medium Mathematical Space
- U+3000 Ideographic Space
- U+FEFF Zero Width No-Break Space
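
As a quick sanity check on the list above, the sketch below compares each code point against JavaScript's `\s` character class and against an XML-only class. It assumes that the overly broad matching came from something equivalent to `\s`, which is an assumption and not something stated in this description; the variable names are hypothetical.

```typescript
// Hypothetical sketch: code points from the list above.
const fixtureCodePoints: number[] = [
  0x00a0, 0x1680, 0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
  0x2007, 0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000,
  0xfeff,
];

const JS_WHITESPACE_CLASS = /^\s$/; // JavaScript's notion of whitespace
const XML_WHITESPACE_CLASS = /^[ \t\n\r]$/; // the four XML whitespace characters

// Record which class matches each character.
const rows = fixtureCodePoints.map((codePoint) => {
  const ch = String.fromCharCode(codePoint); // all listed code points are in the BMP
  return {
    codePoint: "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
    matchesJsClass: JS_WHITESPACE_CLASS.test(ch),
    matchesXmlClass: XML_WHITESPACE_CLASS.test(ch),
  };
});

// True when every listed character is matched by \s but not by the XML class.
const allOnlyJsWhitespace = rows.every(
  (row) => row.matchesJsClass && !row.matchesXmlClass
);
```

Each listed character is whitespace according to `\s` but not according to the XML-only class, so a `\s`-based trim would remove it while the narrower pattern leaves it in place.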
Fixes #789.