`vmime::text::createFromString()` drops spaces if an 8-bit word is followed by a 7-bit word

Open RichardSteele opened this issue 2 years ago • 1 comments

The following example:

vmime::mailbox mailbox(vmime::text("Test München West", vmime::charsets::WINDOWS_1252), "[email protected]");
vmime::utility::outputStreamAdapter adapter(std::cout);
mailbox.generate(adapter);

produces this output:

=?us-ascii?Q?Test_?= =?windows-1252?Q?M=FCnchen?= =?us-ascii?Q?West?=
  <[email protected]>

The first space, between Test and München, is encoded as an underscore at the end of the first word: Test_. The second space, between München and West, is not encoded in either adjacent word and is therefore lost, because RFC 2047 requires decoders to ignore whitespace that separates two adjacent encoded-words. Decoding the text yields Test MünchenWest instead of Test München West.
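To make the loss of the second space concrete, here is a minimal sketch of an RFC 2047 style decoder. It is not vmime code: `decodeQ` and `decodeHeader` are hypothetical helpers, only Q-encoding is handled, and bytes are returned in their original charset without conversion (so ü stays the single windows-1252 byte 0xFC).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Decode a Q-encoded payload: '_' is a space, "=XX" is one hex-encoded byte.
std::string decodeQ(const std::string& payload) {
    std::string out;
    for (std::size_t i = 0; i < payload.size(); ++i) {
        const char c = payload[i];
        if (c == '_') {
            out += ' ';
        } else if (c == '=' && i + 2 < payload.size() &&
                   std::isxdigit(static_cast<unsigned char>(payload[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(payload[i + 2]))) {
            out += static_cast<char>(std::stoi(payload.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += c;
        }
    }
    return out;
}

// Decode a header value consisting of encoded-words and plain tokens.
// Whitespace between two adjacent encoded-words is dropped (RFC 2047);
// whitespace next to a plain token is kept.
std::string decodeHeader(const std::string& in) {
    std::string out;
    std::string pendingSpace;  // whitespace seen since the previous token
    bool prevEncoded = false;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == ' ' || in[i] == '\t') {
            pendingSpace += in[i++];
            continue;
        }
        std::size_t end = in.find_first_of(" \t", i);
        if (end == std::string::npos) end = in.size();
        const std::string token = in.substr(i, end - i);
        i = end;

        // Recognise "=?charset?Q?payload?=".
        bool encoded = false;
        std::string decoded;
        if (token.size() > 6 && token.compare(0, 2, "=?") == 0 &&
            token.compare(token.size() - 2, 2, "?=") == 0) {
            const std::size_t q1 = token.find('?', 2);
            const std::size_t q2 =
                (q1 == std::string::npos) ? q1 : token.find('?', q1 + 1);
            if (q2 != std::string::npos && q2 == q1 + 2 &&
                (token[q1 + 1] == 'Q' || token[q1 + 1] == 'q') &&
                q2 + 1 <= token.size() - 2) {
                decoded = decodeQ(token.substr(q2 + 1, token.size() - 2 - (q2 + 1)));
                encoded = true;
            }
        }

        if (!(encoded && prevEncoded)) out += pendingSpace;
        out += encoded ? decoded : token;
        pendingSpace.clear();
        prevEncoded = encoded;
    }
    return out;
}
```

Feeding the header from the example into `decodeHeader` gives the bytes of Test MünchenWest. Changing the middle word to `=?windows-1252?Q?M=FCnchen_?=` restores the missing space.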

This is caused by the way vmime::text::createFromString() handles transitions between 7-bit and 8-bit words: https://github.com/kisli/vmime/blob/1a35bb6d71b6301287e21aaabd112997ea0f0a7f/src/vmime/text.cpp#L310-L346 If an 8-bit word follows a 7-bit word, a space is appended to the previous word (lines 321-324). The opposite case, a 7-bit word following an 8-bit word, lacks this handling (lines 339-345). Pull request https://github.com/kisli/vmime/pull/284 fixes this.
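The symmetric handling described above can be sketched as follows. This is a simplified, hypothetical model and not the code in text.cpp or in the pull request: `Run`, `has8bit` and `splitRuns` are made-up names, the input is split on single spaces only, and the sketch assumes the fix simply mirrors the existing 7-bit to 8-bit behaviour by attaching the separating space to the preceding run in both directions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One run of consecutive words that are all 7-bit or all 8-bit.
struct Run {
    std::string text;
    bool is8bit;
};

// True if any byte of the word has its high bit set.
bool has8bit(const std::string& s) {
    for (unsigned char c : s) {
        if (c >= 0x80) return true;
    }
    return false;
}

// Split the input on single spaces and group the words into runs.
// At every 7-bit/8-bit transition, in either direction, the separating
// space is appended to the run that ends there, so it survives when each
// run is later emitted as its own encoded-word.
std::vector<Run> splitRuns(const std::string& in) {
    std::vector<Run> runs;
    std::size_t pos = 0;
    while (pos <= in.size()) {
        std::size_t end = in.find(' ', pos);
        if (end == std::string::npos) end = in.size();
        const std::string word = in.substr(pos, end - pos);
        const bool eight = has8bit(word);

        if (runs.empty()) {
            runs.push_back({word, eight});
        } else if (runs.back().is8bit == eight) {
            runs.back().text += " " + word;  // same kind: extend the run
        } else {
            runs.back().text += ' ';         // transition: keep the space
            runs.push_back({word, eight});
        }
        pos = end + 1;
    }
    return runs;
}
```

For the input from the example, this yields three runs: "Test " (7-bit), "München " (8-bit) and "West" (7-bit), so both spaces end up inside a run.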

Nov 08 '23 08:11 RichardSteele

A trivial fix is to generate just a single vmime::word, which is what most MUAs (e.g. Alpine, Thunderbird) do anyway.

+++ b/src/vmime/text.cpp
@@ -273,6 +273,7 @@ void text::createFromString(const string& in, const charset& ch) {
 
        removeAllWords();
 
+#if 0
        // Check whether there is a recommended encoding for this charset.
        // If so, the whole buffer will be encoded. Else, the number of
        // 7-bit (ASCII) bytes in the input will be used to determine if
@@ -288,7 +289,9 @@ void text::createFromString(const string& in, const charset& ch) {
        // If there are "too much" non-ASCII chars, encode everything
        if (alwaysEncode || asciiPercent < 60) {  // less than 60% ASCII chars
 
+#endif
                appendWord(make_shared <word>(in, ch));
+#if 0
 
        // Else, only encode words which need it
        } else {
@@ -363,6 +366,7 @@ void text::createFromString(const string& in, const charset& ch) {
                        }
                }
        }
+#endif
 }
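For illustration, this is roughly what a single encoded-word for the whole string looks like. The sketch below is standalone C++, not vmime: `encodeQWord` and `isAsciiAlnum` are hypothetical helpers, the input is assumed to already be in the named charset, everything except ASCII letters and digits is escaped, and the 75-character limit on encoded-words is ignored. vmime's real output for a single word may differ (for example in its choice of Q or B encoding, or in line folding).

```cpp
#include <string>

// ASCII letters and digits are emitted literally; everything else is escaped.
bool isAsciiAlnum(unsigned char c) {
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

// Q-encode the whole byte string as one encoded-word: spaces become '_',
// other non-alphanumeric bytes become "=XX".
std::string encodeQWord(const std::string& bytes, const std::string& charset) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out = "=?" + charset + "?Q?";
    for (unsigned char c : bytes) {
        if (c == ' ') {
            out += '_';
        } else if (isAsciiAlnum(c)) {
            out += static_cast<char>(c);
        } else {
            out += '=';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    out += "?=";
    return out;
}
```

With the windows-1252 bytes of Test München West, `encodeQWord` returns `=?windows-1252?Q?Test_M=FCnchen_West?=`. Both spaces sit inside the one word, so there is no word boundary at which they could be dropped.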

Apr 25 '24 22:04 jengelh