getBodyHtml only returns first part of Mail
Hi,
we've noticed the following bug:
In some cases, $message->getBodyHtml() returns only the first HTML part of a multipart/mixed message. The sender (or Apple) has inserted inline images between multiple HTML parts. All attachments can still be retrieved with $message->getAttachments() (even though they aren't really inline).
The rawMessage is attached (I've shortened the base64 decoded attachments): rawMessage.txt
Kind regards Kevin
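For anyone without access to the attachment, the structure described above can be approximated with a small generator. This is only a sketch: buildInterleavedMessage is a hypothetical helper, and the addresses, boundary and payloads are made-up placeholders, not taken from the real rawMessage.txt. A real Apple Mail message is likely to be nested differently and to carry more headers.

```php
<?php
declare(strict_types=1);

/**
 * Build a simplified multipart/mixed message: HTML, PDF attachment, HTML.
 * Hypothetical helper; every address, the boundary and the payloads are placeholders.
 */
function buildInterleavedMessage(string $boundary = 'example-boundary'): string
{
    $headers = [
        'From: sender@example.com',
        'To: recipient@example.com',
        'Subject: HTML parts around an attachment',
        'MIME-Version: 1.0',
        'Content-Type: multipart/mixed; boundary="' . $boundary . '"',
    ];

    $parts = [
        // First HTML part: the only one the reported getBodyHtml() call returns.
        "Content-Type: text/html; charset=utf-8\r\n\r\n"
            . '<html><body><p>Before the file</p></body></html>',
        // Attachment placed in the middle of the text.
        "Content-Type: application/pdf; name=\"file.pdf\"\r\n"
            . "Content-Disposition: inline; filename=\"file.pdf\"\r\n"
            . "Content-Transfer-Encoding: base64\r\n\r\n"
            . base64_encode('not a real PDF'),
        // Second HTML part: the one that goes missing.
        "Content-Type: text/html; charset=utf-8\r\n\r\n"
            . '<html><body><p>After the file</p></body></html>',
    ];

    $body = '';
    foreach ($parts as $part) {
        $body .= '--' . $boundary . "\r\n" . $part . "\r\n";
    }
    $body .= '--' . $boundary . "--\r\n";

    return implode("\r\n", $headers) . "\r\n\r\n" . $body;
}
```

The returned string could be saved as a fixture or appended to a test mailbox to check how many HTML parts come back.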
Hi, would you be able to reproduce the bug in a test case please?
Hi, I'm not so sure what you need from me. Can you explain a little more please? Thanks.
Would you be so kind as to create a pull request with a test case that uses your message to expose the bug you found? For example, this test case was written to reproduce a bug in handling the GBK charset:
https://github.com/ddeboer/imap/blob/208a29fd4bb2c3240a10ebfb51cc84f5a16dad51/tests/MessageTest.php#L1052-L1059
Sorry, I'm completely new to GitHub. I have no idea how I am supposed to do that.
Hi,
I've created pull request #454. If I've missed something, please let me know.
Kind regards Kevin
I have a message that returns blank HTML; it has 2 attachments. Does this help? Campaña Caser Salud Otoño-Invierno 2020-2021, ¡te esperan grandes premios!.zip
Easy to reproduce: when I create an email in Apple Mail, drag a PDF file into the middle of it (so there is text before and after the file) and send it, getBodyHtml returns only the first part of the email.
I worked around it with this method:
private function getAllHtml(\Ddeboer\Imap\MessageInterface $message)
{
    // Walk every part of the message, including nested multiparts.
    $iterator = new \RecursiveIteratorIterator($message, \RecursiveIteratorIterator::SELF_FIRST);
    $htmlParts = [];
    foreach ($iterator as $part) {
        if ($message::SUBTYPE_HTML === $part->getSubtype()) {
            $htmlParts[] = $part->getDecodedContent();
        }
    }
    if (count($htmlParts) === 1) {
        return $htmlParts[0];
    }
    if (count($htmlParts) > 1) {
        // Parse all HTML parts together, then collect the children of every <body>.
        // Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
        $newDom = new \DOMDocument();
        $newBody = '';
        $newDom->loadHTML(mb_convert_encoding(implode('', $htmlParts), 'HTML-ENTITIES', 'UTF-8'));
        $bodyTags = $newDom->getElementsByTagName('body');
        foreach ($bodyTags as $body) {
            foreach ($body->childNodes as $node) {
                $newBody .= $newDom->saveHTML($node);
            }
        }
        // Re-parse the merged markup so the result is a single, complete document.
        $newDom = new \DOMDocument();
        $newDom->loadHTML(mb_convert_encoding($newBody, 'HTML-ENTITIES', 'UTF-8'));
        return $newDom->saveHTML();
    }
    // If message has no parts and is HTML, return content of message itself.
    // (Uses $message rather than $this, since this helper lives outside the Message class.)
    if ($message::SUBTYPE_HTML === $message->getSubtype()) {
        return $message->getDecodedContent();
    }
    return null;
}
Merge body tags: https://stackoverflow.com/questions/60163537/save-multiple-html-bodies-as-one-using-domdocument
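The body-merging step can also be tried on its own, without an IMAP connection. Below is a minimal sketch, assuming the HTML parts are already decoded UTF-8 strings. mergeHtmlBodies is a hypothetical helper, not part of ddeboer/imap. It parses each part separately instead of concatenating them first, and it relies on the commonly used '<?xml encoding="UTF-8">' prefix to hint the input encoding to libxml instead of the 'HTML-ENTITIES' conversion.

```php
<?php
declare(strict_types=1);

/**
 * Merge the <body> contents of several HTML documents into one document.
 * Hypothetical helper for illustration; not part of ddeboer/imap.
 *
 * @param string[] $htmlParts decoded HTML documents or fragments (assumed UTF-8)
 */
function mergeHtmlBodies(array $htmlParts): string
{
    $merged = '';
    foreach ($htmlParts as $html) {
        $dom = new DOMDocument();
        // The XML declaration prefix is a workaround to make libxml read the input as UTF-8.
        // LIBXML_NOERROR and LIBXML_NOWARNING silence complaints about sloppy mail markup.
        $dom->loadHTML('<?xml encoding="UTF-8">' . $html, LIBXML_NOERROR | LIBXML_NOWARNING);

        // Append every child of every <body> element, keeping the order of the parts.
        foreach ($dom->getElementsByTagName('body') as $body) {
            foreach ($body->childNodes as $node) {
                $merged .= $dom->saveHTML($node);
            }
        }
    }

    return '<html><body>' . $merged . '</body></html>';
}

// Usage: two HTML parts as they might surround an attachment.
$result = mergeHtmlBodies([
    '<html><body><p>Before the file</p></body></html>',
    '<html><body><p>After the file</p></body></html>',
]);
echo $result, PHP_EOL;
```

Parsing each part separately avoids feeding several complete documents to one loadHTML call, at the cost of dropping anything in the parts' head sections such as style blocks.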
@arwinvdv would you be so kind to propose a PR with your fix?