[v6.6.7] HTML being parsed and re-written breaking <source> inside <video> or <picture> elements. (Workaround / Fix inside)
I'm one of these old farts that was around when validating your HTML was cool enough to warrant putting a badge on your site, so I still validate my code, and yesterday I noticed an issue with our static site. If you write the following HTML:
<video>
<source src="/media/examples/flower.webm" type="video/webm">
<source src="/media/examples/flower.mp4" type="video/mp4">
<p>Sorry, your browser doesn't support embedded videos.</p>
</video>
Here, if your browser cannot play the WebM video, it moves on to the next <source> element and tries the MP4 video. If it can't play that either, it falls back to the last element, which in this case is just a message saying your browser doesn't support embedded videos.
But, after WP2Static has parsed it with DOMDocument(), you will end up with:
<video><source src="/media/examples/flower.webm" type="video/webm"><source src="/media/examples/flower.mp4" type="video/mp4"><p>Sorry, your browser doesn't support embedded videos.</p>
</source></source></video>
Adding the line breaks and tabs back in makes it clearer why this doesn't work:
<video>
    <source src="/media/examples/flower.webm" type="video/webm">
        <source src="/media/examples/flower.mp4" type="video/mp4">
            <p>Sorry, your browser doesn't support embedded videos.</p>
        </source>
    </source>
</video>
Now this isn't correct. If your browser cannot play the WebM video, it has nothing to try next: it ignores everything nested inside <source>, because <source> is a void element and cannot wrap other tags. Technically, </source> isn't a valid tag at all, so the output fails validation, which is how I discovered this issue.
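A rough way to check exported pages for this without a full validator is sketched below in Python, using only the standard library's `html.parser`. This is an illustration and not part of WP2Static; the names `SourceCloseTagFinder` and `find_source_close_tags` are made up for this example. It reports every `</source>` closing tag and whether it appears to wrap other content.

```python
from html.parser import HTMLParser


class SourceCloseTagFinder(HTMLParser):
    """Records each </source> closing tag and whether it wraps other content."""

    def __init__(self):
        super().__init__()
        self.hits = []      # (line, column, wraps_content)
        self._prev = None   # last significant parser event

    def handle_starttag(self, tag, attrs):
        self._prev = ("start", tag)

    def handle_startendtag(self, tag, attrs):
        # <source ... /> arrives here; overriding stops the default
        # implementation from also reporting it via handle_endtag.
        self._prev = ("startend", tag)

    def handle_endtag(self, tag):
        if tag == "source":
            line, col = self.getpos()
            # An empty <source ...></source> pair is invalid but harmless;
            # anything else means the element swallowed what followed it.
            wraps = self._prev != ("start", "source")
            self.hits.append((line, col, wraps))
        self._prev = ("end", tag)

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text between tags
            self._prev = ("data", None)


def find_source_close_tags(html):
    """Hypothetical helper: return a list of (line, column, wraps_content)."""
    finder = SourceCloseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Any hit means the page will fail validation. A hit whose last field is `True` means the fallback chain is actually broken, as in the nested output above.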
Fix / Workaround
End your <source> elements XHTML style like this: <source />
<video>
<source src="/media/examples/flower.webm" type="video/webm" />
<source src="/media/examples/flower.mp4" type="video/mp4" />
<p>Sorry, your browser doesn't support embedded videos.</p>
</video>
This isn't required by the HTML5 specification, nor is it encouraged, as it could lead you to assume the element can wrap content or have a separate closing tag. It does resolve the issue, though: instead of ending up with nested <source> elements, you get:
<video>
<source src="/media/examples/flower.webm" type="video/webm"></source>
<source src="/media/examples/flower.mp4" type="video/mp4"></source>
<p>Sorry, your browser doesn't support embedded videos.</p>
</video>
This doesn't make your static HTML validate (the </source> closing tags are still invalid), but browsers should hopefully be able to try each source in order, falling back to the next one if they can't play it.
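If editing every template by hand isn't practical, the same rewrite could be applied as a text pass before the HTML reaches the parser. The Python sketch below is only an illustration of that idea, not something WP2Static provides; `self_close_sources` is a hypothetical name, and the regex assumes attribute values never contain a `>` character.

```python
import re

# Matches an opening <source ...> tag, with or without a trailing slash.
# Assumption: no attribute value contains ">".
SOURCE_TAG = re.compile(r"<source(?=[\s/>])([^>]*?)\s*/?>", re.IGNORECASE)


def self_close_sources(html):
    """Rewrite every <source ...> as <source ... />.

    Tags that are already self-closed come out unchanged apart from
    normalised spacing, so running this twice gives the same result.
    """
    return SOURCE_TAG.sub(lambda m: f"<source{m.group(1)} />", html)


if __name__ == "__main__":
    page = (
        "<video>\n"
        '<source src="/media/examples/flower.webm" type="video/webm">\n'
        '<source src="/media/examples/flower.mp4" type="video/mp4">\n'
        "<p>Sorry, your browser doesn't support embedded videos.</p>\n"
        "</video>"
    )
    print(self_close_sources(page))
```

A regex is a blunt tool for HTML, so treat this as a stopgap for simple markup like the examples here and not as a general-purpose fix.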
Picture Element
The same workaround applies to the <picture> element if you are using srcset. Close the <source> tags XHTML style like this: <source />
<picture>
<source srcset="logo-768.png 768w, logo-768-1.5x.png 1.5x" />
<source srcset="logo-480.png, logo-480-2x.png 2x" />
<img src="logo-320.png" alt="logo">
</picture>
References
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture
Hi @lukearmstrong, many thanks for reporting this and providing workarounds!
Invalid/unparsable HTML was one of the most unnecessary blockers for V6.x users, so I've taken HTML parsing out of the core for the upcoming V7. It will reappear in the form of the Advanced HTML Processor add-on, which will provide functions to strip WP footprints and perform any other transformations requiring HTML parsing. It's still a way off from a testable build, as I'm focusing on the core functions at the moment, with simple string replacement working for the majority of users.
When the time comes to implement that and close this issue off for the new add-on, I may implement something to address this, perhaps a separate add-on that forces adjusting the markup of these elements in the dev site or transforms them during processing...
Thanks again!
@lukearmstrong I've recently added some unit tests around parsing. I'll add these example inputs to them and work on a fix.