Remove invalid elements from <head> when serializing
It's fairly common for advertising or tracking scripts to inject content such as iframes or images into the head. Only metadata content elements are allowed in a head element, and iframes and images are not metadata content. (noscript is allowed in a head element, but only if it contains nothing other than link, style, and meta elements.)
When re-rendering, browsers insert an implicit </head> before invalid content, which pushes any following elements, metadata content or not, into the body. This usually results in a broken page.
We currently remove all iframes from the head since they do not influence the page visually. Only a very short list of elements is allowed in head elements: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/head#See_also
Should we start by removing images and eventually expand that to include other common elements? Or should we iterate through head elements and remove any that are not explicitly allowed?
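For discussion, here is a rough sketch of the second option (an allowlist pass over the head's children). It is not taken from the existing serializer: the names `ALLOWED_HEAD_TAGS`, `NodeLike`, `HeadLike`, and `pruneInvalidHeadContent` are hypothetical, and the allowlist is the set of metadata content elements from the HTML spec. It assumes only direct children need checking, and it does not validate what a noscript element contains.

```typescript
// Metadata content elements, the only elements valid directly inside <head>.
const ALLOWED_HEAD_TAGS: ReadonlySet<string> = new Set([
  "base", "link", "meta", "noscript", "script", "style", "template", "title",
]);

// Hypothetical minimal shapes, modeled on the DOM's nodeType/nodeName/childNodes,
// so the sketch can be exercised with plain objects.
interface NodeLike {
  nodeType: number; // 1 means element
  nodeName: string;
}

interface HeadLike {
  childNodes: ArrayLike<NodeLike>;
  removeChild(child: NodeLike): unknown;
}

// Removes every element child of `head` that is not on the allowlist and
// returns the lowercased tag names that were removed, in document order.
function pruneInvalidHeadContent(head: HeadLike): string[] {
  const removed: string[] = [];
  // Copy first: removing from a live child list while iterating it skips nodes.
  for (const child of Array.from(head.childNodes)) {
    if (child.nodeType !== 1) continue; // leave comments and text alone
    const tag = child.nodeName.toLowerCase();
    if (!ALLOWED_HEAD_TAGS.has(tag)) {
      head.removeChild(child);
      removed.push(tag);
    }
  }
  return removed;
}
```

On a cloned document this would presumably be called with the clone's head before serializing. Returning the removed tag names makes it easy to log what was pruned, which could help when a snapshot looks different from what was expected.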
@wwilsman I think we took care of this in CLI, right?
Not entirely. iframes are removed from head elements, but we don't yet prune other invalid content.