popcode icon indicating copy to clipboard operation
popcode copied to clipboard

Popcode should error when <ol> nested in <p>

Open outoftime opened this issue 7 years ago • 5 comments

N.B. Be sure to read this comment for discussion of the rather subtle nature of this problem / scope of the solution


Submitter: Xavier Email: [email protected] Popcode does not detect when a paragraph tag is not closed before the beginning of an ordered list.

Here is the input text in popcode:

Text A

  1. item 1
Text B In the preview iframe, chrome tries to correct this malformed HTML like this:

Text A

  1. item 1
Text B

Meaning "Text B" does not render as a paragraph. Chrome version: Google Chrome 70.0.3538.76 (Official Build)

outoftime avatar Dec 04 '18 21:12 outoftime

The formatting is a bit off, but I don't see that there is a bug here. When I write the following code:

        <p>Text A
        <ol>
            <li>Item 1</li>
        </ol>
        Text B

I get the following error:

On line 12:
Your <p> tag needs to be closed by a </p> tag.

wylieconlon avatar Dec 04 '18 21:12 wylieconlon

@wylieconlon I think the report is correct—check this snapshot:

<!DOCTYPE html>
<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>
        <p>
        Text before list
        <ol><li>List item</li></ol>
        Text after list
        </p>
    </body>
</html>

The Text after list, in the DOM, is not enclosed in a paragraph tag at all. Also, because this isn’t valid HTML (<p> only permits phrasing content), the DOM definitely looks quite different from the structure implied by the HTML. I agree that this should be an error in Popcode.

outoftime avatar Dec 04 '18 21:12 outoftime

Is it worth validating other cases of invalid parent-child relationships in HTML, such as not putting block tags inside inline tags? (like <div> inside <a>)? If this is a bug, then those cases are also bugs.

This is making me think that we could extract the HTML linting from popcode and make it its own repo

wylieconlon avatar Dec 05 '18 22:12 wylieconlon

Is it worth validating other cases of invalid parent-child relationships in HTML, such as not putting block tags inside inline tags? (like <div> inside <a>)? If this is a bug, then those cases are also bugs.

Agreed, and I think the answer is yes, as long as the violation is not purely academic—as it’s not in the original case, as demonstrated by adding a little CSS which transparently does not work as one would expect.

There’s already some enforcement of parent-child rules but obviously it’s not comprehensive. Not sure if we could squeeze more out of htmlInspector or if we’d need to fold it into our hand-rolled parser.

This is making me think that we could extract the HTML linting from popcode and make it its own repo

You mean runRules and friends? Definitely agree, that’s always been the long-term vision for that module.

outoftime avatar Dec 08 '18 17:12 outoftime

Actually having poked around a bit more, this is subtly different from what it looks like, and I actually think the scope is a bit narrower.

This is not actually a problem with invalid structure per se; the problem is that the <ol> tag implicitly closes the <p> tag. This is documented behavior for the <p> tag; I can’t find any other (commonly used) tags that behave this way.

So what we’re seeing is not the browser throwing up its hands and trying to make sense of invalid HTML (except for the </p> later in the document, which does not actually pair with the opening <p> and thus generates the empty paragraph node we see in the DOM).

With that in mind, I think the answer to both questions above is clearer:

  • This does not need to generalize to a rule of “enforce all valid parent/child rules in HTML”, because this isn’t actually a case of that. What we need is a rule that says “you’re not allowed to implicitly close paragraph tags”.
  • We certainly need to deal with this in our custom HTML structure validator, because that’s the only HTML validator that examines the raw structure of the HTML as written. Once the invalid HTML is run through a parser, the “<ol> implicitly closes <p> tag” rule will have already been applied, so it’ll be impossible to distinguish from a perfectly acceptable HTML string that produces the same DOM structure.

outoftime avatar Dec 08 '18 17:12 outoftime