cmark

Fix list tightness

Open taku0 opened this issue 2 years ago • 5 comments

This is a port of https://github.com/commonmark/commonmark.js/pull/269.

  • Set the end position precisely.
  • Check list tightness by comparing line numbers.
  • Remove LAST_LINE_BLANK and LAST_LINE_CHECKED flags.
  • Defer resolution of link reference definitions until list tightness is checked.
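The line-number comparison can be illustrated with a minimal sketch. It assumes every block already carries a precise start and end line; the `node` struct and the `gap_after` and `list_is_tight` helpers are hypothetical simplifications for illustration, not cmark's actual types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified block node; cmark's real node type has more fields. */
typedef struct node {
  int start_line;            /* 1-based line on which the block starts */
  int end_line;              /* 1-based line on which the block ends */
  struct node *first_child;
  struct node *next;         /* next sibling, or NULL */
} node;

/* True if at least one line lies between a block and its next sibling. */
static bool gap_after(const node *n) {
  return n->next != NULL && n->next->start_line > n->end_line + 1;
}

/* A list is treated as tight if no gap separates two consecutive items
   or two consecutive direct children of any item. */
bool list_is_tight(const node *list) {
  assert(list != NULL);
  for (const node *item = list->first_child; item != NULL; item = item->next) {
    if (gap_after(item))
      return false;
    for (const node *child = item->first_child; child != NULL; child = child->next) {
      if (gap_after(child))
        return false;
    }
  }
  return true;
}
```

Under these assumptions no per-node blank-line flag is needed, but the check is only correct if end positions exclude trailing blank lines, which is why the end position has to be set precisely.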

Comments on each commit (feel free to squash them):

  • Defer resolution of link reference definitions: We must not remove link reference definitions until we have checked list tightness. This commit defers the resolution of link reference definitions until the document is finalized. We still need to remove link reference definitions eagerly in setext headings, to determine whether it is a setext heading or a thematic break. For that reason, this commit provides slightly different functions for cmark_strbuf and for cmark_chunk, both for resolving link reference definitions and for checking whether a line is blank.

  • Remove CMARK_NODE__LAST_LINE_CHECKED flag: This flag was introduced by https://github.com/commonmark/cmark/issues/284, but we will not need it once we update S_ends_with_blank_line to not use recursion in the next commit.

  • Fix list tightness: This commit changes the list tightness checking algorithm from one based on the LAST_LINE_BLANK flag to one based on line numbers. This commit also sets the end position precisely.

    Classification of end positions:

    • The end of the current line:

      • Thematic breaks
      • ATX headings
      • Setext headings
      • Fenced code blocks closed explicitly
      • HTML blocks (pre, comments, and others)
    • The end of the previous line:

      • Fenced code blocks closed by the end of the parent or EOF
      • HTML blocks (div and others)
      • HTML blocks closed by the end of the parent or EOF
      • Paragraphs
      • Block quotes
      • Empty list items
    • The end position of the last child:

      • Non-empty list items
      • Lists
    • The end position of the last non-blank line:

      • Indented code blocks

    The first two cases are handled by finalize and the closed_explicitly flag.

    Non-empty list items and lists are handled in switch statements in finalize.

    Indented code blocks are handled by setting the end position every time a non-blank line is added to the block.
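
The classification above can be sketched as a single end-line decision made when a block is closed. This is only an illustration under stated assumptions: the `block` struct, the `block_kind` enum, and the `finalize_end_line` and `add_code_line` helpers are hypothetical and do not reproduce cmark's actual finalize code, and `closed_explicitly` is assumed to be true exactly for blocks that end on the line currently being processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block kinds, grouped as in the classification above. */
typedef enum {
  BLOCK_OTHER,          /* headings, paragraphs, block quotes, HTML, fenced code, ... */
  BLOCK_ITEM,
  BLOCK_LIST,
  BLOCK_INDENTED_CODE
} block_kind;

typedef struct block {
  block_kind kind;
  int end_line;               /* 1-based end line */
  struct block *last_child;   /* NULL if the block has no children */
} block;

/* Choose the end line of a block closed while the parser is on current_line. */
void finalize_end_line(block *b, int current_line, bool closed_explicitly) {
  assert(b != NULL);
  switch (b->kind) {
  case BLOCK_LIST:
  case BLOCK_ITEM:
    if (b->last_child != NULL)
      b->end_line = b->last_child->end_line;  /* end position of the last child */
    else
      b->end_line = current_line - 1;         /* empty item: previous line */
    break;
  case BLOCK_INDENTED_CODE:
    /* Already maintained by add_code_line; trailing blank lines are excluded. */
    break;
  default:
    b->end_line = closed_explicitly ? current_line : current_line - 1;
    break;
  }
}

/* Record a line added to an indented code block; only non-blank lines
   move the end position. */
void add_code_line(block *code, int line_number, bool line_is_blank) {
  assert(code != NULL && code->kind == BLOCK_INDENTED_CODE);
  if (!line_is_blank)
    code->end_line = line_number;
}
```

In this sketch a paragraph interrupted on line 5 ends on line 4, while a fenced code block whose closing fence is on line 5 ends on line 5.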

Benchmark:

  • master branch: mean = 0.1560, median = 0.1550, stdev = 0.0070

  • this branch: mean = 0.1610, median = 0.1600, stdev = 0.0032

taku0, Aug 17 '23 05:08

Actually, this branch sets end columns more accurately than https://github.com/commonmark/commonmark.js/pull/269 in corner cases. I will update commonmark.js when this PR is merged.

taku0, Aug 17 '23 05:08

make leakcheck fails, but it also fails on the master branch with the same errors.

taku0, Aug 17 '23 05:08

Excellent! What errors is make leakcheck failing with? We routinely run it as part of CI and it doesn't fail there. EDIT: Ah, I see the failures in CI. But the last commit from master succeeds on that same check...

jgm, Aug 18 '23 23:08

Sorry, I forgot to run make after switching branches. I'll investigate the leak.

taku0, Aug 19 '23 00:08

I have fixed the leak and the CI is now all green.

taku0, Aug 19 '23 01:08