commonmark-java

ThematicBreak literal is lost in the Markdown renderer

Open • jumale opened this issue 1 year ago • 1 comment

According to the specification, a thematic break consists of 3 or more matching -, _ or * characters with 0-3 leading spaces (roughly the regex ^\s{0,3}[-_*]{3,}$). Parsing handles this correctly: thematic breaks are recognised and captured as a ThematicBreak node whose literal holds the actual text from the Markdown. During rendering, however, the literal is dropped and replaced with a static ___.
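The regex above is a simplification: the CommonMark spec also requires all marker characters in a break to be the same, and allows spaces or tabs between them. A minimal sketch of a stricter check in plain Java follows. The class and method names are hypothetical, and this is not how commonmark-java itself detects thematic breaks.

```java
import java.util.regex.Pattern;

public class ThematicBreakCheck {
    // Up to 3 leading spaces, then 3 or more of the SAME marker (-, _ or *),
    // each marker optionally followed by spaces or tabs.
    private static final Pattern THEMATIC_BREAK = Pattern.compile(
            "^ {0,3}(?:(?:-[ \\t]*){3,}|(?:_[ \\t]*){3,}|(?:\\*[ \\t]*){3,})$");

    // Returns true if the given single line (without line terminator)
    // looks like a thematic break under the rules described above.
    static boolean isThematicBreak(String line) {
        return THEMATIC_BREAK.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isThematicBreak("   *******")); // true
        System.out.println(isThematicBreak("---"));        // true
        System.out.println(isThematicBreak("- - -"));      // true
        System.out.println(isThematicBreak("    ---"));    // false (4 leading spaces)
        System.out.println(isThematicBreak("-*-"));        // false (mixed markers)
    }
}
```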

Expected behaviour: this markdown looks the same after parse/render

foo

   *******

bar

---

baz

Actual behaviour: the renderer transforms it into

foo

___

bar

___

baz

jumale • Jul 01 '24 07:07

Yeah. MarkdownRenderer does not yet preserve everything from the input; its main focus is on producing an equivalent document (___ is also a thematic break).

Would you want to try to make that change yourself and raise a PR? Looks like you've already found the right place where the fix should go :). I think in this case, if the node has a literal, we can use it, otherwise use ___ to not be ambiguous with lists.

robinst • Jul 01 '24 13:07
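The fallback suggested in the reply above (use the node's literal when it has one, otherwise emit ___) could look roughly like this. This is a standalone sketch, not the actual MarkdownRenderer code: the class, constant and method names are hypothetical, and the literal is passed in as a plain String rather than read from a ThematicBreak node.

```java
public class ThematicBreakText {
    // Fallback for nodes without source text; ___ cannot be confused with a list marker.
    static final String DEFAULT_BREAK = "___";

    // literal: the text captured by the parser, or null/blank when the node
    // was built programmatically and carries no source text (assumption).
    static String textFor(String literal) {
        if (literal == null || literal.trim().isEmpty()) {
            return DEFAULT_BREAK;
        }
        return literal;
    }

    public static void main(String[] args) {
        System.out.println(textFor("*******")); // *******
        System.out.println(textFor("---"));     // ---
        System.out.println(textFor(null));      // ___
    }
}
```

With this rule, the example from the issue would keep ******* and --- on output, while nodes created without a literal would still render as ___.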