
Non-alphanumeric with format is not properly parsed when connected to an alphanumeric string

Open tomerlichtash opened this issue 1 year ago • 1 comments

When formatted non-alphanumeric content is immediately followed by an alphanumeric character, the whole string is tokenized as plain text nodes instead of a series of format nodes as expected.

The problem reproduces on the CommonMark online demo: paste **@**A there and compare the result with **@** A.

Example: all of these samples are tokenized as expected:

**@**@ => formatted non-alphanumeric + non-alphanumeric
@**@** => non-alphanumeric + formatted non-alphanumeric
@**A** => non-alphanumeric + formatted alphanumeric
**A** @ => formatted alphanumeric + space + non-alphanumeric

This sample, however, is tokenized into plain text nodes and the formatting is not parsed: **@**A (formatted non-alphanumeric + alphanumeric)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text>**</text>
    <text>@</text>
    <text>**</text>
    <text>A</text>
  </paragraph>
</document>

Add a space between the formatted non-alphanumeric and the alphanumeric character and compare the tokenization for the string **@** A:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <strong>
      <text>@</text>
    </strong>
    <text> A</text>
  </paragraph>
</document>
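
The difference between **@**A and **@** A appears to follow from the spec's left-flanking and right-flanking rules for delimiter runs. Below is a minimal sketch of those rules for `*` runs only, assuming the Unicode whitespace and punctuation definitions as I understand them from the current spec (P and S general categories count as punctuation). The helper names (`is_whitespace`, `is_punctuation`, `flanking`, `star_runs`) are hypothetical and this is not the reference implementation.

```python
import unicodedata


def is_whitespace(ch):
    # Start and end of line are passed as None and count as whitespace.
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    # Assumption: punctuation means Unicode general category P* or S*.
    return ch is not None and unicodedata.category(ch)[0] in ("P", "S")


def flanking(before, after):
    """Return (left_flanking, right_flanking) for a delimiter run."""
    left = not is_whitespace(after) and (
        not is_punctuation(after)
        or is_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before)
        or is_whitespace(after)
        or is_punctuation(after)
    )
    return left, right


def star_runs(text):
    """Return (can_open, can_close) for each run of '*' in text.

    For '*', a run can open emphasis if it is left-flanking and can
    close emphasis if it is right-flanking.
    """
    results = []
    i = 0
    while i < len(text):
        if text[i] != "*":
            i += 1
            continue
        j = i
        while j < len(text) and text[j] == "*":
            j += 1
        before = text[i - 1] if i > 0 else None
        after = text[j] if j < len(text) else None
        results.append(flanking(before, after))
        i = j
    return results


for sample in ["**@**A", "**@** A", "**@**@"]:
    print(sample, star_runs(sample))
```

Under these assumptions, the second `**` in **@**A is preceded by punctuation and followed by a letter, so it is not right-flanking and cannot close the strong emphasis. With a space or another punctuation character after it, the same run can close.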

tomerlichtash, Jul 22 '24 12:07

Are you claiming that the parser doesn't properly implement the spec, or are you suggesting a change to the spec? If the latter, please examine the current rules and be specific about the change you'd recommend, recognizing that any change that "fixes" this case may break other things.

Unfortunately, the way commonmark / Markdown is designed, it is difficult to avoid some "blind spots" like this. See my essay Beyond Markdown, item 1. My project https://djot.net attempts to fix some of these issues.

jgm, Jul 28 '24 16:07