[cmark --smart] paren-quote-markup combination
A plausible occurrence in a document is ("text"), which cmark --smart correctly turns into
(“text”)
However, if we emphasize the text,
("*text*")
we get
(”<em>text</em>”)
Note the incorrect right quote after the opening paren.
The same goes for the combination of a paren, a quote and other markup, such as strong emphasis and references:
("**text**") --> (”<strong>text</strong>”)
("[text]") --> (”<a href=...>text</a>”)
Adding some diagnostics:
% ./build/src/cmark --smart
("*text*")
char = ", can_open = 0, can_close = 1
char = *, can_open = 1, can_close = 0
char = *, can_open = 0, can_close = 1
char = ", can_open = 0, can_close = 1
<p>(”<em>text</em>”)</p>
So the problem is that the opening " character is marked as can_close but not can_open.
Further investigation reveals:
char = ", left_flanking = 1, right_flanking = 1
Now let's look at the logic at src/inlines.c, line 444:
} else if (c == '\'' || c == '"') {
*can_open = left_flanking && !right_flanking &&
before_char != ']' && before_char != ')';
*can_close = right_flanking;
So a quote character is marked as can_open only if it is left flanking and not right flanking. In this case the " is both left and right flanking, so it isn't marked as can_open.
It's both left and right flanking because it sits between two punctuation characters: the ( before it and the * after it.
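To make that concrete, here is a minimal sketch of the flanking tests, assuming ASCII input only and using isspace/ispunct in place of cmark's Unicode-aware helpers. The names flanking, classify, old_quote_can_open and check_paren_quote_case are made up for illustration and are not cmark functions; the can_open rule is copied from the snippet above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical result type, not part of cmark. */
typedef struct {
  bool left_flanking;
  bool right_flanking;
} flanking;

/* ASCII-only approximation of the flanking tests for a one-character
 * delimiter run, given the characters before and after it. */
static flanking classify(unsigned char before_char, unsigned char after_char) {
  flanking f;
  f.left_flanking = !isspace(after_char) &&
                    (!ispunct(after_char) || isspace(before_char) ||
                     ispunct(before_char));
  f.right_flanking = !isspace(before_char) &&
                     (!ispunct(before_char) || isspace(after_char) ||
                      ispunct(after_char));
  return f;
}

/* The can_open rule for quotes as quoted above (before any fix). */
static bool old_quote_can_open(unsigned char before_char,
                               unsigned char after_char) {
  flanking f = classify(before_char, after_char);
  return f.left_flanking && !f.right_flanking && before_char != ']' &&
         before_char != ')';
}

/* Opening quote of ("*text*"): before is '(' and after is '*'. */
static void check_paren_quote_case(void) {
  flanking f = classify('(', '*');
  assert(f.left_flanking && f.right_flanking); /* both flanking */
  assert(!old_quote_can_open('(', '*'));       /* so it cannot open */
  assert(old_quote_can_open('(', 't'));        /* ("text") is fine */
}
```

Calling check_paren_quote_case() from a main function should pass without tripping any assertion, which matches the diagnostics above: the quote in ("text") can open, but the one in ("*text*") cannot.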
We may need to tweak the logic here and add more test cases.
This change fixes the issue:
diff --git a/src/inlines.c b/src/inlines.c
index e6b491f..fb7d2e4 100644
--- a/src/inlines.c
+++ b/src/inlines.c
@@ -439,8 +439,9 @@ static int scan_delims(subject *subj, unsigned char c, bool *can_open,
*can_close = right_flanking &&
(!left_flanking || cmark_utf8proc_is_punctuation(after_char));
} else if (c == '\'' || c == '"') {
- *can_open = left_flanking && !right_flanking &&
- before_char != ']' && before_char != ')';
+ *can_open = left_flanking &&
+ (!right_flanking || before_char == '(' || before_char == '[') &&
+ before_char != ']' && before_char != ')';
*can_close = right_flanking;
} else {
*can_open = left_flanking;
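As a rough illustration of what the patch changes, the two can_open rules can be written as pure functions of the flanking flags and before_char. This is only a sketch: can_open_before_patch, can_open_after_patch and check_patch_effect are hypothetical names, not functions in cmark, and the flag values passed in are taken as given rather than computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Rule for ' and " before the patch. */
static bool can_open_before_patch(bool left_flanking, bool right_flanking,
                                  unsigned char before_char) {
  return left_flanking && !right_flanking && before_char != ']' &&
         before_char != ')';
}

/* Rule after the patch: a quote that is also right flanking may still
 * open if it directly follows '(' or '['. */
static bool can_open_after_patch(bool left_flanking, bool right_flanking,
                                 unsigned char before_char) {
  return left_flanking &&
         (!right_flanking || before_char == '(' || before_char == '[') &&
         before_char != ']' && before_char != ')';
}

static void check_patch_effect(void) {
  /* Opening quote of ("*text*"): both flags set, preceded by '('. */
  assert(!can_open_before_patch(true, true, '('));
  assert(can_open_after_patch(true, true, '('));
  /* Left flanking only (as in ("text")): unchanged by the patch. */
  assert(can_open_before_patch(true, false, '('));
  assert(can_open_after_patch(true, false, '('));
  /* Both flags set but preceded by '*': still cannot open. */
  assert(!can_open_after_patch(true, true, '*'));
}
```

The only inputs whose result changes are those where the quote is both left and right flanking and directly follows ( or [, so the patch should not affect quotes in other positions.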
I'm not closing this yet, because we need to
- [ ] improve test/smart_punct.txt with much more accurate documentation of the rules, and further test cases
- [ ] commit the code change above
- [ ] add issues to commonmark-hs and commonmark.js so that comparable changes can be made there