[markdown rendering issue] Italics do not work when words are stuck together
Hi iliakan. I'm participating in the javascript.ko project, and I found a Markdown rendering issue. It may only occur in Korean documents. See the screenshot below.

Korean sometimes has to stick words together. I hope this helps fix the issue.
Right now italic requires spaces on both sides of `*`. So in expressions like `5*2` the star is not considered a special markup character.
I can tweak this rule, but how exactly? We don't want the star `*` to be mistreated in other situations.
Also, is this a real problem? Can you rephrase the Korean?
@Violet-Bora-Lee what do you think?
Until now, I've added a whitespace to avoid this problem. Inserting whitespace there is actually incorrect spelling in Korean.
Is it easy to implement? If not, I could add some guidance to the README file.
Hi,
The question is deeper than one might think.
Some time ago we rewrote our parser to base it on https://github.com/markdown-it/markdown-it, which implements the CommonMark specification.
The italic/bold handling is done by that parser, according to the spec.
I decided to see what the CommonMark spec says about that case.
At https://commonmark.org/help/tutorial/02-emphasis.html I entered `*마크다운(script)*렌더링` (copy-pasted arbitrary Korean characters, as I have no Korean on my keyboard).
And it remained `*마크다운(script)*렌더링` (as is); CommonMark didn't convert it.
So if we want it to be converted, we need to tell the people who maintain the CommonMark specification about it. Then they can hopefully update the spec, the parser will follow, and everyone's happy.
I suggest going to https://talk.commonmark.org/ and creating a topic there, such as "BUG in Korean", describing the issue. Maybe there's a way out.
P.S. Please note: the problem occurs only if I put "(script)" in the phrase. Maybe `)` has something to do with it.
This is not a bug, but an unfortunate (and arguably bad) design choice made by CommonMark. (See the spec.)
tldr: use `<em>스크립트(script)</em>라고` or, if you really want to use Markdown syntax, `*스크립트(script)*&#8203;라고`.
To parse nested emphasis such as `*emphasized **(strong)** text*` efficiently (i.e., without having to look for pairing delimiters), CommonMark decides whether each `*`/`**` is an opening or a closing delimiter by a heuristic based on the character that precedes the delimiter and the character that follows it.
Unfortunately, the heuristic is far from perfect.
| Markdown | Rendering |
|---|---|
| `**super-*wo*-man**` | <strong>super-<em>wo</em>-man</strong> |
| `*super-*woman` | \*super-\*woman |
(Example drawn from https://github.com/commonmark/commonmark-spec/issues/643)
The `*` in `-*w` is always parsed as an opening delimiter, which makes `*super-*woman` render "incorrectly." (However, this is the intended behavior of CommonMark. It is part of the spec.)
In practical English or European text, a case like `*super-*woman` is almost nonexistent; one would naturally write `*super*-woman` instead. In CJK text, however, this unintended side effect of the heuristic is a real issue, and it has been reported to the CommonMark side several times since at least 2016: link 1 (Japanese), link 2 (Chinese).
More regrettably, it seems that the development of CommonMark has stagnated, so the best bet right now is to use one of the following workarounds:
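To make the heuristic concrete, here is a minimal Python sketch of how I read the spec's "left-flanking" and "right-flanking" rules for `*`. It is a simplification, not markdown-it's actual code: the names `is_whitespace`, `is_punctuation`, and `classify` are made up for illustration, and the punctuation test (Unicode categories P and S) is an approximation of the spec's definition.

```python
import unicodedata


def is_whitespace(ch: str) -> bool:
    # The start or end of the line (passed as "") counts as whitespace.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch: str) -> bool:
    # Approximation: Unicode general categories P* (punctuation) and S* (symbol).
    return ch != "" and unicodedata.category(ch)[0] in ("P", "S")


def classify(text: str, i: int) -> tuple[bool, bool]:
    """Return (can_open, can_close) for the run of `*` that contains text[i]."""
    start = i
    while start > 0 and text[start - 1] == "*":
        start -= 1
    end = i
    while end < len(text) and text[end] == "*":
        end += 1
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""

    # Left-flanking: may open emphasis.
    left = not is_whitespace(after) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    # Right-flanking: may close emphasis.
    right = not is_whitespace(before) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right


if __name__ == "__main__":
    for sample in ("*super-*woman", "*스크립트(script)*라고", "*스크립트*라고"):
        print(sample, classify(sample, sample.rindex("*")))
```

Under these assumptions the last `*` in both `*super-*woman` and `*스크립트(script)*라고` comes out as `(True, False)`, i.e. it can only open emphasis, because it sits between a punctuation character and a letter. Without the parentheses (`*스크립트*라고`) it comes out as `(True, True)`, which matches the observation that the problem only appears when `)` precedes the star.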
- Use a zero-width space (ZWSP, U+200B) character reference, `&#8203;`: `*스크립트(script)*&#8203;라고`. If you wish, you may use the more descriptive equivalent `&ZeroWidthSpace;` or the hex representation `&#x200B;`. Note that, although it is invisible to humans, the zero-width space is still rendered to HTML as an actual Unicode character, which is undesirable for text searching, etc.
- Use a raw HTML tag: `<em>스크립트(script)</em>라고`. (Markdown supports inline HTML tags.)
Personally, I would use the second option.
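If someone prefers to apply the first workaround mechanically rather than by hand, a rough Python sketch could look like the following. The helper name `patch_closing_emphasis` is hypothetical, and the pattern is deliberately narrow: it only handles `)` followed by `*` or `**` and then a Hangul syllable, which is the case reported in this issue. It cannot tell an intended closing star from an intended opening one, so review the result before committing it.

```python
import re

# `)` followed by `*` or `**`, directly followed by a Hangul syllable (U+AC00..U+D7A3).
_CLOSER_BEFORE_HANGUL = re.compile(r"(\)\*{1,2})(?=[\uac00-\ud7a3])")


def patch_closing_emphasis(markdown: str) -> str:
    """Insert the `&#8203;` character reference after `)*` / `)**` before Hangul."""
    return _CLOSER_BEFORE_HANGUL.sub(r"\1&#8203;", markdown)


if __name__ == "__main__":
    print(patch_closing_emphasis("*스크립트(script)*라고"))
```

The idea is that the inserted `&` is a punctuation character, so the star ends up between two punctuation characters and is allowed to close emphasis.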