Martin Hofmann
Would [`flex`][flex] provide a better trade-off between convenience and amount of generated source code? (IIRC `flex` uses tables to represent DFAs, not thousands of `if () goto` lines.) [flex]: https://github.com/westes/flex...
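To make the contrast in that question concrete, here is a minimal sketch of a table-driven DFA in C. It is not output from `flex` or any other generator; the state names, character classes, and the `match_digits` function are hypothetical, chosen only to show how a transition table can stand in for a long chain of `if () goto` branches. The toy automaton recognises the pattern `[0-9]+`.

```c
#include <stddef.h>

/* Hypothetical character classes for a toy scanner. */
enum { CLS_DIGIT, CLS_OTHER, NUM_CLASSES };

/* Hypothetical states: start, inside a run of digits, dead (reject). */
enum { ST_START, ST_DIGITS, ST_DEAD, NUM_STATES };

/* The whole automaton is data: one row per state, one column per class. */
static const unsigned char transition[NUM_STATES][NUM_CLASSES] = {
    /* ST_START  */ { ST_DIGITS, ST_DEAD },
    /* ST_DIGITS */ { ST_DIGITS, ST_DEAD },
    /* ST_DEAD   */ { ST_DEAD,   ST_DEAD },
};

/* Map an input byte to its character class. */
static int classify(unsigned char c)
{
    return (c >= '0' && c <= '9') ? CLS_DIGIT : CLS_OTHER;
}

/* Return the length of the longest prefix of s (at most n bytes)
 * that matches [0-9]+, or 0 if there is no match. */
size_t match_digits(const char *s, size_t n)
{
    int state = ST_START;
    size_t matched = 0;

    for (size_t i = 0; i < n; i++) {
        state = transition[state][classify((unsigned char)s[i])];
        if (state == ST_DEAD)
            break;
        if (state == ST_DIGITS)   /* the only accepting state */
            matched = i + 1;
    }
    return matched;
}
```

Adding a pattern to a scanner of this shape grows the table rather than the number of branches in the code, which is the trade-off the question is about.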
While I find the ~30000-line (~400 KB) `scanners.c` a bit hefty, the tools do handle it easily. I once looked into a linker map file and found that...
Specifying the "result" of parsing and interpreting a _CommonMark_ input text **not** in the form of an output HTML text is certainly a good idea. The specification should instead describe...
Is "Unicode scalar value" the same as "code point of a Unicode _character_" then? It is my understanding that high and low surrogate code points in the BMP are _not_...
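For reference, the Unicode Standard defines a scalar value as any code point except the high- and low-surrogate code points, that is, U+0000..U+D7FF and U+E000..U+10FFFF. A minimal sketch of that distinction in C follows; the function names are hypothetical and not taken from any CommonMark implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Code points span U+0000..U+10FFFF. */
bool is_code_point(uint32_t cp)
{
    return cp <= 0x10FFFF;
}

/* Surrogate code points occupy U+D800..U+DFFF, inside the BMP. */
bool is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* A scalar value is any code point that is not a surrogate. */
bool is_scalar_value(uint32_t cp)
{
    return is_code_point(cp) && !is_surrogate(cp);
}
```

Under this definition `is_scalar_value(0xD800)` is false while `is_scalar_value(0xE000)` is true, so a well-formed UTF-8, UTF-16, or UTF-32 decoder only ever yields values for which the function returns true.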
I don't understand this sentence:

> No Unicode scalar values are exactly the Unicode code points minus the surrogate characters, the latters being a hack to be able to encode...
> Concretely what most programmers are interested in when they are dealing with text interchange are scalar values, that's what they have to process, decode and encode to UTF-X formats....
> This doesn't happen if you have a proper UTF-X decoder API. Again you will never get surrogates code points out of an UTF-X decoding process.

I never doubted that....
> just note that "the legal characters of Unicode" is not a concept you can find formally defined in the Unicode Standard and would thus make the common mark spec...
I'm still waiting for your explanation of what exactly is "broken", "undefined", "ambiguous", "without precise meaning" in the given definition for _Char_, which obviously _is_ what "legal Unicode characters" is...
Looks good so far!

> Here's one tricky issue that came up. Ideally, one would leave entities alone in link titles, rather than converting them to characters, at least if...