Parsing of HTML in markdown possible?
Hi,
I would like to be able to inline HTML in Markdown. I understand you disable this, for security reasons I guess, but could it be made configurable somehow?
I changed the code in nextjournal/markdown/parser.clj at line 415 to:
(defmethod apply-token "html_block" [doc {inlined-html :content}] (push-node doc {:type :text :text inlined-html}))
and it works great. What do you think? Thanks for that well-designed and very useful lib. Jérémie.
Hi @jgrodziski,
glad you find it useful :-).
We haven't needed inline HTML so far. If it's needed for rendering purposes, I guess a user could add apply-token method implementations directly in their project (as you do above) and add renderer functions to the hiccup conversion context under the appropriate types, as in:
(nextjournal.markdown.transform/->hiccup
 (assoc nextjournal.markdown.transform/default-hiccup-renderers :html your-fn)
 markdown-data)
My use case is a static site generator. I've recently changed my blogs to use nextjournal.markdown instead of markdown-clj (rationale in this Mastodon thread) and it's working great... except that it broke most images on my blog, because they appear as <img> tags (rather than the native Markdown syntax for images) in the Markdown sources of the existing posts.
I'd say the current behaviour makes nextjournal.markdown violate the CommonMark spec. If security is the reason, I'd still make the parser emit HTML nodes by default, but have them ignored in the ast->hiccup transformer.
BTW, thank you for the fantastic library! :)
Hi @nathell
> I'd still make the parser emit HTML nodes
Right, that shouldn't do any harm.
Since it's been asked again, here's a temporary solution until we handle HTML internally.
(ns scratch.markdown-html
  (:require [nextjournal.markdown :as md]
            [nextjournal.markdown.parser :as md.parser]))

(defmethod md.parser/apply-token "html_inline" [doc {html-content :content}]
  (md.parser/push-node doc {:type :html-inline :text html-content}))

(defmethod md.parser/apply-token "html_block" [doc {html-content :content}]
  (md.parser/push-node doc {:type :html-block :text html-content}))

(md/parse "# HTML Handling

<img src=\"https://www.example.com/image1.jpg\" alt=\"High-Efficiency Antenna\">

some <span class='gorgeous'>text</span> inlined

<aside>this is valid commonmark</aside>
")
;; =>
{:toc {:type :toc,
       :children [{:type :toc,
                   :content [{:type :text, :text "HTML Handling"}],
                   :heading-level 1,
                   :attrs {:id "html-handling"},
                   :path [:content 0]}]},
 :footnotes [],
 :content [{:type :heading,
            :content [{:type :text, :text "HTML Handling"}],
            :heading-level 1,
            :attrs {:id "html-handling"}}
           {:type :html-block,
            :text "<img src=\"https://www.example.com/image1.jpg\" alt=\"High-Efficiency Antenna\">\n"}
           {:type :paragraph,
            :content [{:type :text, :text "some "}
                      {:type :html-inline, :text "<span class='gorgeous'>"}
                      {:type :text, :text "text"}
                      {:type :html-inline, :text "</span>"}
                      {:type :text, :text " inlined"}]}
           {:type :html-block, :text "<aside>this is valid commonmark</aside>\n"}],
 :type :doc,
 :title "HTML Handling"}
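For the rendering side, here is a minimal sketch of how the :html-block and :html-inline nodes could be handed to ->hiccup, along the lines of the renderer-map approach mentioned earlier in this thread. The namespace name, the two renderer functions and renderers-with-html are hypothetical names made up for illustration. The sketch assumes that renderer functions are called with the context and the node, and that the apply-token methods for the HTML tokens have been loaded.

```clojure
(ns scratch.markdown-html-render
  (:require [nextjournal.markdown :as md]
            [nextjournal.markdown.transform :as md.transform]))

;; Hypothetical renderer for :html-block nodes: puts the raw HTML string
;; inside a :div. Assumption: renderers receive [ctx node].
(defn html-block-renderer [_ctx {:keys [text]}]
  [:div text])

;; Hypothetical renderer for :html-inline nodes. Opening and closing tags
;; arrive as separate nodes, so each fragment is wrapped on its own.
(defn html-inline-renderer [_ctx {:keys [text]}]
  [:span text])

;; Extend the default renderers with the two node types emitted by the
;; apply-token methods for "html_block" and "html_inline".
(def renderers-with-html
  (assoc md.transform/default-hiccup-renderers
         :html-block html-block-renderer
         :html-inline html-inline-renderer))

(md.transform/->hiccup
 renderers-with-html
 (md/parse "some <span class='gorgeous'>text</span> inlined"))
```

Whether the HTML string ends up escaped or emitted verbatim depends on the library that later turns the hiccup into an HTML string, so for a static site generator the renderers would probably need that library's raw-string facility instead of a plain string child.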