Header anchors [needs discussion]
It's very popular request to add header ancors. Prior to do it, we need to discuss possible security problems and solutions.
Read first
- https://github.com/jch/html-pipeline/pull/111#issuecomment-34369984 - discussion about github implementation
- http://www.slideshare.net/x00mario/in-the-dom-no-one-will-hear-you-scream - awesome presentation about dom clobbering
Problems:
-
id- collisions -
name- dom clobbering - cross-conflicts when multiple docs on the same page have the same headers
Possible solutions
- Do nothing
- unsafe, you need to control content, or site will be vulnerable
- Add prefix
- will require js to keep references work
- without js will make manual references typing not convenient
- does anyone type such way?
- not a problem for autogenerated tocs (we can add prefix to both anchors and refs)
- Add per-doc unique prefix
- not convenient in use. required in very limited cases
Need to discuss better solutions, and what to do by default, because anchors are really needed
current status
- Must have not empty default prefix
-
options.anchorPrefix- instance default.env.anchorPrefix- every-time-render override
-
- Open questions:
- default prefix name?
-
-or--of no objections (short and easy to type)
-
- should we autofix local relative anchor links? any bad side effects?
- default prefix name?
add default perfix to make id unique? like prefixId="mdit-"
# this is h1 => id="mdit-this-is-h1"
Use can set perfixId to empty string, got id="this-is-h1"
It's the most possible solution. Don't know how many expected things it will break. Also adds a second question - should we try to autofix links with anchors only (add prefix) or not.
What an about an option for the prefix, which is on by default? So if anyone doesn't want their anchors autofixed, they can turn it off?
I create a locally implementation for zeke/marky-markdown/pull/6
What an about an option for the prefix, which is on by default? So if anyone doesn't want their anchors autofixed, they can turn it off?
Yes. According to http://www.slideshare.net/x00mario/in-the-dom-no-one-will-hear-you-scream id/name without prefix will add unlimited vectors for attacs. No prefix by default would be mad, even with sanitizers. So, we have 2 more questions to answer:
- default prefix name? (something neutral, easy to type)
- Should we try to automatically fix links to anchors (add prefix if not exists) or not? Will such magic add side effects? Is is really required?
How about follow the github way? github use user-content- perfix
If we add such perfix to links too, those will look ugly. I'd prefer something like - or --.
Updated current status in the first post. If you know other people, interested in solution and having practice with this topic - invite them to join. Correct and convenient resolution is important.
I don't have enougth usage stat, and don't wish to push all with my opinion.
/cc @jonathanong
@jonathanong old friend...
/cc @dead-horse
Make it a plugin and don't worry about it? Haha
I'm okay with a custom prefix though
IMHO it should be a plugin. And, it needs some options (how it will generate IDs). I propose to support 3 variants of IDs generation. Assume we have markdown text:
# First heading
## Nice, second heading
## Another "nested" heading
# Here we go again
First variant: slugify. For the markdown above it will generate IDs like this:
#first-heading
#nice-second-heading
#another-nested-heading
#here-we-go-again
Second is actually slugify with prefix option (default: no prefix):
#my-prefix-first-heading
#my-prefix-nice-second-heading
#my-prefix-another-nested-heading
#my-prefix-here-we-go-again
Third one (I would go with this one by default):
#section-1
#section-1.1
#section-1.2
#section-2
Also, prefix option actually can affect both variants. So I imagine optiosn of a plugin like this:
{
style: "slugify", // possible values: "slugify", "numerify"
prefix: "section-" // default: no prefix
}
IMHO it should be a plugin. And, it needs some options (how it will generate IDs).
I think, that's unnecessary overcomplication. That will add more complexity than whole renderer function override. User can always do it, it (h|sh)e doesn't like default id generation method.
Though i can add Renderer.anchorify() for easy override.
Looks like it will be better to add postfix instead of prefix. I think, it's better to rename option to .anchorAdd (more neutral):
-
null- no anchors (idea only, is it needed at all?) -
-- default
Possible transformation logic:
- take original content of header (from next inline token)
- toLowerCase, trim & replace
/[\s\]\[\!\"\#\$\%\&\'\(\)\*\+\,\.\/\:\;\<\=\>\?\@\\\^\_\{\|\}\~\-]/gwith- - add
.anchorAddto tail - encodeURI (replace + encode - for non-english languages support)
- put to
nameattribute (duplicatednameare allowed). Or still better to use id?
id attribute better.
@fengmk2 could you explain? name is unusual for you? There should be no difference, when you click anchor link.
name- dom clobbering
this problem can be fixed?
this problem can be fixed?
Only with adding prefix / postfix to value
http://talk.commonmark.org/t/feature-request-automatically-generated-ids-for-headers/115/39?u=vitaly
## Episode 1
### Scene 1
### Scene 2
## Episode 2
### Scene 1
### Scene 2
Seems we have to care about deduplication.
Status of this issue is still unclear. No news on CommonMark forum.
I tend to do nothing in flavour of 4.+ release, where renderer will allow easy attributes addition. Something like:
header_token.attrs.push([ 'id', my_custom_id])
Good news is, that next milestone will start very soon.
IDEA: idTemplate (prefix) can be defined as user_custom_% or user_custom_%_%%. Then % will be replaced by content, and %% will be replaced by unique doc id (if needed). Such stupid microtemplating will not cost much, and remove restrictions on format.
This is how I wound up implementing the feature in one of my projects, but it doesn't support duplicate header contents yet:
https://github.com/danielgtaylor/aglio/blob/olio-theme/src/main.coffee#L43-L49
I'd say there is definitely a real need for such a feature.
@danielgtaylor in v4+ you can push new attr, without full renderer override. See updated doc
As i said, i have no objection about this feature. Problem is in final conventions (including deduplication logic) and commonmark spec. Since that's not urgent (because easy to customize), i'd prefer to wait.
I didn't read the entire thread but my 2 cents are: make something simple with ids. For a quick implementation on a website, I used dashified title + incremented value https://github.com/putaindecode/putaindecode.fr/blob/7c1e08ddf9f9d5d25dd0a877259d718ef57a3f69/scripts/marked.es#L77-L114
Result: http://putaindecode.fr/posts/entreprendre/auto-entrepreneuriat-retour-experiences/ You will see a lot of title with the same name & counter right after it. Works great in the case (simple & stupid ^^)
@MoOx it's not a big problem to "code something", but if we talk about core feature or plugin in this org, it's important to care about security and all possible use cases.
When you do custom solution for your project - you have less requirements, because you know exactly what happens in your environment. Custom modification is not a problem, because markdown-it was specially designed for that.
There is a plugin for this now, which ads IDs and can add anchors (links) to either side of the heading text:
https://github.com/valeriangalliat/markdown-it-anchor
@danielgtaylor , I wonder, why ids, and not names (as in <a name="">)? I'm worried that ids could potentially collide with other ids on the page used by javascript.
Also, nitpick: usage should start with const md = require('markdown-it')(), so readme is missing ().
You have two other plugin choices as well: https://github.com/leff/markdown-it-named-headers https://github.com/adam-p/markdown-it-headinganchor
FYI, I just published this plugin https://github.com/MoOx/markdown-it-toc-and-anchor
11147