marked icon indicating copy to clipboard operation
marked copied to clipboard

Extension not triggering

Open nrivard opened this issue 1 year ago • 5 comments

Marked version:

Describe the bug I have created an extension to try and detect runs of newlines but neither source function nor tokenizer function ever seem to get called.

I am using like:

export class Markdown {
  #marked = new Marked().use({
    extensions: [newlineExtension]
  });

  // other stuff in my class
}

To Reproduce

  1. Define your extension as follows:
export const newlineExtension: TokenizerExtension = {
    name: 'newline',
    level: 'block',
    start(src) {
        return src.match(/[\n]/)?.index
    },
    tokenizer(src, tokens) {
        const match = src.match(/^[\n]{2,}$/);
        if (match && match.length > 0) {
            return {
                type: 'newline',
                raw: match[0]
            }
        }
        return undefined;
    }
};
  1. Pass your extension on to marked:
export class Markdown {
  #marked = new Marked().use({
    extensions: [newlineExtension]
  });

  // other stuff in my class
}
  1. Pass the following string in:
const multiLine = `
First line





Second line
`

In this case I am still getting space with raw: '\n\n\n\n\n\n\n' as the token type.

Expected behavior I would expect my extension function(s) to get called a single newline token to get created

nrivard avatar Aug 23 '24 13:08 nrivard

Thank you for the detailed issue!

It looks like the issue is $ in your matching regex. That will only match if the new lines are at the end of the string (if Second line line was removed) If you change the matching regex to /^[\n]{2,}/ it works.

UziTech avatar Aug 23 '24 14:08 UziTech

Ok but even when I change the regex, neither start nor tokenizer function ever seem to get called.

export const newlineExtension: TokenizerExtension = {
  name: 'newline',
  level: 'block',
  start(src) {
    return src.match(/\n/)?.index;
  },
  tokenizer(src, tokens) {
    const match = src.match(/^[\s]*\n[\s]*\n[\s]*/);
    if (match) {
      return {
        type: 'newline',
        raw: match[0]
      };
    }
    return undefined;
  }
};

export class Markdown {
  #marked = new Marked().use({
    extensions: [newlineExtension]
  });

  constructor() {}
  
  parse(src: string): AnyComponent {
    // these are custom
    const renderer = new MarkdownRenderer();
    const parser = new MarkdownParser({renderer: renderer});

    const tokens = this.#marked.lexer(src, {async: false, gfm: true});
    const components = tokens.length ? parser.parse(tokens) : [];
    return components.length ? <Column>{components}</Column> : <Blank />;
  }
}

One wrinkle though is that I'm only using the lexer...I have a custom parser and custom renderer objects that return something other than a string so I'm calling the above class like:

const output = Markdown.parse(str)

Is it possible tokenizer is never invoked if i'm just using the lexer?

nrivard avatar Aug 27 '24 10:08 nrivard

I see, the options object that you pass to the lexer is the full options object that is used. It does not get combined with the default options that contains the extension (for legacy reasons). Since both of the options you set are defaults you can just not have the options object or pass an object combined with this.#marked.defaults

this.#marked.lexer(src);
// or
this.#marked.lexer(src, { ...this.#marked.defaults, async: false, gfm: true });

UziTech avatar Aug 27 '24 14:08 UziTech

We also just added the hooks.provideParser hook in v14.1.0 which should allow you to provide a parser that can return anything as an extension.

This should allow something like:

export const newlineExtension: TokenizerExtension = {
  name: 'newline',
  level: 'block',
  start(src) {
    return src.match(/\n/)?.index;
  },
  tokenizer(src, tokens) {
    const match = src.match(/^[\s]*\n[\s]*\n[\s]*/);
    if (match) {
      return {
        type: 'newline',
        raw: match[0],
      };
    }
    return undefined;
  },
};

const reactParser = {
  provideParser() {
    const renderer = new MarkdownRenderer();
    const parser = new MarkdownParser({ renderer });
    return (tokens) => {
      const components = tokens.length ? parser.parse(tokens) : [];
      return components.length ? <Column>{components}</Column> : <Blank />;
    };
  },
};

export class Markdown {
  #marked = new Marked().use({
    extensions: [newlineExtension],
    hooks: reactParser,
  });

  constructor() {}

  parse(src: string): AnyComponent {
    return this.#marked.parse(src, { async: false, gfm: true });
  }
}

UziTech avatar Aug 27 '24 15:08 UziTech

Wow I will give this a try, thank you!

nrivard avatar Aug 27 '24 15:08 nrivard