A tree-sitter backend

Open wenkokke opened this issue 4 years ago • 7 comments

If we create a tree-sitter backend, we could get basic editor support for all languages using BNFC with almost no work.

Tree-sitter grammars are the de facto way of implementing highlighting in Atom, and there are packages which use tree-sitter grammars to provide highlighting in VSCode, neovim, and emacs. There are also bindings to use a tree-sitter grammar from Java provided by JetBrains, which would help with integration into the JetBrains editor ecosystem.

There are bindings for tree-sitter in various languages, including Haskell, JavaScript (both Node.js and Wasm), OCaml, Python, Ruby, and Rust.

Compiling a BNF grammar to tree-sitter should be fairly straightforward, and the only major hurdle I foresee would be to implement support for layouts, which would require some custom C code. For an example of how to implement this, one could look at the grammars for Agda, Haskell, Python, or any other language with layout rules.
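
To make the layout hurdle concrete, here is a rough Python sketch of the kind of bookkeeping such a scanner performs: it keeps a stack of indentation columns and turns indentation changes into explicit INDENT / DEDENT / NEWLINE markers. This is only an illustration under simplifying assumptions. The function name `layout_tokens` and the marker names are made up for this example, tokens are split on whitespace, dedents to a column that was never opened are not reported, and layout start and stop keywords are not handled at all. It is not tree-sitter's external scanner API, and it is not BNFC's layout algorithm.

```python
def layout_tokens(source: str) -> list[str]:
    """Turn indentation into explicit INDENT / DEDENT / NEWLINE markers.

    Hypothetical helper for illustration only; a real tree-sitter
    external scanner is written in C and works on characters, not lines.
    """
    tokens: list[str] = []
    stack = [0]  # indentation columns of the blocks that are currently open
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect layout
        column = len(line) - len(line.lstrip(" "))
        if column > stack[-1]:
            # deeper indentation opens a new block
            stack.append(column)
            tokens.append("INDENT")
        else:
            # shallower indentation closes every block that is deeper
            while column < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            # a following line at block level starts a new item
            if tokens:
                tokens.append("NEWLINE")
        tokens.extend(line.split())
    # close the blocks that are still open at the end of the input
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


if __name__ == "__main__":
    print(layout_tokens("a\n  b\n  c\nd\n"))
```

The grammar itself can then treat these markers like ordinary terminals, which is roughly the division of labour between a generated grammar and a hand-written scanner.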

wenkokke avatar Oct 19 '21 12:10 wenkokke

This would solve #193, by virtue of the fact that there are tree-sitter bindings for Rust.

wenkokke avatar Oct 19 '21 12:10 wenkokke

Perhaps @banacorn could offer their advice, since they wrote the parser for Agda? It seems that their scanner.cc could be used as-is for top-level layout rules, with only minor adjustments needed to support layout start and stop keywords.

wenkokke avatar Oct 19 '21 12:10 wenkokke

If I understand this correctly, BNFC would have to produce either a grammar.js file to be processed by the tree-sitter CLI, or directly a .json file.
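
As a rough idea of what emitting the .json directly could involve, here is a small Python sketch that maps a toy BNF-style rule table to the node shapes used in tree-sitter's grammar JSON (`SYMBOL`, `STRING`, `PATTERN`, `SEQ`, `CHOICE`). The input encoding, the helper names, and the toy grammar are all assumptions made up for this example. Empty productions, precedence, and layout are not handled, a real grammar file carries further fields (for example `extras`) that are omitted here, and the output has not been run through the tree-sitter CLI.

```python
import json

# Hypothetical input encoding: each nonterminal maps to a list of
# alternatives, and each alternative is a list of (kind, value) items:
#   ("t", text)     literal terminal
#   ("re", regex)   token given by a regular expression
#   ("n", name)     reference to another nonterminal


def item_to_node(item):
    """Translate one right-hand-side item into a grammar JSON node."""
    kind, value = item
    if kind == "t":
        return {"type": "STRING", "value": value}
    if kind == "re":
        return {"type": "PATTERN", "value": value}
    return {"type": "SYMBOL", "name": value}


def alternative_to_node(items):
    """A single item stays as it is; several items become a SEQ."""
    nodes = [item_to_node(i) for i in items]
    return nodes[0] if len(nodes) == 1 else {"type": "SEQ", "members": nodes}


def grammar_json(name, rules):
    """Build the grammar dictionary; the first rule acts as the start rule."""
    out = {}
    for lhs, alternatives in rules.items():
        nodes = [alternative_to_node(a) for a in alternatives]
        out[lhs] = nodes[0] if len(nodes) == 1 else {"type": "CHOICE", "members": nodes}
    return {"name": name, "rules": out}


# Toy grammar: exp ::= exp "+" term | term ;  term ::= [0-9]+
TOY = {
    "exp": [[("n", "exp"), ("t", "+"), ("n", "term")], [("n", "term")]],
    "term": [[("re", "[0-9]+")]],
}

if __name__ == "__main__":
    print(json.dumps(grammar_json("toy", TOY), indent=2))
```

Going through grammar.js instead would mean printing the same structure as calls to the JavaScript DSL, so the choice mostly affects the pretty-printer rather than the translation.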

I wonder whether this could be factored into BNFC -> (E)BNF -> tree-sitter. There is:

  • (archived) https://github.com/eatkins/tree-sitter-ebnf-generator
  • anything else?

I wonder how well the Haskell bindings are maintained:

  • [ ] https://github.com/tree-sitter/haskell-tree-sitter/issues/298

Btw, there is apparently a tree-sitter grammar for LBNF: https://github.com/MortenSchou/tree-sitter-lbnf. Since BNFC is bootstrapped, it could then create this grammar itself.

andreasabel avatar Oct 19 '21 17:10 andreasabel

I'm not sure if there's any advice I can offer or how much help I can be 👀 It'd be nice not having to translate those tree-sitter grammars by hand anyway!

banacorn avatar Oct 20 '21 14:10 banacorn

> I wonder whether this could be factored into BNFC -> (E)BNF -> tree-sitter

Wouldn’t this lead to problems supporting layouts?

wenkokke avatar Oct 20 '21 21:10 wenkokke

> > I wonder whether this could be factored into BNFC -> (E)BNF -> tree-sitter
>
> Wouldn’t this lead to problems supporting layouts?

Yes, this wouldn't support layout, so maybe it is not worth looking into, unless BNFC/-layout -> (E)BNF is of interest in its own right.

andreasabel avatar Oct 21 '21 10:10 andreasabel

Hi, I have implemented a preliminary tree-sitter backend in #471. If any of the correspondents on this thread are interested, please help me test it out, thank you!

chaserhkj avatar Nov 28 '23 20:11 chaserhkj