Add support for C-Sharp
I imagine we would use tree-sitter-c-sharp.
Since the Java metrics are growing I would like to add C# too. I imagine the implementations should be close to identical.
Good idea!
So folding in lessons learned from #694 I am starting with some research and definitions. Thankfully C# has a fairly clear definition of a statement as well as a comprehensive list of the different types:
A cursory look through seems to indicate, unsurprisingly, that it is much the same as Java. I will raise it if I find any weird difference.
Should I create an issue per metric to track this or link PRs here?
So bumping into my first challenge here.
On this repo currently tree-sitter version dependency istree-sitter = ">= 0.19, < 0.21".
The source Cargo.toml for tree-sitter-c-sharp is tree-sitter = "0.19".
On crates.io the tree-sitter version is 0.17.
I guess this is the reason I get:
expected struct
tree_sitter::Language, found a different structtree_sitter::Language
when trying to add tree-sitter-c-sharp as a dependency?
I assume the fix here is to ask @dcreager, the owner of the crate, if it can be updated?
Just published tree-sitter-c-sharp 0.19.1, which should unblock you if you're using tree-sitter 0.19.
If you're using tree-sitter 0.20, then I think you'll end up getting the same error, until https://github.com/tree-sitter/tree-sitter-c-sharp/pull/219 (which I just opened) is merged. Once that gets a :+1: review I'll merge it and cut a new release of tree-sitter-c-sharp with the more inclusive tree-sitter dependency.
Thanks so much @dcreager 🙏 I'll test it out tomorrow but I think it should work.
Should I create an issue per metric to track this or link PRs here?
Let's just use this one, maybe you can edit the first comment to add a list of metrics with checkmarks.
After adding tree_sitter_c_sharp to the nums crate I get the follwong compile error:
Compiling enums v0.0.1 (C:\Users\devon\GitHub\rust-code-analysis\enums)
error[E0308]: mismatched types
--> src\macros.rs:22:31
|
20 | pub fn get_language(lang: &Lang) -> Language {
| -------- expected `tree_sitter::Language` because of return type
21 | match lang {
22 | Lang::Java => tree_sitter_java::language(),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `tree_sitter::Language`, found a different struct `tree_sitter::Language`
|
::: src\languages.rs:3:1
|
3 | / mk_langs!(
4 | | // 1) Name for enum
5 | | // 2) tree-sitter function to call to get a Language
6 | | (Java, tree_sitter_java),
... |
15 | | (Javascript, tree_sitter_javascript)
16 | | );
| |_- in this macro invocation
|
= note: perhaps two different versions of crate `tree_sitter` are being used?
= note: this error originates in the macro `mk_get_language` (in Nightly builds, run with -Z macro-backtrace for more info)
I am not 100% sure what is going on here. Running cargo lock -d, tree-sitter version look like so:
- tree-sitter 0.19.3
- cc 1.0.72
- regex 1.4.6
- tree-sitter 0.20.2
- cc 1.0.72
- regex 1.4.6
- tree-sitter-ccomment 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-cpp 0.20.0
- cc 1.0.72
- tree-sitter 0.20.2
- tree-sitter-java 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-mozcpp 0.20.1
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-cpp 0.20.0
- tree-sitter-mozjs 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
- tree-sitter-preproc 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-python 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-rust 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-typescript 0.20.0
- cc 1.0.72
- tree-sitter 0.19.3
Then when I add tree-sitter-c-sharp = "0.19.1" to enums/Cargo.toml, the dependencies look like so:
- tree-sitter 0.19.3
- cc 1.0.72
- regex 1.4.6
- tree-sitter 0.20.2
- cc 1.0.72
- regex 1.4.6
- tree-sitter-c-sharp 0.19.1
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-ccomment 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-cpp 0.20.0
- cc 1.0.72
- tree-sitter 0.20.2
- tree-sitter-java 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-mozcpp 0.20.1
- cc 1.0.72
- tree-sitter 0.20.2
- tree-sitter-cpp 0.20.0
- tree-sitter-mozjs 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
- tree-sitter-preproc 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-python 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-rust 0.19.0
- cc 1.0.72
- tree-sitter 0.19.3
- tree-sitter-typescript 0.20.0
- cc 1.0.72
- tree-sitter 0.20.2
As you can see tree-sitter-mozcpp and tree-sitter-typescript have been bumped to using tree-sitter 0.20.2 instead of 0.19.3. Interestingly, tree-sitter-cpp was already on 0.20.2 but tree-sitter-mozcpp was not (but now it is).
So my understanding (with very little Rust experience) is that the enum crate is using the higher tree-sitter and then any of the lower ones are returning the wrong type. In the example above it is java but I confirmed this by moving others up to the top of the match. If I try constrain tree-sitter to 0.19 then typescript and cpp return the wrong type.
I actually ran into this issue trying to consume ``rust-code-analysisin another crate and ended up searching for a version of the crate where thetree-sitter` versions aligned.
So @marco-c I am going to need some help on how to resolve this:
- Why did this work before? It seems to be purely due to when the lock file was generated? Or the lock file was edited manually? I confirmed this by deleting the lock file on ~HEAD and indeed, building throws the same error.
- What is the alternative here? With my limited knowledge I guess it would be keeping the tree-sitter versions in sync and only upgrading in lock-step?
I see there is a ticket for upgrading tree-sitter to 0.20 #674 so maybe a known thing that this should be in lockstep?
Sorry about the wall of text if this is all know stuff but it helped me attempt to reason through it.
To try confirm the above I downgraded typescript and mozcpp and pinned the tree-sitter in enum to 0.19.3 and the enum project compiles again.
I think we should try to have all grammars use the same tree-sitter version, otherwise we have too many complications. We should either update all to 0.20, or revert all to 0.19. CC @Luni-4