rust-code-analysis Implement metrics for Java code

Currently, no metrics have been implemented for Java code.

Nov 16 '20 16:11 Luni-4

As part of fixing this, we can write some docs on how to add a new language.

Nov 16 '20 17:11 marco-c

I would like to pick this up but not too sure what it entails. I have been looking through the implementations and think I understand some of it. I have only been learning Rust for a few weeks so it might take a while but since this has been open for almost a year I imagine it is not urgent. Eventually, I would like to add Go and C# but it appears Java has some implementation already so I thought this would be a good place to start.

I could start by investigating the different elements that need implementing and documenting them, if that documentation does not exist already. This could form the basis of the documentation mentioned.

Oct 12 '21 07:10 dburriss

Hello! Thanks for your help! :)

Well, for now Java has only some utility functions, as you can see in the checker.rs file. I think your strategy is pretty good:

1. Look for what we need to implement for Java or any other missing language you prefer. As an example, I'm showing you the Loc implementation for the Python language: https://github.com/mozilla/rust-code-analysis/blob/master/src/metrics/loc.rs#L184 We have used a trait because the Loc metric can be computed differently according to the chosen language. Instead here we have added a test to check if our implementation is correct.
2. Pick a metric (we can start with Loc) and write a markdown file that explains how to implement that metric for a generic language. This file should also state that tests must be added to verify the implementation.
3. Follow the instructions contained in the markdown file and implement the metric.
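
To make the trait idea in step 1 concrete, here is a minimal sketch of one trait with one implementation per language. Everything in it (`Kind`, `Stats`, `PythonCode`, `JavaCode`, `measure`) is a hypothetical name chosen for illustration; the real trait in loc.rs works on tree-sitter nodes and has a different signature.

```rust
/// Hypothetical node kinds a parser might report (not the real grammar enums).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Statement,
    Comment,
    Other,
}

/// Hypothetical counters, loosely modelled on a LOC-style metric.
#[derive(Debug, Default, PartialEq)]
struct Stats {
    lloc: usize,
    cloc: usize,
}

/// One trait, one implementation per language: each language decides
/// which node kinds contribute to which counter.
trait Loc {
    fn compute(kind: Kind, stats: &mut Stats);
}

struct PythonCode;
struct JavaCode;

impl Loc for PythonCode {
    fn compute(kind: Kind, stats: &mut Stats) {
        match kind {
            Kind::Statement => stats.lloc += 1,
            Kind::Comment => stats.cloc += 1,
            Kind::Other => {}
        }
    }
}

impl Loc for JavaCode {
    // A language with no implementation yet leaves every counter at zero.
    fn compute(_kind: Kind, _stats: &mut Stats) {}
}

/// Walks a flat list of node kinds and applies the language-specific rules.
fn measure<T: Loc>(nodes: &[Kind]) -> Stats {
    let mut stats = Stats::default();
    for &kind in nodes {
        T::compute(kind, &mut stats);
    }
    stats
}
```

With this shape, `measure::<PythonCode>(&nodes)` and `measure::<JavaCode>(&nodes)` share the traversal, and adding a language means filling in one `impl` block plus a test for it.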

Oct 12 '21 08:10 Luni-4

Entirely possible I am missing something here and this all exists but the metrics seem to rely on some sort of language grammar being generated? This is done by adding the necessary tree_sitter lib to rust-code-analysis/tree/master/enums/src and adding it to the necessary language call and macro? This seems implemented for Java so not a big deal but I would like to circle around eventually and tackle Go (seems to be some mentions in the codebase) and C# (nothing).

I imagined it is going to be fairly iterative.

A few docs I can imagine:

High level how-to on language going over the different moving parts that need implementing
Metrics Reference (point to existing docs)
Tutorial style implementation (here is one I did for Farmer https://compositionalit.github.io/farmer/contributing/adding-resources/ )

With the rough sketch of how things work done here, I will write it up so that I am explaining it back to you, to be sure I understand. Then I will implement it and come back to flesh out the docs with more detail as I learn.

Of course with enough help I am sure we could document it all up front, it just was not how I imagined doing it when I started poking around.

Something I probably need to look into is more on how Rust generates its docs.

Is that far off what you imagined?
P.S. I am going to be AFK for 2 weeks so will pick this up when I am back online.

Oct 13 '21 20:10 dburriss

This sounds fine, steps 2 and 3 @Luni-4 was mentioning should happen at the same time (while implementing the metric, you write down what you are doing and at the end you reorganize what you wrote in nice docs).

@Luni-4 can you explain to @dburriss how to add a new tree-sitter grammar?

Oct 14 '21 07:10 marco-c

@marco-c Sure!

@dburriss If you want to add a new tree-sitter-grammar to rust-code-analysis you need to:

Verify either upstream or in some other repository whether the grammar is available for the current tree-sitter version used in this repository (in our case 0.19)

For the enums crate part:

Add your grammar crate here
Add your language here
Add your language below this line
Run the recreate-grammars.sh script to recreate all grammars

For the rust-code-analysis part:

Add your grammar here
Add your grammar to the mod here
Define your grammar here

At this point, you should be able to compile rust-code-analysis with the grammar. After that you can implement the metrics for that language using the steps provided above.

I completely agree with your docs, I think they could be pretty helpful for the new contributors, especially the tutorial style. We can add them into our official docs that are all defined here

If you need some help, feel free to ask anytime

Oct 14 '21 12:10 Luni-4

I wanted to check some assumptions:

for (int i = 0; i < 100;  i++) {
  System.out.println(i);
}

For LLOC, the value should be 4? 3 for the statements inside the for and 1 for the println? I think that matches the Python behavior

Another check?

class HelloWorldApp {
  public static void main(String[] args) {
    System.out.println("Hello World!"); // Display the string.
  }
}

How many LLOC does the above have? Put another way, do package NS, class declaration, and members count toward LLOC?

Currently, mine is counting 2 for the above but I think it should only be the 1 println?

P.S. I cannot set debug points inside the compute when running the tests anymore. I swear I could before yesterday. Unfortunately, pulling master required me to update rust version, and at the same time I managed to lose my IDE (trying CLion) setup when creating the 2 new branches. Did anyone experience anything like that with the new update?

Oct 26 '21 10:10 dburriss

> I wanted to check some assumptions:
>
> for (int i = 0; i < 100; i++) {
>   System.out.println(i);
> }
>
> For LLOC, the value should be 4? 3 for the statements inside the for and 1 for the println? I think that matches the Python behavior

Exactly, in this case LLOC is 4 due to the statements you have described.

> class HelloWorldApp {
>   public static void main(String[] args) {
>     System.out.println("Hello World!"); // Display the string.
>   }
> }
>
> How many LLOC does the above have? Put another way, do package NS, class declaration, and members count toward LLOC?
>
> Currently, mine is counting 2 for the above but I think it should only be the 1 println?

First of all, we should ask: what are statements in Java?

I found this article from Oracle. In the case above, only the println should be counted as statement, so LLOC is 1.
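
The two answers above can be sketched as a toy counter. This is only an illustration under stated assumptions: the `Node` variants are hypothetical and are not tree-sitter node kinds, and the rule "a for header contributes 3" simply encodes the count agreed in this thread, not the actual implementation.

```rust
/// Toy syntax tree; the variants are illustrative, not real grammar nodes.
enum Node {
    /// Class or method declaration: a container, not a statement.
    Declaration(Vec<Node>),
    /// A `for` loop: three header parts (init, condition, update) plus a body.
    For(Vec<Node>),
    /// An expression statement such as a `println` call.
    ExprStatement,
}

/// Counts logical lines: declarations add nothing themselves,
/// a `for` header adds 3, and every expression statement adds 1.
fn lloc(node: &Node) -> usize {
    match node {
        Node::Declaration(members) => members.iter().map(lloc).sum::<usize>(),
        Node::For(body) => 3 + body.iter().map(lloc).sum::<usize>(),
        Node::ExprStatement => 1,
    }
}

/// `for (int i = 0; i < 100; i++) { System.out.println(i); }`
fn for_loop_example() -> Node {
    Node::For(vec![Node::ExprStatement])
}

/// `class HelloWorldApp { public static void main(...) { println(...); } }`
fn hello_world_example() -> Node {
    Node::Declaration(vec![Node::Declaration(vec![Node::ExprStatement])])
}
```

Under these assumptions `lloc(&for_loop_example())` gives 4 and `lloc(&hello_world_example())` gives 1, matching the values discussed here.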

> P.S. I cannot set debug points inside the compute when running the tests anymore. I swear I could before yesterday. Unfortunately, pulling master required me to update rust version, and at the same time I managed to lose my IDE (trying CLion) setup when creating the 2 new branches. Did anyone experience anything like that with the new update?

Yep, we have updated master to the new Rust version such that the Rust 2021 edition, which adds interesting improvements, is enabled. Hmm, I don't have much experience with that IDE, could you try to reset your configuration in some way?

Oct 27 '21 07:10 Luni-4

Hi. Working on this again. I am having some trouble with something but want to check it is something I should be solving.

if (x == 1)

Is one logical line.

if ( (x == 1) || (x==2) )

I had assumed this was 2 logical lines but reading the link from @Luni-4 I am thinking I may have been wrong. Although it never talks about the semantics of a compound predicate, my understanding now is that it is a single expression and so a single logical line. Do you agree?

Feb 07 '22 06:02 dburriss

I believe it is to be considered a single lloc. I guess you can confirm with a C++ snippet, which should be exactly the same in this case.
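
A small sketch of why the compound predicate stays at one logical line: if LLOC counts statements, the shape of the condition expression never enters the count. The `Expr` and `Stmt` types and both functions are hypothetical, written only for this illustration.

```rust
/// Toy expression tree (illustrative only).
enum Expr {
    /// A comparison such as `x == 1`.
    Compare,
    /// A short-circuit `||` joining two sub-expressions.
    Or(Box<Expr>, Box<Expr>),
}

/// Toy statement tree (illustrative only).
enum Stmt {
    If { condition: Expr, body: Vec<Stmt> },
    Expression,
}

/// Counts `||` operators in an expression. Operator counts like this are
/// the kind of detail that could matter to other metrics, but not to LLOC.
fn boolean_operators(expr: &Expr) -> usize {
    match expr {
        Expr::Compare => 0,
        Expr::Or(left, right) => 1 + boolean_operators(left) + boolean_operators(right),
    }
}

/// Counts logical lines: one per statement, whatever its condition looks like.
fn lloc(stmt: &Stmt) -> usize {
    match stmt {
        Stmt::If { body, .. } => 1 + body.iter().map(lloc).sum::<usize>(),
        Stmt::Expression => 1,
    }
}
```

Here `if (x == 1)` and `if ((x == 1) || (x == 2))` both count as 1 for `lloc`, while `boolean_operators` sees 0 and 1 operators respectively.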

Feb 07 '22 11:02 marco-c

#786 is adding support for cyclomatic complexity calculation for Java.

Feb 10 '22 15:02 marco-c

#694 added loc metrics.

Mar 23 '22 09:03 marco-c

#811 adds Halstead metrics

Apr 12 '22 06:04 dburriss

#823 Adds nom metrics

Apr 14 '22 15:04 dburriss

#822 Adds exit metrics

Apr 14 '22 15:04 dburriss

@marco-c

[x] Cognitive: PR #850
[x] Cyclomatic complexity: merged
[x] Exit: in PR #822
[x] Halstead: in PR #811, with open queries
[x] Loc: merged
[x] MI: I don't think an implementation is necessary. Automatic. What about a test?
[x] Nargs: I don't think an implementation is necessary. Automatic. Should add a test.
[x] Nom: in PR #823, merged
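
One reading of "Automatic" for MI is that the Maintainability Index is derived from other metrics, so it needs no language-specific code once those exist. The sketch below assumes the classic formula, MI = 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(SLOC); the function name and signature are hypothetical, and the variants this project actually computes may differ.

```rust
/// Classic Maintainability Index, computed purely from other metrics:
/// Halstead volume (V), cyclomatic complexity (G), and source lines (SLOC).
/// No language-specific logic appears here, which is why a per-language
/// implementation would not be needed under this assumption.
fn mi_original(halstead_volume: f64, cyclomatic: f64, sloc: f64) -> f64 {
    171.0 - 5.2 * halstead_volume.ln() - 0.23 * cyclomatic - 16.2 * sloc.ln()
}
```

A test for a new language would then mostly check that the inputs (Halstead, cyclomatic, Loc) are wired up, since the formula itself is shared.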

Apr 14 '22 15:04 dburriss

@dburriss @Luni-4 anything else left before closing this?

Jun 29 '23 15:06 marco-c

@marco-c It's not directly related to the metric calculation but the Halstead ops check is still giving unexpected values #849

Jun 29 '23 16:06 dburriss

OK, let's close this and consider that as a separate follow-up bug that we have to fix.

Jun 29 '23 20:06 marco-c