
[EPIC] [MVP] Improvements to Thoth advises output

Open · mayaCostantini opened this issue 3 years ago · 5 comments

Problem statement

As a Python Developer, I would like to have concise information about the quality of my software stack and all its transitive dependencies, so that I get some absolute metrics such as:

  • "95% of my dependencies are maintained with a dependency update tool (i.e. dependabot, etc)"
  • "45% of my dependencies have 3 or more maintainers"
  • ...

These metrics would be aggregated and compared to metrics for packages present in Thoth's database to provide a global quality metric for a given software stack, optionally for a specific criterion (maintenance, code quality...), in the form of a percentage or a letter score (A, B, C...).

We consider metrics derived from direct and transitive dependencies to be equally important, so there will not be any difference in the weight given to information carried by the two types of dependencies.
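To make these metrics concrete, here is a minimal sketch (purely illustrative, not part of the proposal) of how such percentages could be computed over a resolved dependency set, direct and transitive alike; the `DependencyInfo` structure and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DependencyInfo:
    """Hypothetical per-package metadata aggregated by Thoth (illustrative)."""
    name: str
    maintainer_count: int
    uses_update_tool: bool  # e.g., Dependabot configured on the upstream repo


def stack_metrics(deps: list[DependencyInfo]) -> dict[str, float]:
    """Absolute stack-level percentages; direct and transitive dependencies
    are weighted equally, as stated above."""
    total = len(deps)
    return {
        "pct_with_update_tool": 100 * sum(d.uses_update_tool for d in deps) / total,
        "pct_three_plus_maintainers": 100 * sum(d.maintainer_count >= 3 for d in deps) / total,
    }


deps = [
    DependencyInfo("flask", 5, True),
    DependencyInfo("click", 3, True),
    DependencyInfo("itsdangerous", 2, False),
    DependencyInfo("werkzeug", 4, True),
]
print(stack_metrics(deps))
# {'pct_with_update_tool': 75.0, 'pct_three_plus_maintainers': 75.0}
```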

Proposal description

  1. Create an ADR regarding the implementation of the service as a bot, e.g., a GitHub App, a GitHub Action, ...
  2. PoC: Implement an experimental thamos flag on the advise command to give users insights about the maintenance of their packages
  • [x] https://github.com/thoth-station/thamos/issues/1149
  • [ ] https://github.com/thoth-station/thamos/issues/1148
  3. Compute metrics for packages present in Thoth's database that will serve as a basis for a global software stack quality score

Taking the example of OSSF Scorecards, we already aggregate this information in prescriptions, which are used directly by the adviser. However, the aggregation logic present in prescriptions-refresh-job only updates prescriptions for packages already present in the repository. We could either aggregate Scorecards data for more packages using the OSSF BigQuery dataset, or have our own tool that computes Scorecards metrics on a new package release, which could be integrated directly into package-update-job, for instance. This would most likely consist of a simple script querying the GitHub API and computing the metrics on the project's last release commit (a sketch of the release-to-commit step follows this list).

  • [ ] https://github.com/thoth-station/core/issues/440
  • [ ] https://github.com/thoth-station/storages/issues/2668
  4. Schedule a new job to compute metrics on aggregated data
  • [ ] Implement the job to be run after each package-update-job or on a regular schedule
  • [ ] Compute percentiles for each metric that will serve as a basis to score a software stack
  5. Implement the global scoring logic

For example, if a software stack is in the 95th percentile of packages with the best development practices (CI/CD, testing...), score it as "A" for this category. Compute a global score from the different category scores (a sketch of one possible mapping follows the checklist below).

  • [x] https://github.com/thoth-station/core/issues/442
  • [ ] Implement this logic either on the adviser side, by performing a database lookup when an advise is requested and integrating these metrics into the advise report, or on each endpoint separately if we wish to keep the information carried by metrics separate from advise reports.
  • [ ] Make the scoring logic publicly accessible via justification URLs provided with each score
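As a rough illustration of the "simple script" mentioned in step 3, the following sketch resolves a project's latest GitHub release to a commit SHA using standard GitHub REST API endpoints (`/releases/latest` and `/git/ref/tags/{tag}`). The Scorecards computation itself is left out, since running the checks against an arbitrary commit is exactly the upstream capability discussed in the comments below; the repository queried is just an example.

```python
import requests

GITHUB_API = "https://api.github.com"


def latest_release_commit(owner: str, repo: str, token: str | None = None) -> str:
    """Resolve a repository's latest release tag to its commit SHA."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"

    # 1. Find the latest published release and its tag name.
    release = requests.get(f"{GITHUB_API}/repos/{owner}/{repo}/releases/latest", headers=headers)
    release.raise_for_status()
    tag = release.json()["tag_name"]

    # 2. Resolve the tag reference to a git object.
    ref = requests.get(f"{GITHUB_API}/repos/{owner}/{repo}/git/ref/tags/{tag}", headers=headers)
    ref.raise_for_status()
    obj = ref.json()["object"]

    # Lightweight tags point directly at a commit; annotated tags point at a
    # tag object that has to be dereferenced once more.
    if obj["type"] == "commit":
        return obj["sha"]
    tag_obj = requests.get(obj["url"], headers=headers)
    tag_obj.raise_for_status()
    return tag_obj.json()["object"]["sha"]


print(latest_release_commit("thoth-station", "thamos"))
```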
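And for steps 4 and 5, a minimal sketch of one possible percentile-to-grade mapping; the grade boundaries, category names, and the plain mean used for the global score are illustrative assumptions, not a decided design.

```python
# Hypothetical grade boundaries: a stack at or above the p-th percentile of
# packages in Thoth's database gets the corresponding letter (illustrative).
GRADE_BOUNDARIES = [(95, "A"), (75, "B"), (50, "C"), (25, "D"), (0, "E")]


def grade(percentile: float) -> str:
    for threshold, letter in GRADE_BOUNDARIES:
        if percentile >= threshold:
            return letter
    return "E"


def score_stack(category_percentiles: dict[str, float]) -> dict[str, str]:
    """Score each category, plus a global grade from the mean percentile."""
    scores = {category: grade(p) for category, p in category_percentiles.items()}
    scores["global"] = grade(sum(category_percentiles.values()) / len(category_percentiles))
    return scores


print(score_stack({"maintenance": 96.0, "code_quality": 80.0, "ci_cd": 55.0}))
# {'maintenance': 'A', 'code_quality': 'B', 'ci_cd': 'C', 'global': 'B'}
```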

Additional context

Actionable items

If implemented, these improvements will most likely be a way for maintainers of a project to show their users that they use a trusted software stack. AFAICS, this would not provide any actionable feedback to developers about their dependencies.

Acceptance Criteria

To be defined.

mayaCostantini · Jul 27 '22 10:07

@mayaCostantini: This issue is currently awaiting triage. If a refinement session determines this is a relevant issue, it will accept the issue by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sesheta · Jul 27 '22 10:07

/sig user-experience
/priority important-soon

mayaCostantini · Jul 27 '22 10:07

One of the requirements for computing software stack quality scores based on OSSF Scorecards would be to have Scorecards data linked to each project's latest release instead of the repository's head commit SHA. This feature request has already been proposed on the scorecards project side.

What about helping them implement this feature and improving the scorecards cronjob directly, instead of computing this data on our side?

cc @goern

mayaCostantini · Aug 01 '22 14:08

This sounds reasonable.

Nevertheless, we would use the data via BigQuery?

goern · Aug 01 '22 14:08

> This sounds reasonable.
>
> Nevertheless, we would use the data via BigQuery?

Yes, but the information will already be computed in the dataset, and we will not need to associate the head commit SHA with the release ourselves.

mayaCostantini · Aug 01 '22 14:08
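To make the BigQuery option discussed above concrete, here is a minimal sketch using the google-cloud-bigquery client against the public OSSF dataset. The table name (`openssf.scorecardcron.scorecard-v2_latest`) and the columns referenced follow the OSSF Scorecards documentation at the time of writing but should be double-checked; running the query also requires GCP credentials and a project to bill against.

```python
from google.cloud import bigquery

client = bigquery.Client()  # picks up GCP credentials and a default project

# Latest Scorecards result for a single repository; schema assumed from the
# published scorecard-v2 dataset and worth verifying.
query = """
SELECT repo.name AS name, score, date
FROM `openssf.scorecardcron.scorecard-v2_latest`
WHERE repo.name = @repo
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("repo", "STRING", "github.com/psf/requests"),
    ]
)
for row in client.query(query, job_config=job_config).result():
    print(row.name, row.score, row.date)
```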