PGA downloading a higher number of siva files than expected
I want to download all siva files for "Jupyter Notebook" repositories in PGA.
To know how many they are, I ran:
$ pga list --lang "Jupyter Notebook" -f csv
After examining the CSV file, I found 2,606 repos and 3,767 siva files corresponding to them.
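A rough way to double-check such a count from the shell is to pull the siva file names out of the CSV and count the distinct ones. This is only a sketch: it assumes siva files are named with a 40-character hexadecimal hash followed by `.siva`, and `count_siva_files` is a hypothetical helper name, not part of pga.

```shell
# Count distinct siva file names found anywhere in the text read from stdin.
# Assumption: each name is a 40-character lowercase hex hash plus ".siva".
count_siva_files() {
    grep -oE '[0-9a-f]{40}\.siva' | sort -u | wc -l
}

# Usage (prints the number of distinct siva files in the listing):
#   pga list --lang "Jupyter Notebook" -f csv | count_siva_files
```

Matching the file names directly avoids having to parse quoted CSV fields, and `sort -u` keeps a siva file shared by several repos from being counted twice.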
To download the siva files, I ran
$ pga get --lang "Jupyter Notebook" -v
And the response that I got was:
DEBU[0004] local copy is outdated or non existent
1 / 6349 [>----------------------------------------------------------] 0.02% 40m59s
This means it was downloading 6,349 files, and I have no idea why. Can somebody help me with this?
I was investigating this today and found out that there are 3,295 siva files for the repo https://github.com/google/skia-buildbot.
So it was my mistake: pga get IS downloading the correct number of siva files. However, I'm intrigued by this extreme number of siva files for one repo. Is it normal?
@gomesfernanda When you clone the repository with the standard refspecs, you get something like this:
$ git clone git@github.com:google/skia-buildbot.git
Cloning into 'skia-buildbot'...
remote: Enumerating objects: 3543, done.
remote: Counting objects: 100% (3543/3543), done.
remote: Compressing objects: 100% (2652/2652), done.
remote: Total 108260 (delta 1931), reused 1807 (delta 598), pack-reused 104717
Receiving objects: 100% (108260/108260), 51.61 MiB | 398.00 KiB/s, done.
Resolving deltas: 100% (77333/77333), done.
It contains just a few branches and only 4 root commits:
$ git rev-list --all --remotes --max-parents=0 | wc -l
4
But if you fetch with the same refspec that Borges used to fetch that repository:
$ git checkout origin/master
$ git fetch origin +refs/*:refs/*
remote: Enumerating objects: 60611, done.
remote: Counting objects: 100% (60611/60611), done.
remote: Compressing objects: 100% (13044/13044), done.
receiving objects: 53% (82299/155281), 60.64 MiB | 1.14 MiB/s
[...]
$ git rev-list --all --remotes --max-parents=0 | wc -l
5734
That means the repository is currently spread across 5,734 different siva files, one per root commit.
This is because they are using Gerrit. Gerrit generates a new orphan branch for each "pull request".
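To see where the extra refs come from, one option is to group the ref names by their top-level namespace. The sketch below assumes the Gerrit refs show up under `refs/changes/` after the `+refs/*:refs/*` fetch, which is the usual Gerrit convention but has not been checked against this repository. `count_ref_namespaces` is a hypothetical helper name.

```shell
# Count refs per top-level namespace (heads, tags, changes, ...).
# Reads one full ref name per line on stdin, e.g. refs/heads/master.
count_ref_namespaces() {
    cut -d/ -f2 | sort | uniq -c | sort -rn
}

# Usage, inside the clone fetched with +refs/*:refs/*:
#   git for-each-ref --format='%(refname)' | count_ref_namespaces
```

If the assumption holds, the `changes` namespace should dominate the output, with only a handful of entries under `heads` and `tags`.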