In a monorepo, `dvc exp remove -A` should remove only experiments within the sub-dir project's scope
Summary / Background
I'm testing DVC Experiments in a monorepo scenario and encountered unexpected behavior from dvc exp remove.
Within a monorepo, say we have:
- /
- project_a
- project_b
- root_content
When working inside project_a, I list experiments
dvc exp list
and get 2 experiments I ran for project_a
main:
5a4cba8 [paled-flus]
81c2de9 [tippy-scut]
Then, I want to remove all experiments for project_a with
dvc exp remove -A
DVC removes all experiments for all projects in the repo
Removed experiments: 'finer-limb', 'perdu-vase', 'paled-flus', 'heady-mate', 'older-tipi', 'dusty-tang', 'alpha-gyms', 'downy-kiwi', 'moved-bomb', 'butch-iglu', 'silly-fibs', 'olive-roam', 'bosom-curb', 'unlet-soja', 'tippy-scut', 'sassy-dawn', 'braky-baby', 'coaly-kill', 'moldy-moot', 'pappy-gest', 'split-dogs', 'elite-bort', 'shock-dado', 'boozy-bade', 'pucka-thaw', 'mossy-jird', 'splay-tosh', 'famed-afro', 'finer-torc', 'bijou-yolk', 'fetid-mope', 'tangy-trio', 'legal-ludo', 'cagey-sech', 'addle-chic', 'eerie-barb', 'noisy-rods', 'sarky-joey', 'older-jest', 'umber-tote', 'sable-moit', 'aging-doge', 'puffy-esse' and 'power-harl'
Expected behavior
dvc exp remove -A should remove only experiments within project_a scope
Thanks for the report!
The DVC CLI does not do any monorepo slicing at the moment; that is limited to Studio. You only saw 2 experiments in dvc exp list because you did not include -A for that command. All dvc exp commands look at the entire repo.
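Since -A is repo-wide, one way to limit removal in the meantime is to pass experiment names explicitly instead of -A. Below is a minimal Python sketch that only builds such a command line. The helper name build_remove_command is hypothetical, and the two names are the ones from the dvc exp list output above. Whether those names belong only to project_a is something to verify first.

```python
import shlex


def build_remove_command(names):
    """Build a `dvc exp remove` argv that targets only the given experiments.

    Hypothetical helper for illustration; it does not run DVC.
    """
    if not names:
        # Refuse an empty list so we never emit a bare `dvc exp remove`.
        raise ValueError("at least one experiment name is required")
    return ["dvc", "exp", "remove", *names]


# Names taken from the `dvc exp list` output in the report above.
cmd = build_remove_command(["paled-flus", "tippy-scut"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from inside the project directory.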
Looks like https://github.com/iterative/dvc/issues/10244#issuecomment-1899063451 is related.
I didn't notice at first that these are entirely different dvc projects with project_a/.dvc and project_b/.dvc. @pmrowla Thoughts on how dvc exp commands should handle this scenario? See the link above, where it looks like we end up pushing two copies of each experiment from each project.
I'm not sure we are actually pushing two copies of the experiment. I'm guessing that there are actually two separate experiments with the same name being generated.
Git refs (and exps) apply to the entire repository. If we actually need to separate exps by subrepo, then we need to extend the exp ref namespace to differentiate them in the dvc init --subdir case,
so the exp refs would go somewhere like
refs/exps/<git-sha>/subrepo/path/to/subdir/exp-name
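To make the proposed layout concrete, here is a rough Python sketch of how such namespaced ref names could be built and filtered by subdir. This is not DVC's implementation. The helper names and the SHA abc1234 are hypothetical, and it assumes the segment after the SHA is the subdir's path relative to the Git root.

```python
from pathlib import PurePosixPath

EXPS_NAMESPACE = "refs/exps"


def subrepo_exp_ref(base_sha, subdir, exp_name):
    """Return a ref name shaped like refs/exps/<git-sha>/<subdir>/<exp-name>.

    Hypothetical helper; the layout follows the proposal above.
    """
    # Normalize the subdir: drop leading/trailing slashes and a bare ".".
    subdir = PurePosixPath(subdir).as_posix().strip("/")
    parts = [EXPS_NAMESPACE, base_sha]
    if subdir and subdir != ".":
        parts.append(subdir)
    parts.append(exp_name)
    return "/".join(parts)


def refs_in_subrepo(refs, base_sha, subdir):
    """Keep only the refs that live under the given subdir's namespace."""
    # An empty exp name yields a prefix ending in "/", so "project_a"
    # does not accidentally match "project_ab".
    prefix = subrepo_exp_ref(base_sha, subdir, "")
    return [ref for ref in refs if ref.startswith(prefix)]


all_refs = [
    subrepo_exp_ref("abc1234", "project_a", "paled-flus"),
    subrepo_exp_ref("abc1234", "project_a/", "tippy-scut"),
    subrepo_exp_ref("abc1234", "project_b", "finer-limb"),
    subrepo_exp_ref("abc1234", "project_ab", "perdu-vase"),
]
project_a_refs = refs_in_subrepo(all_refs, "abc1234", "project_a")
```

With refs laid out this way, a scoped dvc exp remove -A could restrict itself to the refs under the current subdir's prefix instead of everything under refs/exps.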