[FEATURE] A way to either automatically accept License terms for relevant datasets, or, a one liner that allows one to accept the License of multiple datasets
🚨🚨 Feature Request
- [ ] Related to an existing Issue
- [x ] A new implementation (Improvement, Extension)
Is your feature request related to a problem?
I noticed that when using datasets that require me to accept a license, I have to accept the same license multiple times for each one of the dataset's sets. For example, I have to separately accept imagenet-train, imagenet-val and imagenet-test terms before I can use them. Furthermore, to help our friends running things on headless machines one should be able to run a CLI command locally, querying the datasets that will be used in downstream experiments, such that they can agree to the terms before they even begin launching their experiments on their remote instances.
If your feature will improve HUB
Reduce redundant requirements to accept terms multiple times for the same dataset. Also, to reduce friction when people want to accept terms for multiple datasets before they launch on a remote (headless) machine.
Description of the possible solution
- Ensure that terms for a given dataset do not need to be accepted once for each subset of that dataset. e.g. accept once for all three subsets of imagenet.
- Create a CLI command where one can request to accept the license terms locally at that particular machine, such that when the datasets are required on a headless machine, they will be easily accessible.
- Ensure this can be found in the getting started documentation.
Hey @AntreasAntoniou Thanks for this feedback. We will store the acceptance of terms/conditions on a per-user basis, so you'll just have to login from the CLI and we'll know whether you accepted the terms. We'll also combine the terms for the different subsets of ImageNet.
OK perfect.
The only other thing that could be useful, is a single liner for CLI, perhaps something like:
hub accept-terms "dataset1" "dataset2" ...
Followed by a proper presentation of the terms, and a way to accept them one by one. This way we can have the users pay attention when they accept terms, without introducing friction on large-scale experiments.
Hey @AntreasAntoniou We deployed a fix so that agreement acceptance is permanently associated with a user. Currently, you still have to accept the agreements separately for imagenet-train/-test/-val, but those are the only datasets with agreements, so hopefully that's not too much of a hassle.
Apologies in the delay of the fix.