[GH->HF] Part 2: Remove all dataset scripts from GitHub
Now that all the datasets live on the Hub, we can remove the /datasets directory, which contains all the dataset scripts of this repository.
Needs https://github.com/huggingface/datasets/pull/4973 to be merged first, and PRs to be enabled on the Hub for non-namespaced datasets.
So this means metrics will be deleted from this repo in favor of the "evaluate" library? Maybe you guys could just redirect metrics to that library.
We are indeed deprecating the metrics in datasets and suggest that users switch to evaluate (via a warning message).
We'll keep the current metrics as they are for now, but they'll be completely removed at some point.
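As a rough illustration of the warning-based deprecation described above, here is a minimal sketch using only the Python standard library. The function name `load_metric_sketch`, its return value, and the message text are hypothetical and are not the actual datasets code.

```python
import warnings


def load_metric_sketch(name: str) -> str:
    # Hypothetical stand-in for a deprecated metric loader; not the real datasets API.
    warnings.warn(
        "Metrics in datasets are deprecated; consider switching to the evaluate library.",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    # The old behavior is kept for now, so the call still returns a usable value.
    return f"metric:{name}"
```

With this pattern, existing code keeps working while users see a hint to migrate before the removal.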
I guess this is ready to merge?
It should break nothing except one rare case:
If someone is using an old version of datasets to try to load a recent dataset. In that case, the library fetches the main branch on GitHub to see if the dataset exists there. But since we're removing all the datasets, this forward fetching won't work anymore.
E.g. if someone uses "imagenet-1k" with a version of datasets that didn't have it at that time. I checked on Kibana and one single user would be affected, with 4k downloads/month. It should still work for them, though, thanks to the datasets cache.
But if they delete their cache, the workaround is... 🥁 update datasets 😅
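To make the edge case concrete, here is a minimal sketch of the lookup as described in this comment. It assumes a simplified order (GitHub main first, then the local cache); the names `resolve_dataset_script` and `fetch_from_github_main` are hypothetical, and the real resolution logic in datasets may differ.

```python
from typing import Callable, Optional, Set


def resolve_dataset_script(
    name: str,
    cached: Set[str],
    fetch_from_github_main: Callable[[str], Optional[str]],
) -> str:
    """Return where a hypothetical old client would find the script for `name`."""
    # Forward fetching: look for the script on the main branch on GitHub.
    if fetch_from_github_main(name) is not None:
        return "github-main"
    # Fall back to a copy already downloaded by a previous run.
    if name in cached:
        return "local-cache"
    # Nothing left to try: the user has to update datasets.
    raise FileNotFoundError(f"Dataset script for {name!r} not found; update datasets.")


def github_after_removal(name: str) -> Optional[str]:
    # Assumption: once this PR is merged, no dataset script exists on main.
    return None
```

Under these assumptions, `resolve_dataset_script("imagenet-1k", {"imagenet-1k"}, github_after_removal)` still succeeds from the cache, while the same call with an empty cache raises.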
Let's merge this on Monday if we can, to make sure contributors who wanted to merge their dataset PRs here can still do so.
Alright, merging!