Stable Diffusion Dataset
When we tried to download the dataset using the script laion400m-filtered-download-images.sh, we got an error saying that the source directory doesn't exist. Specifically, the following command fails: rclone copy mlc-training:mlcommons-training-wg-public/stable_diffusion/datasets/laion-400m/moments-webdataset-filtered/ ${OUTPUT_DIR} --include="*.tar" -P
Hi, were you able to get around it?
@ahmadki both @nathanw-mlc and I tested the rclone commands and the scripts, and the data exists in the bucket (see attached).
What I did notice is that even the original scripts assume that the destination directory /datasets/etcetc can be created, but unless the user is root, they won't have permission to do so. Maybe this is why it fails?
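To rule out the permissions theory before running the download, the destination can be checked up front. This is a minimal sketch: the helper name ensure_output_dir and the example path under $HOME are hypothetical and not part of the original scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original scripts): create the destination
# directory if possible and confirm the current user can write to it.
ensure_output_dir() {
  local dir="$1"
  if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
    # Echo the usable path so callers can capture it.
    printf '%s\n' "$dir"
  else
    echo "Cannot create or write to $dir; choose a path you own" >&2
    return 1
  fi
}

# Example usage (the path is an assumption, pick any directory you own):
#   OUTPUT_DIR="$(ensure_output_dir "$HOME/datasets/laion-400m")" || exit 1
```

If the helper fails for the default /datasets path, passing a user-writable directory as OUTPUT_DIR avoids the need for root.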
Hi,
I have used the same commands but am still observing the same issue.
Can those having issues please share the output of rclone version?
Here is the version
I just noticed that the update to the Dockerfile uses apt-get install to install Rclone. This install method installs an old version of Rclone (rclone v1.53.3-DEV) that doesn't process the rclone config create command correctly, resulting in Rclone attempting to connect to an AWS S3 bucket with the provided credentials, rather than a Cloudflare R2 bucket. Users need to be running v1.6x.x. To make that happen, the Dockerfile should install Rclone with the install command we provide for all Rclone instructions: sudo -v ; curl https://rclone.org/install.sh | sudo bash
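A quick way to tell whether the installed Rclone is too old is to compare its reported version against a minimum. The sketch below is illustrative only: the function name version_at_least is hypothetical, and the 1.60 threshold is an assumed reading of "v1.6x.x" above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a version string such as "v1.53.3-DEV"
# is at least the given MAJOR.MINOR (patch level is ignored).
version_at_least() {
  local ver="${1#v}" min="$2"
  ver="${ver%%-*}"                      # drop suffixes such as -DEV
  local major="${ver%%.*}" rest="${ver#*.}"
  local minor="${rest%%.*}"
  local min_major="${min%%.*}" min_minor="${min#*.}"
  [ "$major" -gt "$min_major" ] ||
    { [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]; }
}

# Only query rclone if it is actually installed.
if command -v rclone >/dev/null 2>&1; then
  # The first line of `rclone version` looks like: rclone v1.53.3-DEV
  installed="$(rclone version | head -n 1 | awk '{print $2}')"
  if ! version_at_least "$installed" "1.60"; then   # 1.60 is an assumed minimum
    echo "rclone $installed looks too old for the R2 setup" >&2
  fi
fi
```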
It worked for me. Other users might have to clean the config files before retrying with the new rclone version.
@ahmadki can we fix the Dockerfile with @nathanw-mlc's suggestion: sudo -v ; curl https://rclone.org/install.sh | sudo bash?
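For the config cleanup, one cautious approach is to move the old config aside rather than delete it, then recreate the remote with the newer Rclone. This is a sketch under assumptions: the helper name reset_rclone_config is hypothetical, and the path in the usage comment is Rclone's usual default on Linux, which may differ on your system (rclone config file prints the path in use).

```shell
#!/usr/bin/env bash
# Hypothetical helper: move an existing rclone config out of the way,
# keeping a .bak copy so credentials can be recovered if needed.
reset_rclone_config() {
  local cfg="$1"
  # Nothing to do if there is no config file at this path.
  [ -f "$cfg" ] || return 0
  mv -- "$cfg" "$cfg.bak"
}

# Example usage (default Linux location, assumed):
#   reset_rclone_config "$HOME/.config/rclone/rclone.conf"
# Then rerun the rclone config create step from the instructions.
```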
I genuinely dislike piping scripts from the internet into bash. Not only does it pose a security risk, but we also need to freeze rclone to a specific version.
https://github.com/mlcommons/training/pull/757 should work better.
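For illustration, pinning Rclone without piping a script into bash can be done by fetching a specific release package. This sketch is not necessarily what #757 does: the helper name rclone_deb_url is hypothetical, the version v1.62.2 is only an example, and the URL layout is assumed to follow the pattern used on downloads.rclone.org.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the download URL for a pinned rclone .deb.
# The URL layout is an assumption based on downloads.rclone.org.
rclone_deb_url() {
  local version="$1" arch="${2:-amd64}"
  printf 'https://downloads.rclone.org/%s/rclone-%s-linux-%s.deb\n' \
    "$version" "$version" "$arch"
}

# Example usage inside a Dockerfile RUN step (not executed here;
# v1.62.2 is an illustrative version, not one named in this thread):
#   curl -fsSLO "$(rclone_deb_url v1.62.2)"
#   dpkg -i rclone-v1.62.2-linux-amd64.deb
```

Pinning the version this way keeps image builds reproducible, and the downloaded package can additionally be verified against a published checksum before installing.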
Closing because #757 is merged.