Stable Diffusion Dataset

Open amasin2111 opened this issue 1 year ago • 9 comments

When we tried to download the dataset with the script laion400m-filtered-download-images.sh, we got an error saying that the source directory doesn't exist. Specifically, the following command fails:

rclone copy mlc-training:mlcommons-training-wg-public/stable_diffusion/datasets/laion-400m/moments-webdataset-filtered/ ${OUTPUT_DIR} --include="*.tar" -P

amasin2111 avatar Jun 28 '24 18:06 amasin2111

Hi, were you able to get around it?

amasin2111 avatar Jul 05 '24 13:07 amasin2111

@ahmadki both @nathanw-mlc and I tested the rclone commands and the scripts, and the data exists in the bucket (see attached).

What I did notice is that even the original scripts assume that the destination directory /datasets/etcetc can be created, but unless the user is root, they won't have permission to do so. Maybe this is why it fails?
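One way to rule out the permission problem is to check the destination before running the download. The following is a minimal sketch, not part of the original scripts: the function names ensure_writable_dir and pick_output_dir are hypothetical, and any fallback path passed to them is the caller's choice.

```shell
# Hypothetical helpers (not part of the original download scripts).

# Succeed if the directory exists or can be created, and is writable.
ensure_writable_dir() {
    local dir="$1"
    mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]
}

# Print the first candidate directory that is usable; fail if none is.
pick_output_dir() {
    local candidate
    for candidate in "$@"; do
        if ensure_writable_dir "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "No writable destination among: $*" >&2
    return 1
}

# Usage: pass the preferred directory first, then a fallback the user owns, e.g.
#   OUTPUT_DIR="$(pick_output_dir "$OUTPUT_DIR" "$HOME/datasets")"
```

If the preferred directory cannot be created without root, the download can then proceed into the fallback instead of failing partway through.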

rclone-1 rclone-2

morphine00 avatar Jul 12 '24 19:07 morphine00

Hi, I used the same commands but am still seeing the same issue (see screenshots 1 and 2).

amasin2111 avatar Jul 17 '24 15:07 amasin2111

Can those having issues please share the output of rclone version?

nathanwasson avatar Jul 17 '24 15:07 nathanwasson

Here is the version (see screenshot).

amasin2111 avatar Jul 17 '24 16:07 amasin2111

I just noticed that the updated Dockerfile installs Rclone with apt-get install. That method installs an old version (rclone v1.53.3-DEV) that doesn't process the rclone config create command correctly, so Rclone tries to connect to an AWS S3 bucket with the provided credentials rather than a Cloudflare R2 bucket. Users need to be running v1.6x.x. To get that version, the Dockerfile should install Rclone with the install command we provide in all Rclone instructions: sudo -v ; curl https://rclone.org/install.sh | sudo bash
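A quick way to catch the outdated package is to compare the installed version against a minimum before configuring the remote. This is a sketch under assumptions: the helper names are hypothetical, and the 1.60.0 threshold is one reading of "v1.6x.x" above, not a documented requirement.

```shell
# Hypothetical helpers for checking the installed rclone version.

# Extract "MAJOR.MINOR.PATCH" from the first line of `rclone version` output,
# e.g. "rclone v1.53.3-DEV" becomes "1.53.3".
parse_rclone_version() {
    printf '%s\n' "$1" |
        sed -n '1s/^rclone v\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if dotted version $1 is greater than or equal to dotted version $2.
version_ge() {
    local a="$1" b="$2" x y
    while [ -n "$a" ] || [ -n "$b" ]; do
        x="${a%%.*}"
        y="${b%%.*}"
        if [ "${x:-0}" -gt "${y:-0}" ]; then return 0; fi
        if [ "${x:-0}" -lt "${y:-0}" ]; then return 1; fi
        # Drop the component just compared; stop when none are left.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0
}

# Usage (requires rclone on PATH; 1.60.0 is an assumed minimum):
#   installed="$(parse_rclone_version "$(rclone version)")"
#   version_ge "$installed" 1.60.0 || echo "rclone $installed is too old" >&2
```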

nathanwasson avatar Jul 17 '24 16:07 nathanwasson

It worked for me. Other users might have to clean out their config files before retrying with the new rclone version.
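Cleaning the config can be as simple as moving the old file aside so the remote is recreated from scratch. A minimal sketch, assuming the stale remote lives in the default config file; the function name backup_rclone_config is hypothetical.

```shell
# Hypothetical helper: move an existing rclone config file aside so the
# remote can be recreated with the newer rclone.
# Prints the backup path, or nothing if there was no file to move.
backup_rclone_config() {
    local cfg="$1"
    if [ -f "$cfg" ]; then
        mv "$cfg" "$cfg.bak"
        printf '%s\n' "$cfg.bak"
    fi
}

# Usage (requires rclone on PATH). As far as I know, `rclone config file`
# prints the config path on its last line; check the output on your system.
#   backup_rclone_config "$(rclone config file | tail -n 1)"
```

After the backup, rerun the rclone config create step from the benchmark instructions with the new rclone version.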

amasin2111 avatar Jul 17 '24 16:07 amasin2111

@ahmadki can we fix the Dockerfile with @nathanw-mlc's suggestion, sudo -v ; curl https://rclone.org/install.sh | sudo bash?

hiwotadese avatar Aug 01 '24 15:08 hiwotadese

I genuinely dislike piping scripts from the internet into bash. Not only does it pose a security risk, but we also need to freeze rclone to a specific version.

https://github.com/mlcommons/training/pull/757 should work better.
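For illustration only, and not necessarily what the PR does: pinning usually means downloading a specific release and checking its digest before installing. In this sketch the function names are hypothetical, the URL layout is an assumption about how downloads.rclone.org organizes releases, and the version and digest are supplied by the caller.

```shell
# Hypothetical helpers for a pinned, verified rclone download.

# Build the download URL for a specific rclone release.
# The URL layout is an assumption; verify it before relying on it.
rclone_release_url() {
    local version="$1" arch="${2:-amd64}"
    printf 'https://downloads.rclone.org/v%s/rclone-v%s-linux-%s.zip\n' \
        "$version" "$version" "$arch"
}

# Succeed only if the file's SHA-256 digest matches the expected hex string.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Usage (version and digest are placeholders chosen by the maintainer):
#   curl -fsSLo rclone.zip "$(rclone_release_url "$RCLONE_VERSION")"
#   verify_sha256 rclone.zip "$RCLONE_SHA256" || exit 1
```

Compared with piping the install script into bash, this fixes the version and fails the build if the downloaded archive does not match the recorded digest.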

ahmadki avatar Aug 01 '24 15:08 ahmadki

Closing because #757 is merged.

ShriyaRishab avatar Aug 02 '24 15:08 ShriyaRishab