
LAION-Human Dataset

Open unrealMJ opened this issue 2 years ago • 4 comments

Hi,

I have already downloaded the full LAION-5B dataset. How can I use your .parquet and mapping file to get the corresponding images?

unrealMJ, Nov 11 '23 02:11

Also, the .parquet has 2.86M images, while mapping.json has 1M images, so the mapping appears to be a subset of the .parquet. Could you share some details about the .parquet? I assume it is a subset of LAION-5B; how did you obtain it?

unrealMJ, Nov 11 '23 03:11

Hi, @unrealMJ! Thank you for your interest. You can run python utils/download_data.py to download all images. The .parquet lists images from Laion-Aesthetic; we provide it because our ordering differs from that of the original Laion-Aesthetic dataset, as mentioned in issue4.

juxuan27, Nov 11 '23 13:11
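
(Editor's sketch, not part of the maintainers' reply.) For readers who already hold a local copy of LAION, one possible way to reuse it instead of re-downloading is to look up each row of the subset metadata by image URL. This is only a minimal illustration under stated assumptions: the `URL` column name, the `local_index` dictionary, and the `match_local_images` helper are all hypothetical and are not taken from the HumanSD repository, whose actual mapping format may differ.

```python
def match_local_images(subset_rows, local_index, url_key="URL"):
    """Split subset rows into (found, missing) by URL lookup.

    subset_rows: iterable of dicts, one per metadata row (hypothetical layout).
    local_index: dict mapping image URL -> path of the local copy (hypothetical).
    url_key:     name of the URL field; "URL" is an assumption, not confirmed.
    """
    found, missing = [], []
    for row in subset_rows:
        path = local_index.get(row[url_key])
        if path is None:
            # No local copy for this URL; it would still need downloading.
            missing.append(row)
        else:
            # Keep the original fields and attach the local path.
            found.append({**row, "local_path": path})
    return found, missing


# Toy data standing in for rows read from the subset metadata.
subset_rows = [
    {"URL": "http://example.com/a.jpg", "TEXT": "a person running"},
    {"URL": "http://example.com/b.jpg", "TEXT": "two people dancing"},
]
local_index = {"http://example.com/a.jpg": "/data/laion/000/a.jpg"}

found, missing = match_local_images(subset_rows, local_index)
print(len(found), "found locally;", len(missing), "still to download")
```

Rows that land in `missing` would still have to be fetched, for example with the download script mentioned above.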

Hi, thanks for your reply. The Laion2b-en-aesthetic dataset on Hugging Face has 52.1M rows, but the .parquet you provided has only 2.86M rows. Could you explain the difference?

unrealMJ, Nov 12 '23 02:11

The .parquet we provide is a subset of Laion2b-en-aesthetic, obtained by selecting the part with a higher aesthetic score.

juxuan27, Dec 27 '23 09:12
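
(Editor's sketch, not part of the thread.) The kind of score-based subset selection described in the last reply could look roughly like the following. The thread does not state the threshold or the column name, so `min_score=5.0` and the `aesthetic` field are assumptions for illustration only, and `filter_by_aesthetic` is a hypothetical helper rather than the maintainers' actual procedure.

```python
def filter_by_aesthetic(rows, min_score, score_key="aesthetic"):
    """Keep only rows whose aesthetic score is at least min_score.

    rows:      iterable of dicts, one per metadata row (hypothetical layout).
    min_score: threshold; the value actually used by the authors is not stated.
    score_key: name of the score field; "aesthetic" is an assumption.
    """
    kept = []
    for row in rows:
        score = row.get(score_key)
        # Rows without a score are dropped rather than guessed at.
        if score is not None and score >= min_score:
            kept.append(row)
    return kept


# Toy rows standing in for metadata loaded from a larger table.
rows = [
    {"URL": "http://example.com/a.jpg", "aesthetic": 6.2},
    {"URL": "http://example.com/b.jpg", "aesthetic": 4.7},
    {"URL": "http://example.com/c.jpg", "aesthetic": None},
]

subset = filter_by_aesthetic(rows, min_score=5.0)  # 5.0 is a made-up threshold
print([row["URL"] for row in subset])
```

Raising `min_score` shrinks the subset, which is consistent with a 2.86M-row file being carved out of a 52.1M-row one, though the exact criterion used is not given in the thread.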