Add the CosIng dataset (cosmetic ingredients)

Open nmeln opened this issue 2 years ago • 0 comments

CosIng is an open EU dataset of cosmetic ingredients.

I think websites like CosDNA or INCIdecoder use it as their source, with a few tweaks for better explainability. Link to the dataset (CSV available): https://data.europa.eu/data/datasets/cosmetic-ingredient-database-ingredients-and-fragrance-inventory?locale=en

It roughly follows this structure:

| INCI name | Chem/IUPAC Name / Description | Restriction | Function |
| --- | --- | --- | --- |
| DIPTEROCARPUS INTRICATUS EXTRACT | (Dipterocarpus Intricatus Extract is the extract of the whole plant Dipterocarpus intricatus, Dipterocarpaceae. | | ANTIOXIDANT, HUMECTANT, SKIN CONDITIONING, SKIN CONDITIONING - EMOLLIENT |
| 1,2-BUTANEDIOL | | | HUMECTANT, SKIN CONDITIONING, SOLVENT, VISCOSITY CONTROLLING |

Each row of the dataset could be converted to a QA / instruction pair using this format:

Q: What is the <INCI name>?
A: Here is the short description or chemical name of this ingredient: <Chem/IUPAC Name / Description>. It's used as: <Function>

And for rows with empty descriptions:

Q: What is the <INCI name>?
A: I couldn't find a description for this ingredient. However, it's used as: <Function>
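
To make the two templates above concrete, here is a minimal Python sketch of the conversion. It assumes the CSV headers match the column names in the structure outlined above (`INCI name`, `Chem/IUPAC Name / Description`, `Function`); the headers in the actual CosIng export may differ, and the helper names `row_to_qa` and `csv_to_qa` are hypothetical.

```python
import csv
import io

# Assumed column names, taken from the structure sketched in this issue.
# The headers in the real CosIng CSV export may differ.
NAME_COL = "INCI name"
DESC_COL = "Chem/IUPAC Name / Description"
FUNC_COL = "Function"


def row_to_qa(row):
    """Turn one row (a dict keyed by column name) into a question/answer pair."""
    name = (row.get(NAME_COL) or "").strip()
    # Drop a trailing period so the template does not produce "..".
    description = (row.get(DESC_COL) or "").strip().rstrip(". ")
    function = (row.get(FUNC_COL) or "").strip()

    question = f"What is the {name}?"
    if description:
        answer = (
            "Here is the short description or chemical name of this ingredient: "
            f"{description}. It's used as: {function}"
        )
    else:
        # Fallback template for rows with an empty description.
        answer = (
            "I couldn't find a description for this ingredient. "
            f"However, it's used as: {function}"
        )
    return {"question": question, "answer": answer}


def csv_to_qa(csv_text):
    """Parse CSV text with a header row and convert every row to a QA pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_qa(row) for row in reader]


if __name__ == "__main__":
    # Tiny in-memory sample built from the second example row above.
    sample = (
        "INCI name,Chem/IUPAC Name / Description,Restriction,Function\n"
        '"1,2-BUTANEDIOL",,,'
        '"HUMECTANT, SKIN CONDITIONING, SOLVENT, VISCOSITY CONTROLLING"\n'
    )
    for pair in csv_to_qa(sample):
        print("Q:", pair["question"])
        print("A:", pair["answer"])
```

The `Restriction` column is ignored here because neither template uses it; it could be added to the answer text if that turns out to be useful.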

What do you think?

nmeln, Feb 25 '23 14:02