Open-Assistant
soda_synthetic_dialogue removed from GDrive
The original prepared SODA synthetic dialogue dataset was removed from https://drive.google.com/uc?id=1TOGQfr419n8wpzJpYLLw4nB3tSKD8zXV (referenced here)
https://github.com/LAION-AI/Open-Assistant/tree/ada91f1c37b793ff19b0d0f0197d59aa019a4375/data/datasets/soda_synthetic_dialogue contains the code to prepare it. Ideally, the prepared dataset should be hosted in a Hugging Face datasets repo to prevent this from happening again.
https://huggingface.co/datasets/emozilla/soda_synthetic_dialogue appears to be the exact same dataset; let's reference it instead.
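Referencing the Hugging Face copy could look roughly like the sketch below. It assumes the `datasets` library is installed and that the repo loads with the standard `load_dataset` call. The constant and function names are made up for illustration and are not taken from the Open-Assistant code, and the available split names have not been checked here.

```python
from typing import Optional

# Repo id taken from the URL above.
HF_DATASET_ID = "emozilla/soda_synthetic_dialogue"


def load_soda_synthetic_dialogue(split: Optional[str] = None):
    """Load the dataset from the Hugging Face Hub (hypothetical helper).

    With split=None, load_dataset returns every split the repo provides.
    Pass a split name only after confirming it exists in the repo.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(HF_DATASET_ID, split=split)
```

Calling `load_soda_synthetic_dialogue()` downloads the data on first use and caches it locally, so there is no dependency on the removed GDrive file.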