Daneshwari K.

Results: 1 issue by Daneshwari K.

Hi! Kudos to the author for an end-to-end pipeline for cleaning and filtering a large corpus. I was working with [main_filtering.py](https://github.com/bigscience-workshop/data-preparation/blob/main/preprocessing/training/01b_oscar_cleaning_and_filtering/main_filtering.py) and was trying to change the parameter values in...