Nicolas HERVE
When retrieving tweets with the full-archive endpoint and requesting all expansions, I get the following error. It does not appear immediately; I'm able to retrieve a bunch of tweets...
The file name should be consistent with the name of the class it contains (upper / lower case)
https://github.com/huggingface/datatrove/blob/1e27cc8819465d5246d89cd929423b76eb0bc5dd/src/datatrove/pipeline/dedup/minhash.py#L196
### Describe the bug Hi, an issue related to #4760 here: after loading a single file from a dataset, I am unable to access it in offline mode afterwards ### Steps to...
Is it possible to have the Replace function support capture groups in the replacement string? In the following dummy example, I want to add a space between letters...
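
To make the requested behavior concrete, here is a minimal sketch using Python's standard `re` module, not the Replace function the question is about. The helper name `space_letters` and the pattern are assumptions chosen for illustration; the original dummy example is truncated and is not reproduced here.

```python
import re


def space_letters(text: str) -> str:
    """Insert a space after every lowercase letter that is directly
    followed by another lowercase letter.

    Hypothetical helper for illustration only. The replacement string
    refers to the first capture group with ``\\1``, which is the feature
    being asked for.
    """
    # ([a-z])    captures one letter as group 1
    # (?=[a-z])  lookahead: the next character is also a letter (not consumed)
    return re.sub(r"([a-z])(?=[a-z])", r"\1 ", text)


if __name__ == "__main__":
    print(space_letters("abc"))
```

The lookahead keeps the second letter out of the match, so it can itself be matched as the first letter of the next pair.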