Manel ALOUI

4 comments by Manel ALOUI

Hi, see this new release from Hugging Face: [datatrove](https://github.com/huggingface/datatrove). DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt...

Hi @StephennFernandes, I'm also trying to pretrain an LLM and need to deduplicate my dataset. Which method did you apply, please?

Thanks a lot! Sure, I will check it, @StephennFernandes.

Hi, where in the documentation can I find a guide to integrating gRPC with OpenLLM?