Abdulmalik Alquwayfili
# What does this PR do?

Support pre-tokenized datasets in Parquet format and skip the tokenization step if it has already been done.

## Motivation

I needed to use Parquet...

Adds a contribution guide to help contributors understand how to add papers to the repository.
## Added ArabicNLP 2025 papers

Added 39 papers from the ArabicNLP 2025 main conference (co-located with EMNLP 2025, Nov 8-9, Suzhou, China).

[ArabicNLP 2025](https://arabicnlp2025.sigarab.org/), Suzhou, China. Co-located with [EMNLP...