Ogundepo Odunayo

Results 15 issues of Ogundepo Odunayo

I tried using the tokenizer visualizer, but it doesn't seem to work when I load the tokenizer with `AutoTokenizer.from_pretrained()`. Here's the error I'm getting: ``` --------------------------------------------------------------------------- AttributeError Traceback (most...

Stale

Added support for the Yoruba language. Language code: `yo`

![11507 Danpascu jẹ plánẹ tì kékeré ní ibi ìgbàjá ástẹ rọ ìdì_0](https://user-images.githubusercontent.com/38908008/136361842-28c456ca-93c0-40bb-9716-8895fb3ba337.jpg) ![Àyọkà yìí tàbí apá rẹ únfẹ àtúnṣe sí_1](https://user-images.githubusercontent.com/38908008/136361844-f46f01ce-adfe-4678-9595-b04dcd24150e.jpg)

Hi @luyug, any idea on how to fix this? 04/14/2022 15:48:04 - INFO - tevatron.trainer - Initializing Gradient Cache Trainer Traceback (most recent call last): File "/home/odunayo/anaconda3/envs/tevatron_env/lib/python3.9/runpy.py", line 197, in...

- Updated requirements to use a more recent version of pygaggle and pyserini.
- The existing version of pyserini in the code cannot load Lucene indexes from the current Anserini...

The file [convert_trec_run_to_dpr_retrieval_run.py](https://github.com/castorini/pyserini/blob/master/pyserini/eval/convert_trec_run_to_dpr_retrieval_run.py) only allows the conversion of topics currently checked into Anserini. Could we open this up to accept custom query files as well? https://github.com/castorini/pyserini/blob/2673031f6b202941fe0f9953c9b876e6d4f1e653/pyserini/eval/convert_trec_run_to_dpr_retrieval_run.py#L26-L37 I can see...

Initial PR based on https://github.com/castorini/pyserini/issues/1375. Modularize imports so that LuceneSearcher does not rely on Faiss, torch, and transformers.
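For illustration, here is a minimal sketch of the deferred-import pattern this kind of change usually relies on. It is not the code from the PR: `lazy_import`, `SparseSearcher`, and `DenseSearcher` are hypothetical stand-ins, and the real Pyserini classes may be organized differently. The idea is that the sparse path never touches the heavy dependency, while the dense path imports it only on first use.

```python
import importlib


def lazy_import(module_name):
    """Import a module only when it is first needed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Optional dependency '{module_name}' is required for this feature."
        ) from exc


class SparseSearcher:
    """Hypothetical stand-in for a searcher with no heavy dependencies."""

    def search(self, query):
        return [f"sparse hit for {query}"]


class DenseSearcher:
    """Hypothetical stand-in for a searcher that needs an optional backend."""

    def __init__(self, backend_name="faiss"):
        # Only the name is stored here; nothing is imported at construction time.
        self._backend_name = backend_name
        self._backend = None

    @property
    def backend(self):
        # The optional module is imported on first access, not at module load.
        if self._backend is None:
            self._backend = lazy_import(self._backend_name)
        return self._backend
```

With this layout, importing the module and using `SparseSearcher` works even when the optional backend is not installed; the `ImportError` only surfaces if `DenseSearcher.backend` is actually accessed.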

Could probably do with some redesign but here's a first pass at integrating MLX into Pyserini with ColBERT. The tests and test outputs are similar to the same tests for...

Possible preprocessing feature: preprocess unstructured text into passages, possibly using Pygaggle segmentation: https://github.com/castorini/pygaggle/blob/master/pygaggle/data/segmentation.py
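To make the idea concrete, here is a rough sketch of sliding-window passage segmentation. It is not Pygaggle's implementation: the regex sentence splitter is deliberately naive, and the function names and the `window` and `stride` defaults are assumptions chosen for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def segment(text, window=3, stride=2):
    """Group sentences into overlapping passages of up to `window` sentences."""
    sentences = split_sentences(text)
    passages = []
    for start in range(0, len(sentences), stride):
        passages.append(" ".join(sentences[start:start + window]))
        # Stop once the current window already reaches the last sentence.
        if start + window >= len(sentences):
            break
    return passages
```

For example, `segment("A one. B two. C three. D four. E five.")` yields two passages that overlap on the sentence `C three.`. A real preprocessing step would likely swap in a proper sentence splitter.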

Related issue relevant to Spacerini: https://github.com/castorini/pyserini/issues/1449