Zehan Li issues

Results: 37 issues by Zehan Li

Bug: can't co-exist with pytorch-lightning

I'm trying to train a T5 model with the `transformers` library, which requires the `sentencepiece` library to tokenize sentences. But when I installed it with `pip install sentencepiece`, I can't import...

On the dependency `faiss`

Hi, I noticed that your `beir==0.2.3` depends on `faiss-cpu`. However, in your NeurIPS 2021 paper, you benchmarked several dense retrieval models on GPU. Did you use `faiss-gpu` for that? Have you...

ValueError when running `evaluate_bm25.py`

Hi, I was trying to run your `evaluate_bm25.py` baseline, but I got the following error. There may be some problem with `elasticsearch`. Could you please help me fix it? ```...

Can't replicate results of BBTv2 paper

Hi, I tried your BBTv2 code but failed to get results comparable to those reported in your paper. In my case, using the command ```Python python deepbbt.py --model_name "roberta-large" --task_name "snli"...

TypeError: 'EpochBatchIterator' object is not iterable

## 🐛 Bug When I walk through your [docs](https://fairseq.readthedocs.io/en/latest/tasks.html), I find that the following code raises `TypeError: 'EpochBatchIterator' object is not iterable` ``` # setup the task (e.g., load dictionaries)...

bug

needs triage

What is Colbert v1.9?

Hi, I'm a little confused about the version. Is this an intermediate checkpoint? How is it trained? How does it differ from v1 and v2? Is training data...

empty answer field in squad dataset

Hi, I noticed that the squad dataset on [hf](https://huggingface.co/datasets/Tevatron/wikipedia-squad) has an empty answers field for all instances. Maybe there is a problem during data processing?

How to load tokenizer trained by sentencepiece or tiktoken

Hi, does this lib support loading pre-trained tokenizers trained by other libs, like `sentencepiece` and `tiktoken`? Many models on the hf hub store their tokenizers in these formats.

Stale

planned

Bug in yaml parsing

Hi, when I'm using a custom m_mmlu task, there is an error like this ``` Generating test split: 12909 examples [00:00, 512686.14 examples/s] Traceback (most recent call last): File "/output/lm-evaluation-harness/lm_eval/__main__.py",...

Fix m_arc choices

This is a 4-choice task: `option_e` is null for all but 3 samples, and it is never used as the gold answer.
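
The kind of check behind this claim can be sketched in a few lines. This is only an illustration under assumptions: the rows below are made up, and every field name except `option_e` (here `option_a` to `option_d` and `answer`) is hypothetical rather than taken from the actual m_arc data.

```python
# Hypothetical rows; only the `option_e` field name comes from the issue text.
samples = [
    {"option_a": "1", "option_b": "2", "option_c": "3", "option_d": "4",
     "option_e": None, "answer": "A"},
    {"option_a": "1", "option_b": "2", "option_c": "3", "option_d": "4",
     "option_e": "5", "answer": "C"},
    {"option_a": "1", "option_b": "2", "option_c": "3", "option_d": "4",
     "option_e": None, "answer": "D"},
]


def summarize_option_e(rows):
    """Count rows with a non-null option_e and rows whose gold answer is E."""
    non_null = sum(1 for row in rows if row.get("option_e") is not None)
    used_as_gold = sum(1 for row in rows if row.get("answer") == "E")
    return non_null, used_as_gold


def drop_option_e(row):
    """Return a copy of the row restricted to the four usable choices."""
    return {key: value for key, value in row.items() if key != "option_e"}


non_null, used_as_gold = summarize_option_e(samples)
cleaned = [drop_option_e(row) for row in samples]
```

If `used_as_gold` is zero on the real data, dropping the fifth option does not change any gold label, which is the premise of treating the task as 4-choice.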