LongRAG issues

Question about process_wiki_page.py

1

Hello, I downloaded the wiki raw dataset you previously mentioned and ran process_wiki_page.py with the following command: `python process_wiki_page.py --dir_path './bz_file' --output_path_dir './result' --corpus_title_path './psgs_w100.tsv'` The bz_file directory contains the...

gw16

Could not reproduce the answer recall for NQ dataset

4

Hi, I loaded `nq/full-00000-of-00001.parquet` and computed the answer recall with `answers, context = item["answer"], item["context"]` followed by `is_retrieval = has_correct_answer(context, answers)`. I could only get an answer recall of 0.8532, which is below...

tyu008
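
For readers following the recall question above, here is a minimal sketch of how answer recall over such items could be computed. The `has_correct_answer` below is a hypothetical stand-in, not the repository's implementation: it assumes a normalized substring match (lowercasing, stripping punctuation and articles). `normalize` and `answer_recall` are illustrative names, and the `answer` / `context` fields are taken from the snippet in the issue.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def has_correct_answer(context: str, answers) -> bool:
    """Hypothetical stand-in: True if any normalized answer occurs in the context.

    Assumption: substring matching after normalization. The real function
    in the repository may apply different rules.
    """
    norm_context = normalize(context)
    for answer in answers:
        norm_answer = normalize(answer)
        # Skip answers that normalize to an empty string; "" is in every string.
        if norm_answer and norm_answer in norm_context:
            return True
    return False


def answer_recall(items) -> float:
    """Fraction of items whose context contains at least one gold answer."""
    items = list(items)
    if not items:
        return 0.0
    hits = sum(has_correct_answer(item["context"], item["answer"]) for item in items)
    return hits / len(items)


# Tiny in-memory example; a real run would iterate over the rows of the parquet file.
examples = [
    {"context": "The Eiffel Tower is in Paris, France.", "answer": ["Paris"]},
    {"context": "Berlin is the capital of Germany.", "answer": ["Munich"]},
]
print(answer_recall(examples))
```

The measured recall depends on the matching rule, so a substring match and a token-level match can give different numbers on the same data.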

Does it support testing a single PDF file?

I understand that LongRAG extracts articles from Wikipedia XML dump files and stores them in multiple files, each of which contains multiple documents in XML or JSON format. LongRAG splits...

zhoumengbo

Bad performance for running run_retrieve_tevatron.sh

Hi, I tried to build the index of the wiki corpus using the script you provide in `scripts/run_retrieve_tevatron.sh`. However, I find that the retrieval evaluation performance is very bad. The...

acphile
