
Possibility of including TREC DL 2019 and TREC DL 2020

Open liyongkang123 opened this issue 8 months ago • 2 comments

Hi, as the BEIR library sees wider adoption, I’m wondering whether more datasets could be added to it. Natural candidates would be TREC DL 2019 and TREC DL 2020. Since we already have MS MARCO, integrating these two datasets shouldn’t be too difficult.

I greatly appreciate your help. If you don’t have time, maybe I could process them into the same format, and then you could just upload them to the website.

Many thanks.

liyongkang123 · Aug 12 '25 18:08
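The format conversion offered in the comment above could look roughly like the sketch below. It assumes TREC-style qrels lines (`query-id iteration doc-id relevance`) and a tab-separated queries file (`query-id<TAB>text`) as input, and targets the BEIR layout of a `qrels/<split>.tsv` with a `query-id`, `corpus-id`, `score` header plus a `queries.jsonl` with `_id` and `text` fields. The function names and the sample IDs are hypothetical, chosen only for illustration.

```python
import csv
import io
import json


def trec_qrels_to_beir_tsv(trec_lines):
    """Convert TREC qrels lines ("qid iteration docid rel") to BEIR qrels TSV text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # BEIR qrels files start with this header row.
    writer.writerow(["query-id", "corpus-id", "score"])
    for line in trec_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        qid, _iteration, doc_id, rel = parts
        writer.writerow([qid, doc_id, int(rel)])
    return out.getvalue()


def queries_tsv_to_beir_jsonl(tsv_lines):
    """Convert "qid<TAB>text" lines to BEIR queries.jsonl text (one JSON object per line)."""
    records = []
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        qid, text = line.split("\t", 1)
        records.append(json.dumps({"_id": qid, "text": text}) + "\n")
    return "".join(records)


# Hypothetical sample input, not real TREC DL judgments.
qrels_text = trec_qrels_to_beir_tsv(["q1 0 d1 0", "q1 0 d2 2", ""])
queries_text = queries_tsv_to_beir_jsonl(["q1\twhat is beir\n"])
```

The returned strings would then be written to `qrels/test.tsv` and `queries.jsonl` inside the dataset folder, next to the existing MS MARCO `corpus.jsonl`.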

I’ve created a code repository to deal with this.

Anyone who needs it can use it.

liyongkang123 · Aug 22 '25 13:08

Hi @liyongkang123, TREC-DL 2019 is already present in the msmarco dataset under the test split.

Regards, Nandan

thakur-nandan · Aug 22 '25 14:08
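For readers who want to try what the reply above describes, here is a minimal sketch of loading the `test` split of the msmarco dataset with BEIR. It assumes the `beir` package is installed and that the download URL below, taken from the pattern used in BEIR's own examples, is still valid. The helper name `load_msmarco_test` and the default `out_dir` are hypothetical.

```python
# Assumed download location; verify against the BEIR README before relying on it.
DATASET_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/msmarco.zip"


def load_msmarco_test(out_dir="datasets"):
    """Download msmarco (if needed) and return (corpus, queries, qrels) for the test split."""
    # Imported lazily so that defining this helper does not require beir.
    from beir import util
    from beir.datasets.data_loader import GenericDataLoader

    data_path = util.download_and_unzip(DATASET_URL, out_dir)
    # Per the reply above, the test split holds the TREC-DL 2019 queries and judgments.
    corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
    return corpus, queries, qrels


# Example usage (downloads the full MS MARCO corpus, so it is left commented out):
# corpus, queries, qrels = load_msmarco_test()
# print(len(queries), "test queries")
```

This covers TREC-DL 2019 only; the reply does not say whether TREC DL 2020 is available the same way.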