Sean MacAvaney
**Dataset Information:** To appear at SIGIR 2022. Is there a more succinct name for this dataset? **Links to Resources:** - [Paper](https://arxiv.org/pdf/2205.11685.pdf) - [Repo](https://github.com/SIGIR-2022/A-Dataset-for-Sentence-Retrieval-for-Open-Ended-Dialogues) **Dataset ID(s) & supported entities:** TBD **Checklist**...
Regarding #189: took the opportunity to improve the tests here for the variety of formats, etc. that TrecDocs may encounter. @ArthurCamara -- mind running `python -m tests.integration.disks45` when using your...
**Dataset Information:** A Chinese question answering dataset. **Links to Resources:** - Repo: https://github.com/baidu/DuReader - Paper: https://arxiv.org/abs/2203.10232 **Dataset ID(s) & supported entities:** - TBD **Checklist** Mark each task once completed. All...
**Dataset Information:** "WANDS is a Wayfair product search relevance dataset." **Links to Resources:** - https://github.com/wayfair/WANDS - https://easychair.org/publications/preprint_download/j2D4 **Dataset ID(s) & supported entities:** - `wands` (docs, queries, qrels) **Checklist** Mark each...
Right now, the `ir-datasets.bib` file is a bit messy, with inconsistencies in the ids/fields/formatting/etc. across records. It's probably best to go with an established source, such as DBLP, the ACL...
**Dataset Information:** An Urdu test collection. **Links to Resources:** - https://arxiv.org/pdf/2011.00565.pdf **Dataset ID(s) & supported entities:** - `cure` **Checklist** Mark each task once completed. All should be checked prior to...
**Dataset Information:** "The main task for the proposed track is ad-hoc cross-language retrieval. Documents will be drawn from Common Crawl newswire, and will be written in Chinese, Russian, and Persian....
**Is your feature request related to a problem? Please describe.** @andrewyates points out these two use cases when working with docs: 1. do something sane for the situation where you...
Allows defining datasets locally
**Dataset Information:** "The Health Misinformation track aims to (1) provide a venue for research on retrieval methods that promote better decision making with search engines, and (2) develop new online...