ParlAI Missing Tasks

This is a list of tasks not yet in ParlAI that would be great to have. Feel free to add more to the list also! We will remove individual items when they are done.

Chit Chat:

[x] DailyDialog https://arxiv.org/abs/1710.03957
[x] Datasets in decaNLP that are missing: https://github.com/salesforce/decaNLP
[ ] CoLA https://nyu-mll.github.io/CoLA/
[ ] Movie Discussions with Knowledge: https://arxiv.org/pdf/1809.08205.pdf
[x] MultiWOZ: https://arxiv.org/abs/1810.00278
[ ] Video stories?: https://research.fb.com/wp-content/uploads/2018/10/A-Dataset-for-Telling-the-Stories-of-Social-Media-Videos.pdf
[x] AirDialogue http://www.aclweb.org/anthology/D18-1419
[ ] Movie chat with background knowledge: http://aclweb.org/anthology/D18-1255
[x] Movie chat with Wikipedia grounding: http://aclweb.org/anthology/D18-1076, https://github.com/festvox/datasets-CMU_DoG
[ ] Craigslist bargain http://aclweb.org/anthology/D18-1256
[ ] Datasets from DSTC7 http://alborz-geramifard.com/workshops/nips18-Conversational-AI/Papers/18convai-DSTC7.pdf
[ ] Movie recommendation: https://www.microsoft.com/en-us/research/uploads/prod/2018/11/deep_conversational_recommendations__1_1.pdf
[x] Redial dataset: https://redialdata.github.io/website/
[ ] OTTers https://arxiv.org/pdf/2105.13710.pdf

Knowledge-grounded datasets:

[ ] Conversational reading (https://arxiv.org/pdf/1906.02738.pdf)
[ ] Knowledge Dataset from DSTC7 https://github.com/DSTC-MSR-NLP/DSTC7-End-to-End-Conversation-Modeling/tree/master/data_extraction
[x] Holl-E (https://github.com/nikitacs16/Holl-E, https://arxiv.org/abs/1809.08205)
[ ] OpenDialKG (https://github.com/facebookresearch/opendialkg)

Visual Dialogue / QA Tasks / Captioning:

[ ] KVQA http://dosa.cds.iisc.ac.in/kvqa-2/01/mishra_CR.pdf (see paper for links to other VQA too)
[ ] GQA (VQA-type) dataset https://cs.stanford.edu/people/dorarad/gqa/
[ ] Visual Storytelling https://arxiv.org/pdf/1604.03968.pdf
[ ] Multimodal shopping dialogue (with images) https://arxiv.org/pdf/1704.00200.pdf
[ ] Visual Commonsense reasoning https://visualcommonsense.com/
[ ] Netizen-Style Commenting on Fashion Photos, https://mashyu.github.io/NSC/
[ ] Conceptual Captions https://github.com/google-research-datasets/conceptual-captions

QA Tasks:

[x] Natural Questions: https://ai.google/research/pubs/pub47761
[x] HotpotQA: https://hotpotqa.github.io/
[ ] SearchQA https://github.com/nyu-dl/SearchQA
[ ] Who Did What (cloze qa): https://arxiv.org/abs/1608.05457
[ ] NewsQA https://datasets.maluuba.com/NewsQA
[x] QuAC: https://arxiv.org/pdf/1808.07036.pdf
[x] CoQA (#1674): https://arxiv.org/abs/1808.07042
[x] DREAM Dialogue QA https://arxiv.org/pdf/1902.00164.pdf
[ ] DROP https://arxiv.org/abs/1903.00161
[ ] Common Sense from ConceptNet https://arxiv.org/pdf/1811.00937.pdf
[x] AmazonQA: http://jmcauley.ucsd.edu/data/amazon/qa/

Jan 10 '18 01:01 jaseweston

@jaseweston How about the AQuA dataset by DeepMind for algebraic questions? https://github.com/deepmind/AQuA I can add support for this if that is acceptable.

I can possibly work on others too.

Jan 27 '18 18:01 apsdehal

yes, sure -- it would be great if you can add it (and any others you want)!

Jan 28 '18 01:01 jaseweston

@dfcf93 sure it would be great if you can add it!

Jun 07 '18 19:06 jaseweston

I would add HotpotQA https://hotpotqa.github.io/, which my group are currently working on getting into ParlAI

Oct 18 '18 08:10 BenjaminWinter

awesome!

Oct 18 '18 10:10 jaseweston

I am commenting here to let interested people know that I have a working DailyDialog implementation running on this fork: https://github.com/Mrpatekful/ParlAI/tree/dialogwae.

I will submit a PR as soon as I get around to cleaning up the code and running some tests. Note, though, that my implementation only uses the chat dataset without any annotations, so I guess it's not a full implementation of DailyDialog, but I would be happy to collaborate if someone were up for it. I did implement two types of tasks though: a single-turn task, and a task where all dialog history is used, for agents that can handle multiple utterances.

Nov 30 '18 12:11 ricsinaruto

There's a large new grounded dialogue dataset that came out last month: Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading https://arxiv.org/pdf/1906.02738.pdf Might be useful?

Jul 26 '19 20:07 abisee

Yes, it was already in our list.

Jul 28 '19 00:07 jaseweston

This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.

Jun 21 '20 00:06 github-actions[bot]

Will work on

[ ] Conversational reading (https://arxiv.org/pdf/1906.02738.pdf)

Any help or suggestions for a better model are welcome, @abisee

Feb 16 '22 15:02 vlordier