Ken Tsui
This issue explores whether building a QA dataset based on WikiHow is possible and necessary.

## Existing datasets built on WikiHow

- Summarisation: https://arxiv.org/pdf/1810.09305.pdf
- Commonsense: https://arxiv.org/abs/1905.07830
- Subset of QA: https://huggingface.co/datasets?search=wikihow...
Closes #322. Factors out the fixed seed data by adding `DEBUG_USE_SEED_DATA_PATH` to the config, which controls which seed data is used.
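A minimal sketch of how such a config switch might be resolved, assuming the config is exposed as a string mapping. Only the `DEBUG_USE_SEED_DATA_PATH` key comes from the description above; the function name `resolve_seed_data_path` and the default path are hypothetical and not taken from the actual change.

```python
from pathlib import Path
from typing import Mapping

# Hypothetical default location; the real project may keep its seed data elsewhere.
DEFAULT_SEED_DATA_PATH = Path("seed_data/default.json")


def resolve_seed_data_path(config: Mapping[str, str]) -> Path:
    """Return the seed data path named in the config, or the default.

    An unset or empty DEBUG_USE_SEED_DATA_PATH falls back to the default.
    """
    override = config.get("DEBUG_USE_SEED_DATA_PATH")
    if override:
        return Path(override)
    return DEFAULT_SEED_DATA_PATH
```

With this shape, pointing the key at another file is enough to swap the seed data without touching the code that loads it.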
Added more documents and papers in the retrieval direction.
Integrate MDEL with various evaluation frameworks:

- [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
- [helm](https://github.com/stanford-crfm/helm)
If most training scripts are homogeneous except for the `data_path` args/config (I assume they are, since they all started from the same seed LM), then we could write a script that...