ParlAI
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
**TL;DR: Add implementations of [FlashAttention](https://arxiv.org/abs/2205.14135) using OpenAI's Triton language.** **Background**: - FlashAttention: an IO-aware exact attention algorithm that uses tiling to reduce the number of memory reads/writes, 15% end-to-end speedup...
Hi, I was wondering how I can run inference on a particular module of the BB3 model. For example, the generate dialogue response module takes in “Full context + knowledge + memory...
Hi, I'm trying to host a chat service with BlenderBot 3 using multiple GPUs. Here is my config file ``` tasks: default: onboard_world: MessengerBotChatOnboardWorld task_world: MessengerBotChatTaskWorld timeout: 1800 agents_required: 1...
**Patch description** Made a small update to the PACER model plugin that allows for it to be added to other agents (like the `ExpandedAttentionDecoderAndPacerAgent`) without doubly loading the base model...
**Patch description** This PR adds the ability to log model artifacts to [Weights & Biases](https://wandb.ai/site). Logging model artifacts allows complete reproducibility of the experiments that resulted in the trained model...
Hi, I am trying to run BlenderBot 3 on Ubuntu 20.04. I tried with an RTX 2080 Ti and an RTX 3080 Ti, and it worked exactly once (one turn), after which it keeps giving...
**Bug description** I see this: https://parl.ai/projects/bb3/ When I run this ```parlai interactive --init-opt gen/opt_bb3 --opt-server API_SERVER --loglevel...
Hey, I was wondering what the set of commands is to get ParlAI to train T5 or GPT2 with SeeKeR parameters where the search query is the copy...
**Patch description** Previously, the T5 dictionary was undefined and elements of the Hugging Face dictionary agent were uninitialized, leading to immediate exceptions in the train step for T5 and at the beginning...
**Patch description** Since support for setting the seed in `train_model.py` has been added, we can also let the user fix the seed in the dataset sampler.
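To illustrate what fixing the seed in a dataset sampler can look like, here is a minimal sketch in plain Python. It is not ParlAI's implementation: the class `SeededSampler`, its `seed` argument, and the `set_epoch` method are hypothetical names introduced only for this example. The idea is that the sampler owns a private random generator derived from the seed, so the shuffled order is reproducible across runs.

```python
import random


class SeededSampler:
    """Hypothetical sampler: yields dataset indices in a reproducible shuffled order."""

    def __init__(self, num_examples: int, seed: int = 42):
        self.num_examples = num_examples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch: int) -> None:
        # Changing the epoch changes the RNG seed, while staying
        # reproducible for a given (seed, epoch) pair.
        self.epoch = epoch

    def __iter__(self):
        # Use a private RNG so the global random state is left untouched.
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_examples))
        rng.shuffle(indices)
        return iter(indices)

    def __len__(self) -> int:
        return self.num_examples


if __name__ == "__main__":
    first = list(SeededSampler(10, seed=7))
    second = list(SeededSampler(10, seed=7))
    print(first == second)  # the same seed gives the same order
```

Keeping the generator local to the sampler means that other code calling `random` elsewhere in a training run does not change the sampling order.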