Gunsoo Han
Trying the command below gives another error: `parlai eval_model -mf zoo:blenderbot2/blenderbot2_400M/model -t msc -v --knowledge_access_method memory_only`. Note that I added the `--knowledge_access_method memory_only` argument, which is a...
After some debugging, it turns out that modifying the `--rag_retriever_type` argument disables search. I set it to `dpr` and ran the following command, which is working at the moment. ```bash parlai...
Having said that, I still want to know the exact eval script to reproduce the reported ppl [here](https://github.com/facebookresearch/ParlAI/blob/main/parlai/zoo/blenderbot2/model_card.md#metrics-used-and-evaluation-results) in the table **Metrics Used and Evaluation Results** for both **Session 4...
@klshuster Thank you for the reply. I'm evaluating the 2.7B model with and without the `--include_last_session True` argument, and the validation ppl differs between the two runs. Could you please example...
@klshuster This is the result of the following command: ```bash parlai eval_model -mf zoo:blenderbot2/blenderbot2_3B/model -t msc -v --knowledge-access-method none --rag_retriever_type dpr --log_every_n_secs 60 --batchsize 32 ``` You can see...
@klshuster Thank you for the kind reply. After evaluating with `--knowledge-access-method memory_only`, I obtained the following results, which still show a gap relative to the reported value of **9.8463**. ``` msc_dialogue_4/exs:5904...
No. I also tried `--batchsize 32` and faced the same error. ```bash parlai eval_model -mf zoo:blenderbot2/blenderbot2_3B/model -t msc -v --knowledge-access-method none --rag_retriever_type dpr --log_every_n_secs 60 --batchsize 32 ```  This...
OK, thank you!
I have faced the same repetition issue when training a Korean model. After some research, I have found that this is a general problem for natural language generation and...
@Colanim Sorry for the late reply. I have replaced the decoder with the following [method](https://arxiv.org/pdf/1904.09751.pdf), which proposes a new way of sampling tokens at decoding steps, rather than just depending on...
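For anyone who wants to see the idea in code: the linked paper (arXiv 1904.09751) describes nucleus (top-p) sampling. Below is a minimal, framework-free sketch of one decoding step, not the decoder I actually used. The function names `top_p_filter` and `sample_top_p` and the default `top_p=0.9` are made up for illustration, and it assumes you already have a normalized next-token probability list.

```python
import random


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of highest-probability tokens whose cumulative
    mass reaches top_p, and renormalize their probabilities to sum to 1.

    Hypothetical helper for illustration; `probs` is assumed to be a
    normalized next-token distribution indexed by token id.
    """
    # Rank token ids from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete; drop the low-probability tail
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}


def sample_top_p(probs, top_p=0.9, rng=random):
    """Sample one token id from the renormalized nucleus."""
    nucleus = top_p_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    next_token_probs = [0.5, 0.3, 0.15, 0.05]
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print(top_p_filter(next_token_probs, top_p=0.7))
    print(sample_top_p(next_token_probs, top_p=0.7, rng=rng))
```

Because the tail of the distribution is cut off before sampling, the output stays varied without drawing very unlikely tokens. In a real decoder the same filtering would be applied to the model's softmax output at every step.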