AllenShow
Yes! There is no README in process_data, even though data_creation/critic/gpt4_reward/README.md refers to one.
The same question applies to the parameter 'max_new_tokens': should it be set to the same value that Self-RAG uses for each particular task, so that the comparison is fair?
I also want to know the answer and a complete description of the data creation process.
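How does the dataset you used differ from aime2024_gen_6e39a4, which is what aime2024_gen points to by default when no version suffix is given? And which one should be used to evaluate chat models?
> GPQA_gen points to gpqa_openai_simple_evals_gen_5aeece.py, which requires the model outputs “ANSWER: $LETTER”.

So if I want to evaluate chat models in a more general way, I can use gpqa_gen_4baadb, is that right?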
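
To illustrate why the required “ANSWER: $LETTER” format matters for chat models, here is a minimal sketch of a strict answer extractor. The regex and the function name `extract_letter` are my own assumptions for illustration, not the actual code in gpqa_openai_simple_evals_gen_5aeece.py, which may parse the output differently.

```python
import re
from typing import Optional

# Assumed pattern for illustration only; the real config's regex may differ.
ANSWER_RE = re.compile(r"ANSWER\s*:\s*\$?([A-D])\$?", re.IGNORECASE)


def extract_letter(response: str) -> Optional[str]:
    """Return the letter from the last 'ANSWER: X' in the response, or None."""
    matches = ANSWER_RE.findall(response)
    # Take the last match so that earlier drafts inside the reasoning are ignored.
    return matches[-1].upper() if matches else None


print(extract_letter("Some reasoning... ANSWER: C"))   # C
print(extract_letter("I think the answer is (B)."))    # None
```

With a strict extractor like this, a chat model that answers correctly in free-form text but does not follow the format would be scored as wrong, which is why the choice of config matters.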
> Please check the prediction to find if there is a repeat pattern in response. And reduce the --max-out-len. Also, you can remove the --debug and use four workers. Could...