hector

Results 5 comments of hector

I want to know where to download the pre-trained model. Additionally, could you provide the Aristorobertav7 model you used? The model downloaded from HuggingFace has difficulty achieving the...

Thanks for your reply! I found that only the Multilingual-E5-base model is provided on HuggingFace. Has the Multilingual-E5-large version been open-sourced? If so, could you please provide me with the...

> Hi, have you solved the issue? I have met the exact same problem.

I was just trying this as well. I found that the error occurs whenever `ref_model` is loaded via `_prepare_deepspeed`; I have tried both zero2 and zero3 from the start : (

> It is probably an offload problem; using stage 3 with no offload works for me.
>
> ```
> deepspeed --master_port 25002 --include "localhost:4,5,6,7" src/train.py \
>     --model_name_or_path ${model_path} \
>     --stage 'dpo' \
>     --do_train \
>     --finetuning_type 'full' \
>     ...
> ```
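To illustrate the "stage 3, no offload" suggestion above: a minimal DeepSpeed config along those lines might look like the sketch below. This is an assumption about the setup, not the poster's actual file; the key point is that `zero_optimization.stage` is 3 and no `offload_optimizer`/`offload_param` sections are present.

```json
{
  "zero_optimization": {
    "stage": 3
  },
  "bf16": { "enabled": "auto" },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```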

You can read the code carefully. My understanding is that if `create_new_adapter` is True, the code first merges `adapter_name_or_path` with the base model and then attaches a new adapter for training; conversely, if `create_new_adapter` is False, training continues directly on top of `adapter_name_or_path`, and the optimized adapter is saved at the end. The net effect of the two methods is the same, but the first produces two sets of adapter parameters while the second produces only one. I am not sure whether my understanding is correct : )
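The equivalence described above can be sketched with a toy model. This is a deliberate simplification, not the project's actual code: it treats each LoRA adapter as a purely additive weight delta (ignoring the low-rank factorization) and uses integers so the comparison is exact.

```python
# Toy model of the two create_new_adapter paths (hypothetical simplification:
# an adapter is just an additive delta on a single scalar weight).

base = 10          # one base-model weight
adapter_v1 = 3     # delta learned in a previous training run
dpo_update = 1     # delta learned during the new training run

# create_new_adapter=True: merge adapter_v1 into the base first,
# then train a fresh adapter on top -> two adapter artifacts exist.
merged_base = base + adapter_v1
new_adapter = dpo_update
weights_two_adapters = merged_base + new_adapter

# create_new_adapter=False: keep training adapter_v1 in place,
# saving a single updated adapter at the end.
updated_adapter = adapter_v1 + dpo_update
weights_one_adapter = base + updated_adapter

# Under this additive approximation the final weights coincide.
assert weights_two_adapters == weights_one_adapter
print(weights_two_adapters)  # 14
```

The difference is therefore bookkeeping, not math: path one leaves you with two adapter checkpoints to load, path two with a single consolidated one.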