Tan-Hexiang

Results 4 issues of Tan-Hexiang

Thanks for releasing the code! I am trying to run `QuestionAnsweringSquadDiffMaskAnalysis.ipynb`, but I get this error when calling `load_from_checkpoint()`: `RuntimeError: Error(s) in loading state_dict for BertQuestionAnsweringSquadDiffMask: Missing key(s) in state_dict: "net.bert.embeddings.position_ids"....
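For context, here is a minimal sketch of how a "Missing key(s) in state_dict" error can arise in PyTorch and how non-strict loading reports the gap instead of raising. `TinyEmbeddings` and its sizes are hypothetical stand-ins, not the repository's model, and whether skipping the key is safe for the real checkpoint is an assumption that would need checking.

```python
import torch
from torch import nn


class TinyEmbeddings(nn.Module):
    """Hypothetical stand-in for a module that registers a position_ids buffer."""

    def __init__(self):
        super().__init__()
        self.word = nn.Embedding(8, 4)
        # Persistent buffers are part of state_dict, so strict loading expects them.
        self.register_buffer("position_ids", torch.arange(8).unsqueeze(0))


model = TinyEmbeddings()

# Simulate a checkpoint saved by code that did not store the buffer.
checkpoint = {k: v for k, v in model.state_dict().items() if k != "position_ids"}

try:
    model.load_state_dict(checkpoint)  # strict=True by default
    strict_error = None
except RuntimeError as err:
    strict_error = str(err)

# strict=False skips the missing key and reports it instead of raising.
result = model.load_state_dict(checkpoint, strict=False)
print(result.missing_keys)
```

With non-strict loading the buffer keeps the value set in `__init__`, which is only acceptable when that value can be recomputed deterministically, as it can in this toy module.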

Thanks for sharing the code! I have a question about how to select the optimizer when training DiffMask. I see that **Lookahead RMSprop** is used in '**How do Decisions Emerge across...

I noticed that the datasets supported in the code are all multiple-choice and classification types, such as IMDB, QNLI, and BoolQ. Can the code in this repository support free-form types...

When trying to merge several olmo2 checkpoints (olmo2-7b-sft, olmo2-7b-dpo, olmo2-7b-instruct), mergekit completes the merging process without any errors, but the merged model is **missing some parameters** and will...
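One way to pin down which parameters went missing is to compare the parameter names of each source checkpoint against those of the merged model. The sketch below uses plain lists of names; the helper `missing_parameters` and all parameter names are hypothetical and are not taken from mergekit or the olmo2 checkpoints.

```python
def missing_parameters(source_keys_by_model, merged_keys):
    """Return, per source model, the parameter names absent from the merged model."""
    merged = set(merged_keys)
    return {
        name: sorted(set(keys) - merged)
        for name, keys in source_keys_by_model.items()
    }


# Hypothetical parameter names, for illustration only.
sources = {
    "sft": ["layer.0.proj.weight", "layer.0.norm.weight"],
    "dpo": ["layer.0.proj.weight", "layer.0.norm.weight"],
}
merged = ["layer.0.proj.weight"]

report = missing_parameters(sources, merged)
print(report)
```

In practice the name lists would come from the keys of each checkpoint's state dict, so an empty list for every source model indicates that no parameter name was dropped by the merge.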