Sun-Shiqi comments


RLHF problems when using Qwen model

I load Qwen like this: `create_hf_model(model_class=AutoModelForCausalLM, model_name_or_path=actor_model_name_or_path, tokenizer=self.tokenizer, ds_config=ds_config, dropout=self.args.actor_dropout)`, and it runs without any error. Maybe your version of transformers or deepspeed is not right.
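To compare environments, a quick check like the one below might help. It is only a sketch: it reads the installed package versions through the standard library without importing the packages themselves. The helper names `installed_version` and `version_report` are hypothetical, and including `torch` in the list is my assumption, since only transformers and deepspeed are mentioned above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def version_report(package_names=("transformers", "deepspeed", "torch")):
    """Map each package name to its installed version (None when missing).

    The default list is an assumption: transformers and deepspeed come from
    the comment above, torch is added only because both depend on it.
    """
    return {name: installed_version(name) for name in package_names}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output from both the working and the failing setup would make it easier to see whether a version mismatch is really the cause.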

The reward value did not increase.

This is my reward curve:

![download](https://github.com/microsoft/DeepSpeedExamples/assets/120917599/cff4bef4-9c2e-4d2b-adbe-c01fc93604a4)