Sun-Shiqi

Results 2 comments of Sun-Shiqi

I load QW like this: `create_hf_model(model_class=AutoModelForCausalLM, model_name_or_path=actor_model_name_or_path, tokenizer=self.tokenizer, ds_config=ds_config, dropout=self.args.actor_dropout)`, and it runs without any error. Maybe your installed version of transformers or deepspeed is not compatible.
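To compare environments, a minimal sketch like the one below prints the installed versions of both packages. It uses only the Python standard library; the helper name `installed_versions` is made up for illustration and is not part of DeepSpeed or transformers.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "deepspeed")):
    """Map each package name to its installed version, or None if it is missing.

    `installed_versions` is a hypothetical helper for this thread, not a library API.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Posting this output alongside the full traceback should make it easier to tell whether a version mismatch is involved.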

![download](https://github.com/microsoft/DeepSpeedExamples/assets/120917599/cff4bef4-9c2e-4d2b-adbe-c01fc93604a4) This is my reward curve.