xtuner
[feat] use HF state_dict in RL update weights
We use the load_spec generator in BaseModel to generate several state_dicts (device tensors) for the RL weight update, instead of the old layer-by-layer method. This is a more general approach and makes it easier to support more models.
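To illustrate the idea of yielding several partial state_dicts rather than walking the model layer by layer, here is a minimal sketch. It is not the code in this PR: `iter_state_dict_buckets`, `nbytes`, and `max_bucket_bytes` are hypothetical names, the size-budget grouping is an assumption about how the chunks might be formed, and `bytes` objects stand in for device tensors so the snippet runs without torch.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")  # stand-in for a device tensor type


def iter_state_dict_buckets(
    named_params: Iterable[Tuple[str, T]],
    nbytes: Callable[[T], int],
    max_bucket_bytes: int,
) -> Iterator[Dict[str, T]]:
    """Yield partial state_dicts whose total size stays under a byte budget.

    Hypothetical helper: a parameter larger than the budget is still yielded,
    alone in its own bucket.
    """
    bucket: Dict[str, T] = {}
    bucket_bytes = 0
    for name, tensor in named_params:
        size = nbytes(tensor)
        # Flush the current bucket before it would exceed the budget.
        if bucket and bucket_bytes + size > max_bucket_bytes:
            yield bucket
            bucket, bucket_bytes = {}, 0
        bucket[name] = tensor
        bucket_bytes += size
    if bucket:
        yield bucket


# Illustrative HF-style parameter names; bytes objects play the role of tensors.
params = [
    ("model.embed_tokens.weight", bytes(4)),
    ("model.layers.0.self_attn.q_proj.weight", bytes(3)),
    ("model.layers.0.mlp.up_proj.weight", bytes(2)),
    ("lm_head.weight", bytes(5)),
]

buckets = list(iter_state_dict_buckets(params, nbytes=len, max_bucket_bytes=6))
for i, partial_state_dict in enumerate(buckets):
    # In an RL setup, each partial state_dict would be sent to the rollout workers here.
    print(i, list(partial_state_dict))
```

Because the consumer only ever sees name-to-tensor mappings, it does not need to know the model's layer structure, which is the sense in which this style generalizes across models.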
@HAOCHENYE @HIT-cwh Please review xtuner/v1/model/base.py carefully, as the changes there may break HF saving (although I don't think this PR breaks it).