DeepSpeed
Model saved with DeepSpeed and Accelerate cannot be loaded or is incomplete
I am fine-tuning the Mistral Small 24B model (mistralai/Mistral-Small-24B-Instruct-2501) using DeepSpeed and Accelerate, and I saved the model using the following commands:
if accelerator.is_main_process:
    model = accelerator.unwrap_model(trainer.model)
    model.save_pretrained(model_path)
    tokenizer.save_pretrained(model_path)
However, when I try to load the model using
AutoModelForCausalLM.from_pretrained(model_path)
I get a weight size mismatch error.
This is what the output folder looks like:
folder/
    chat_template.jinja
    config.json
    generate_config.json
    model.safetensors
    special_tokens_map.json
    tokenizer_config.json
    tokenizer.json
Can anyone tell me what is going on here and how to solve the problem?
Hey, were you able to get this issue resolved? I am facing the same issue.
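One way to narrow this down is to check whether model.safetensors actually holds full weights. A 24B model would normally be written as several shards, so a single file hints that most tensors may be empty. That would be consistent with ZeRO stage 3, where each rank only holds a partition and the unwrapped model exposes zero-sized placeholders, but the stage is not stated above, so treat that as an assumption. The sketch below reads only the safetensors header using the standard library; the helper names `read_safetensors_header` and `find_empty_tensors` are made up for illustration.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # The file starts with an unsigned little-endian 64-bit header length,
        # followed by that many bytes of UTF-8 JSON.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def find_empty_tensors(path):
    """List tensor names whose saved shape contains a zero-sized dimension."""
    header = read_safetensors_header(path)
    empty = []
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry, not a tensor
            continue
        if 0 in info["shape"]:
            empty.append(name)
    return empty


# Example call (path taken from the folder listing above):
# print(find_empty_tensors("folder/model.safetensors"))
```

If this returns most of the parameter names, the checkpoint was likely written from partitioned placeholders rather than gathered weights, which would explain the size mismatch on load.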
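If the run uses ZeRO stage 3 (an assumption, since the DeepSpeed config is not shown), a possible fix is to gather the full state dict on every rank before saving, following the pattern in the Accelerate DeepSpeed guide. A minimal sketch is below; `save_full_model` is a hypothetical wrapper name, and `model` should be the DeepSpeed-wrapped object returned by `accelerator.prepare`, not the already unwrapped module.

```python
def save_full_model(accelerator, model, tokenizer, model_path):
    """Save a consolidated checkpoint; call this on ALL ranks, not only rank 0."""
    accelerator.wait_for_everyone()

    # Gathering is a collective operation under ZeRO-3, so it must run on
    # every process. Wrapping it in `if accelerator.is_main_process:` can
    # hang or leave partitioned placeholders in the saved file.
    state_dict = accelerator.get_state_dict(model)

    unwrapped = accelerator.unwrap_model(model)
    unwrapped.save_pretrained(
        model_path,
        is_main_process=accelerator.is_main_process,  # only rank 0 writes files
        save_function=accelerator.save,
        state_dict=state_dict,
    )

    # The tokenizer is not sharded, so saving it from rank 0 is enough.
    if accelerator.is_main_process:
        tokenizer.save_pretrained(model_path)
```

For this to work under ZeRO-3, the DeepSpeed config needs `stage3_gather_16bit_weights_on_model_save` set to true (or `zero3_save_16bit_model` in the Accelerate config). If training goes through the Hugging Face Trainer, calling `trainer.save_model(model_path)` on all ranks may be the simpler route.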