Model saved from deepspeed and accelerate cannot be loaded or is incomplete

Open manitadayon opened this issue 5 months ago • 1 comments

I fine-tuned the Mistral Small 24B model (mistralai/Mistral-Small-24B-Instruct-2501) using DeepSpeed and Accelerate, and saved the model with the following code:

if accelerator.is_main_process:
    model = accelerator.unwrap_model(trainer.model)
    model.save_pretrained(model_path)
    tokenizer.save_pretrained(model_path)

However, when I try to load the model using AutoModelForCausalLM.from_pretrained(model_path), I get a weight size mismatch error.

This is what the folder looks like:

folder/
         chat_template.jinja
         config.json
         generate_config.json
         model.safetensors
         special_tokens_map.json
         tokenizer_config.json
         tokenizer.json

Can anyone tell me what is going on here and how to solve the problem?
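
For anyone debugging the same symptom, it may help to check whether model.safetensors actually contains full-size weights. The sketch below reads only the JSON header of a safetensors file (an 8-byte little-endian length followed by a JSON table of tensor names, dtypes and shapes) and lists tensors that have a zero-sized dimension. The function names `read_safetensors_shapes` and `find_empty_tensors` are made up for this example, and the idea that empty tensors are the cause here is an assumption, not something confirmed in this thread.

```python
import json
import struct


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian unsigned 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"
    }


def find_empty_tensors(shapes):
    """Return sorted names of tensors whose shape has a zero dimension (no data saved)."""
    return sorted(name for name, shape in shapes.items() if 0 in shape)
```

Calling `find_empty_tensors(read_safetensors_shapes("folder/model.safetensors"))` on the saved file returns the names of any empty tensors. If most parameters show up there, the weights were probably still partitioned across processes when `save_pretrained` ran, which is a possibility when DeepSpeed ZeRO stage 3 is in use (the post does not say which stage was configured). A quick size check is also informative: 24B parameters at 16-bit precision come to roughly 48 GB, so a much smaller model.safetensors suggests the same thing.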

manitadayon, Aug 15 '25 01:08

Hey, were you able to get this issue resolved? I am facing the same issue.

harsha2225, Sep 25 '25 22:09