Fast-LLM
[bug] Conversion fails when using `layers_per_step` with some input formats
🐞 Describe the Bug
Conversion fails when using `layers_per_step` together with `input_format=fast_llm`.

Example job: `7ada4a96-4b5d-43de-a156-ebea5f359a33`

```
Global counter mismatch for parameter "layers.8.norm_1.weight" and shard "weights": 0 != 2048
[...]
Global counter mismatch for parameter "layers.17.output_weights" and shard "weights": 0 != 268435456
```
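The `0 != 2048` pattern suggests the counter for some parameters never gets incremented at all, as if per-shard bookkeeping from earlier steps is discarded when the model is converted in chunks. A minimal, hypothetical sketch of that failure mode (none of these names are Fast-LLM code; `convert_in_steps` and `check_counters` are assumptions for illustration):

```python
def convert_in_steps(param_sizes, layers_per_step, reset_between_steps=False):
    """Copy parameters in chunks of `layers_per_step`, crediting each copied
    element count to a per-parameter counter for the "weights" shard.
    `reset_between_steps=True` simulates the suspected bug: counter state
    from earlier steps is lost when a new step begins."""
    counters = {}
    names = list(param_sizes)
    for start in range(0, len(names), layers_per_step):
        if reset_between_steps and start > 0:
            counters = {}  # earlier parameters now read as 0
        for name in names[start:start + layers_per_step]:
            counters[name] = counters.get(name, 0) + param_sizes[name]
    return counters


def check_counters(param_sizes, counters):
    """Validation pass: every parameter's counter must equal its size."""
    errors = []
    for name, expected in param_sizes.items():
        got = counters.get(name, 0)
        if got != expected:
            errors.append(
                f'Global counter mismatch for parameter "{name}" '
                f'and shard "weights": {got} != {expected}'
            )
    return errors
```

With state carried across steps the check passes; with per-step resets, every parameter outside the last chunk reports `0 != <size>`, matching the log above.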
🔄 Steps to Reproduce
Convert a model exported in the `fast_llm` format, passing the `layers_per_step` argument:

```bash
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True \
    layers_per_step=8
```
🎯 Expected Behavior
Conversion succeeds.
Hi, is this still a problem?