Fast-LLM
[bug] Conversion fails when using `layers_per_step` with some input formats
🐞 Describe the Bug
Conversion fails when using `layers_per_step` together with `input_format=fast_llm`.

Example job: `7ada4a96-4b5d-43de-a156-ebea5f359a33`

```
Global counter mismatch for parameter "layers.8.norm_1.weight" and shard "weights": 0 != 2048
[...]
Global counter mismatch for parameter "layers.17.output_weights" and shard "weights": 0 != 268435456
```
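The `0 != 2048` pattern suggests the counter for some parameters never gets incremented at all, as if per-shard bookkeeping from earlier steps is discarded when the model is converted in chunks. A minimal, hypothetical sketch of that failure mode (none of these names are Fast-LLM code; `convert_in_steps` and `check_counters` are assumptions for illustration):

```python
def convert_in_steps(param_sizes, layers_per_step, reset_between_steps=False):
    """Copy parameters in chunks of `layers_per_step`, crediting each copied
    element count to a per-parameter counter for the "weights" shard.
    `reset_between_steps=True` simulates the suspected bug: counter state
    from earlier steps is lost when a new step begins."""
    counters = {}
    names = list(param_sizes)
    for start in range(0, len(names), layers_per_step):
        if reset_between_steps and start > 0:
            counters = {}  # earlier parameters now read as 0
        for name in names[start:start + layers_per_step]:
            counters[name] = counters.get(name, 0) + param_sizes[name]
    return counters


def check_counters(param_sizes, counters):
    """Validation pass: every parameter's counter must equal its size."""
    errors = []
    for name, expected in param_sizes.items():
        got = counters.get(name, 0)
        if got != expected:
            errors.append(
                f'Global counter mismatch for parameter "{name}" '
                f'and shard "weights": {got} != {expected}'
            )
    return errors
```

With state carried across steps the check passes; with per-step resets, every parameter outside the last chunk reports `0 != <size>`, matching the log above.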
🔄 Steps to Reproduce
Convert a model exported in the `fast_llm` format, passing the `layers_per_step` argument:

```bash
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True \
    layers_per_step=8
```
🎯 Expected Behavior
Conversion succeeds.
Hi, is this still a problem?