TensorRT-LLM
Mixtral 8x7B smoothquant failed
System Info
p4de (4 80GB A100 GPUs)
Who can help?
@Tracin @byshiue
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
1. Install the 0.9.0 release.
2. Run `python ./llama/convert_checkpoint.py --model_dir ${MODEL_DIR} --output_dir ./ckpt --dtype float16 --tp_size 8 --workers 8 --smoothquant 0.5`
Expected behavior
Checkpoint created
Actual behavior
Traceback (most recent call last):
  in convert_layer
    qkv_weight = qkv_para[prefix + 'self_attn.qkv_proj']
KeyError: 'model.layers.0.self_attn.qkv_proj'
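A minimal sketch of what the traceback shows, assuming (hypothetically, based on the error alone) that the SmoothQuant conversion path expects a fused QKV weight keyed under `self_attn.qkv_proj`, while the Mixtral checkpoint capture only yields the separate projections:

```python
# Hypothetical illustration of the failure mode: qkv_para holds only the
# separate q/k/v projection keys (as in a Mixtral checkpoint), but the
# convert script looks up a fused 'qkv_proj' key that was never created.
qkv_para = {
    'model.layers.0.self_attn.q_proj': 'q_weight',
    'model.layers.0.self_attn.k_proj': 'k_weight',
    'model.layers.0.self_attn.v_proj': 'v_weight',
}

prefix = 'model.layers.0.'
try:
    qkv_weight = qkv_para[prefix + 'self_attn.qkv_proj']
except KeyError as e:
    # Reproduces the error message seen in the traceback.
    print(f'KeyError: {e}')
```

The key names here mirror the traceback; the dictionary contents are an assumption about what the capture pass produces for an unsupported architecture.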
Additional notes
n/a
@vnkc1 Mixtral MoE SmoothQuant is not currently supported, as noted in the precision support matrix.
Thanks! Is there a plan to support MoE SmoothQuant?
It is not on our roadmap right now. If you are interested, you could create a feature-request ticket. Closing this issue.