
Mixtral 8x7B smoothquant failed

vnkc1 opened this issue

System Info

p4de (4× A100 80 GB GPUs)

Who can help?

@Tracin @byshiue

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

  1. Install the 0.9.0 release.
  2. Run `python ./llama/convert_checkpoint.py --model_dir ${MODEL_DIR} --output_dir ./ckpt --dtype float16 --tp_size 8 --workers 8 --smoothquant 0.5`

Expected behavior

Checkpoint created

Actual behavior

Traceback (most recent call last):
  ...
  in convert_layer
    qkv_weight = qkv_para[prefix + 'self_attn.qkv_proj']
KeyError: 'model.layers.0.self_attn.qkv_proj'
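A minimal Python sketch of why the lookup fails (the key and variable names are taken from the traceback; the surrounding structure is hypothetical): the SmoothQuant conversion path indexes a fused `qkv_proj` entry in `qkv_para`, but the Mixtral MoE conversion never populates that entry, so the dictionary lookup raises `KeyError`.

```python
# Hypothetical reconstruction of the failing lookup from the traceback.
# In the unsupported Mixtral MoE path, no fused QKV weights are collected,
# so qkv_para has no 'qkv_proj' entries.
qkv_para = {}  # empty: fused QKV weights were never added for this model
prefix = "model.layers.0."

try:
    qkv_weight = qkv_para[prefix + "self_attn.qkv_proj"]
except KeyError as e:
    # Mirrors the error reported in this issue.
    print(f"KeyError: {e}")  # → KeyError: 'model.layers.0.self_attn.qkv_proj'
```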

Additional notes

n/a

vnkc1 avatar May 03 '24 22:05 vnkc1

@vnkc1 Mixtral MoE SmoothQuant is not currently supported, as noted in the precision support matrix.

kshitizgupta21 avatar May 10 '24 23:05 kshitizgupta21

Thanks. Is there a plan to support SmoothQuant for MoE models?

ghost avatar May 15 '24 16:05 ghost

It is not on our roadmap at the moment. If you are interested, you could create a feature-request ticket. Closing this issue.

byshiue avatar May 24 '24 06:05 byshiue