gloritygithub11 issues

Results: 12 issues by gloritygithub11

Fail to build Llama-3-70B-Instruct with w4a16

### System Info
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.0.dev2024050700
A100 40G
### Who can help?
@byshiue
### Information
- [X] The official example scripts
- [...

bug

triaged

How to build int4_gptq on Mixtral 8x7b

I use the following code to generate the checkpoint:
```
set -e
export MODEL_DIR=/mnt/memory
export MODEL_NAME=Mixtral-8x7B-Instruct-v0.1
export LD_LIBRARY_PATH=/usr/local/tensorrt/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/tensorrt/bin:$PATH
export PRECISION=int4_gptq_a16
export QUANTIZE=int4_gptq
export DTYPE=bfloat16
export PYTHONPATH=/app/tensorrt-llm:$PYTHONPATH
python ../llama/convert_checkpoint.py \...
```

triaged

Fail to build int4_awq on Mixtral 8x7b

### System Info
ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.0.dev2024050700
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [...

triaged

feature request

quantization

not a bug

getPluginCreator could not find plugin: WeightOnlyQuantMatmultensorrt_llm

### System Info
- A100 40G
- tensorrt 10.0.1
- tensorrt-llm 0.10.0.dev2024050700
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own...

bug

triaged

getPluginCreator could not find plugin: Gemmtensorrt_llm version: 1

### System Info
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm 0.10.0.dev2024050700
### Who can help?
@byshiue
### Information
- [X] The official example scripts
- [ ] My...

triaged

Is it possible to implement a quantize method like Q2_K in llama.cpp?

Hello, we have a model (Qwen2 72B) running in Q2_K with llama.cpp, and we want to migrate it to TRT-LLM. Is that technically possible? See: https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp If yes, what effort is...

Mixtral 8x7b failed on compile with tensorrt-llm

Config file:
```
base:
    seed: &seed 42
model:
    type: Mixtral
    path: /models/Mixtral-8x7B-Instruct-v0.1
    torch_dtype: auto
calib:
    name: pileval
    download: False
    path: /app/llmc/tools/data/calib/wikitext2
    n_samples: 128
    bs: -1
    seq_len: 512
    preproc: pileval_awq
    seed:...
```

LLama3-8B-Instruct fail for TensorRT-LLM

Hello, I'm trying to build with TensorRT. The following is the config file:
```
base:
    seed: &seed 42
model:
    type: Llama
    path: /models/Meta-Llama-3-8B-Instruct
    torch_dtype: auto
calib:
    name: pileval
    download: False
    path:...
```

qwen2 72b output empty after quantizing with SmoothQuant

### System Info
tensorrt 10.2.0
tensorrt_llm 0.12.0.dev2024072301
A100-80G * 4
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts...

bug

triaged

stale

fail to start awq quantized model with lightllm on qwen2-7b-instruct

AWQ config:
```
base:
    seed: &seed 42
model:
    type: Qwen2
    path: /models/Qwen2-7B-Instruct
    tokenizer_mode: slow
    torch_dtype: auto
calib:
    name: pileval
    download: False
    path: /app/src/llmc/tools/data/calib/pileval
    n_samples: 128
    bs: -1
    seq_len: 512
    preproc:...
```