plt12138
### System Info
- CPU: 4090 * 4
- TensorRT-LLM: v0.8.0
- CUDA Version: 12.3
- NVIDIA-SMI 545.29.06

### Who can help?
_No response_

### Information
- [X] The official...
### System Info
- TensorRT-LLM: v0.9.0
- tensorrtllm_backend: v0.9.0

### Who can help?
@kaiyux

### Information
- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks
- [...
I want to build the Mistral model using AWQ and BF16.

```
python3 ../quantization/quantize.py --model_dir dolphin-2.6-mistral-7b-sft-yhy \
    --dtype bfloat16 --qformat int4_awq --awq_block_size 128 \
    --output_dir ./quantized_int4-awq-bf16 --calib_size 32
```
...
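After quantization, the checkpoint still has to be compiled into a TensorRT engine before it can be served. A minimal sketch of the usual follow-up step, assuming the quantized checkpoint directory produced above (the engine output directory name is my own choice, not from the original post):

```
# Compile the INT4-AWQ/BF16 checkpoint into a TensorRT engine.
# Paths are assumptions based on the quantize.py command above.
trtllm-build --checkpoint_dir ./quantized_int4-awq-bf16 \
    --output_dir ./engine_int4-awq-bf16 \
    --gemm_plugin bfloat16
```

The `--gemm_plugin bfloat16` choice here mirrors the `--dtype bfloat16` used during quantization; other engine-build flags are left at their defaults in this sketch.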