dwq370

Results 6 issues of dwq370

### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior ![image](https://user-images.githubusercontent.com/131581396/233827809-ee659d68-0b97-43c8-8ab5-ec7909ae4ff8.png) ### Expected Behavior _No response_ ### Steps To Reproduce 1. Fine-tune the INT4 model with my own data...

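On a 3090 GPU with CUDA 11.1, running INT4 inference on a single card fails with an error. ![image](https://user-images.githubusercontent.com/131581396/233912435-908fec26-0c3c-428d-bddb-9375d16c567a.png) Is this a CUDA version problem? What is the minimum CUDA version that MOSS supports?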

### System Info 4*NVIDIA L20 ### Who can help? _No response_ ### Information - [X] The official example scripts - [ ] My own modified scripts ### Tasks - [...

bug

### System Info NVIDIA 2*L20 launch triton server with tensorrt-llm backend v0.12.0 in a container ### Who can help? _No response_ ### Information - [ ] The official example scripts...

bug

### System Info Ubuntu 22.04 Triton image: nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3 and the version of trtllm-backend is 0.10.0 Model: qwen2-7b-instruct ### Who can help? _No response_ ### Information - [ ] The official...

bug

Step 1: Launch container

```
mkdir -p ~/nano-test
docker run --gpus all --net=host --privileged -v /dev/shm:/dev/shm --name nanoflow -v ~/nano-test:/code -it nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04
```

Step 2: Install dependencies

```
git clone https://github.com/efeslab/Nanoflow.git
cd...
```