WHDY issues


1 issue by WHDY

add tensorRT model worker

## Why are these changes needed?

[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) can greatly improve LLM inference speed, so it would be helpful to support TensorRT-LLM in FastChat. This commit simply implements how to...