Paul Hendricks

Results 19 issues of Paul Hendricks

### System Info

- CPU architecture: x86_64
- Host Memory: 1TB
- GPU: NVIDIA A100 80GB x8
- TensorRT-LLM version: v0.18.2
- Triton container: `nvcr.io/nvidia/tritonserver:25.04-trtllm-python-py3`
- CUDA:...

bug

bug: changing top_logprobs from u32 to u8 to be consistent with chat