shangshng
> There are actually quite a few non-trivial changes required. I will make a pull request soon with the changes. Right now just validating my changes on a 2 GPU...
> @Marks101, the logits processor is supported on `ModelRunnerCppExecutor`: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/runtime/model_runner_cpp.py#L48 Could you try that please? Hello, it looks like logits processor is disabled here.... 
Same here. Setting `n>1` in the sampling parameters also results in really low throughput.