TensorRT-LLM
[Fix] Explicitly check if output['generation_logits'] is an empty list
When a GenerationSession is run with gather_generation_logits enabled and the first generated token is end_id, outputs['generation_logits'] is an empty list. This crashes the session in _prepare_outputs (https://github.com/NVIDIA/TensorRT-LLM/blob/3d56a445e8ebf888e78be638faf6beec0a78f3c2/tensorrt_llm/runtime/model_runner.py#L253) with:
RuntimeError: stack expects a non-empty TensorList
This PR adds an explicit check and leaves the empty list unaltered. Alternatively, one could delete the dict entry or return an empty tensor in this edge case.
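For illustration, the guard could look roughly like the sketch below. This is not the actual diff: `prepare_generation_logits`, `stack_fn`, and `strict_stack` are hypothetical names, and `strict_stack` is only a stand-in that mimics how `torch.stack` rejects an empty list, so the example runs without PyTorch.

```python
from typing import Any, Callable, Dict, List


def prepare_generation_logits(
    outputs: Dict[str, Any],
    stack_fn: Callable[[List[Any]], Any],
) -> Dict[str, Any]:
    """Stack per-step logits only when at least one step produced logits."""
    logits = outputs.get("generation_logits")
    # Explicit emptiness check: an empty list is left unaltered,
    # because stacking zero tensors raises an error.
    if isinstance(logits, list) and len(logits) > 0:
        outputs["generation_logits"] = stack_fn(logits)
    return outputs


def strict_stack(items: List[Any]) -> Any:
    """Hypothetical stand-in for torch.stack; it also rejects an empty list."""
    if not items:
        raise RuntimeError("stack expects a non-empty TensorList")
    return tuple(items)


# Edge case from this PR: end_id is the first generated token.
empty_case = prepare_generation_logits({"generation_logits": []}, strict_stack)

# Normal case: two generation steps produced logits.
normal_case = prepare_generation_logits({"generation_logits": [1, 2]}, strict_stack)
```

With PyTorch installed, one would presumably pass `torch.stack` (or a lambda that fixes the `dim` argument) as `stack_fn`. Which dimension the runtime stacks along is not stated here, so that choice is left open.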