TensorRT-LLM
[ModelRunner] Fix stop & bad word list pointer offset.
In regression tests comparing ModelRunnerCpp against ModelRunner, we noticed that the stop_words_list feature does not work properly in ModelRunner when batch_size > 1. Depending on the input, we get either incorrect results or CUDA runtime errors (misaligned address). Looking at the relevant code, the pointer offsets for stop_words_list_ptrs (and also bad_words_list_ptrs) are computed in elements rather than bytes, so they do not account for the size of each tensor entry. This PR fixes the issue by multiplying the offset by the element size.
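
To illustrate the class of bug, here is a minimal sketch that uses a plain `ctypes` buffer as a stand-in for the device tensor. The helper `row_pointers`, the layout (2 rows of 6 int32 values), and all variable names are hypothetical and are not taken from the ModelRunner code; the sketch only shows why an offset counted in elements must be scaled by the element size before it is added to a raw address.

```python
import ctypes


def row_pointers(base_addr, batch_size, elems_per_row, itemsize):
    """Return the byte address of each batch row in a contiguous buffer.

    Hypothetical helper: the offset in elements is scaled by the element size.
    """
    return [base_addr + i * elems_per_row * itemsize for i in range(batch_size)]


# Assumed layout for illustration: batch of 2, each row holding 6 int32 values.
batch_size, elems_per_row = 2, 6
Buffer = ctypes.c_int32 * (batch_size * elems_per_row)
buf = Buffer(*range(batch_size * elems_per_row))  # values 0..11

itemsize = ctypes.sizeof(ctypes.c_int32)
base = ctypes.addressof(buf)

# Correct: byte offset = row index * elements per row * element size.
ptrs = row_pointers(base, batch_size, elems_per_row, itemsize)

# Buggy variant: the element count is used directly as a byte offset.
buggy = [base + i * elems_per_row for i in range(batch_size)]

# The correct pointer for row 1 lands on element 6 of the buffer.
first_of_row1 = ctypes.c_int32.from_address(ptrs[1]).value

# The buggy pointer for row 1 is 6 bytes past the base, which is not a
# multiple of the 4-byte element size, i.e. a misaligned address.
buggy_misalignment = (buggy[1] - base) % itemsize
```

With batch_size = 1 both variants yield the same single pointer (offset 0), which is consistent with the problem only appearing for batch_size > 1.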