TensorRT-LLM
[feature request] Is it possible to make BERT support variable-sequence-length inputs?