Results: 4 comments by Jun

> The core feature is supported, but we don't have a checkpoint to demonstrate it. You could modify the `lora_manager` to load multiple LoRA weights. ...

> They share the same base model. We have an example here: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llama#run-llama-with-several-lora-checkpoints.

Thanks @byshiue !! In the example [1], the `build` script only specifies one `hf_lora_dir`. Should this be 2...
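The idea in the comments above is that several LoRA checkpoints can share one base model, with the manager selecting an adapter per request. Below is a minimal, hypothetical sketch of that pattern in plain Python; it is not TensorRT-LLM's actual `LoraManager` API, and the class and method names are illustrative only.

```python
# Hypothetical sketch (not TensorRT-LLM's real lora_manager): one shared base
# model, several LoRA adapters, selected by name at call time.
class MultiLoraManager:
    """Maps adapter names to their low-rank (A, B) weight pairs."""

    def __init__(self):
        self.adapters = {}

    def load(self, name, lora_a, lora_b):
        # In a real system these would be tensors read from an hf_lora_dir;
        # plain floats stand in for weight matrices here.
        self.adapters[name] = (lora_a, lora_b)

    def delta(self, name, x):
        # LoRA contributes B(Ax) on top of the frozen base model's output.
        lora_a, lora_b = self.adapters[name]
        return lora_b * (lora_a * x)


manager = MultiLoraManager()
manager.load("task_0", lora_a=0.5, lora_b=2.0)
manager.load("task_1", lora_a=1.0, lora_b=-1.0)

print(manager.delta("task_0", 3.0))  # 3.0
print(manager.delta("task_1", 3.0))  # -3.0
```

The point of the sketch is only that adapter weights live in a per-name table while the base weights are loaded once, which is why the example in the linked README can serve several LoRA checkpoints from a single engine.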

Hello @jonathlela, now that https://github.com/openai/triton/pull/1306 has been merged, would using the most recent OpenAI Triton with Kernl resolve this issue?

Hello @jonathlela, are there other large models we could try with Kernl? It seems that larger variants of the T5 model type do not work because of this issue. Also...