zengrh3 comments


Results: 2 comments by zengrh3

Does TensorRT-LLM support passing input_embeds directly?

@Oldpan @qism I ran into the same problem with my own Llama design. I passed the `eos_id` to the `runner.generate` function, but it still generates tokens until...

Does vLLM support `input_embeds` as input when using Llama?

@DarkLight1337 Hi, I noticed a lot of progress on adding `input_embeds` support to vLLM. Is it ready to use yet?