Can I add image embeddings to the LLM input? How can I do it?

Open Onwaydbh opened this issue 1 year ago • 1 comments

For example, I want to use a visual pretrained language model to extract an image embedding and add it to the LLM input to get the output.

Jul 30 '24 15:07 Onwaydbh

Same problem

Aug 05 '24 07:08 Popsicle0-0

We support several popular multimodal models in examples/multimodal/.

For these models, we pass the image embeddings to the LLM via the prompt_table argument (this extends the LLM's embedding table) and modify input_ids so that the image positions hold indices into prompt_table.

You can check tensorrt_llm/runtime/multimodal_model_runner.py for how this mechanism is used for different models.
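
To make the mechanism concrete, here is a minimal NumPy sketch of the idea, not the TensorRT-LLM implementation. It assumes that image positions in input_ids are filled with placeholder ids starting at vocab_size, so that any id at or above vocab_size is looked up in prompt_table instead of the regular embedding table. The helper names build_input_ids and lookup_embeddings are hypothetical and exist only for illustration, and the random arrays stand in for real LLM embeddings and vision-encoder output.

```python
import numpy as np


def build_input_ids(ids_before, ids_after, num_image_tokens, vocab_size):
    """Hypothetical helper: put out-of-vocabulary placeholder ids where the image goes."""
    # Assumption: placeholder ids are vocab_size, vocab_size + 1, ...
    image_ids = np.arange(vocab_size, vocab_size + num_image_tokens)
    return np.concatenate([ids_before, image_ids, ids_after])


def lookup_embeddings(input_ids, embedding_table, prompt_table):
    """Hypothetical helper: ids < vocab_size read embedding_table, the rest read prompt_table."""
    vocab_size, hidden = embedding_table.shape
    is_image = input_ids >= vocab_size
    out = np.empty((len(input_ids), hidden), dtype=embedding_table.dtype)
    out[~is_image] = embedding_table[input_ids[~is_image]]
    # Shift placeholder ids back to row indices of prompt_table.
    out[is_image] = prompt_table[input_ids[is_image] - vocab_size]
    return out


vocab_size, hidden = 10, 4
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((vocab_size, hidden)).astype(np.float32)
# Stand-in for the vision encoder output: 3 image tokens, already projected to the LLM hidden size.
image_features = rng.standard_normal((3, hidden)).astype(np.float32)

input_ids = build_input_ids(np.array([1, 2]), np.array([3]), 3, vocab_size)
embeds = lookup_embeddings(input_ids, embedding_table, image_features)
print(input_ids.tolist())
print(embeds.shape)
```

In this sketch the image features only need to match the LLM hidden size; the text tokens before and after the image keep their normal ids and embeddings.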

Sep 04 '24 20:09 amukkara

(Automatic reply) I have received your email.

Sep 04 '24 20:09 Onwaydbh

(Automatic reply) I have received your email.

Nov 14 '24 02:11 Onwaydbh