李振斌
@NiYueLiuFeng Can you share your `modeling_internvl_chat.py`? Thank you very much
@irexyc Thank you for your reply. This is `profile_restful_api_image.py`; I'll try `--vision-max-batch-size` later.

```python
import csv
import json
import random
import time
from queue import Queue
from threading import ...
```
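For reference, a minimal sketch of what such a threaded image benchmark can look like (this is not the original `profile_restful_api_image.py`; the endpoint, model name, image path, and request counts below are assumptions):

```python
import base64
import time
from queue import Empty, Queue
from threading import Thread

import requests

API_URL = "http://127.0.0.1:23333/v1/chat/completions"  # assumed server address
MODEL = "InternVL2-26B"                                  # assumed served model name
IMAGE_PATH = "sample.jpg"                                # hypothetical test image
CONCURRENCY = 8
NUM_REQUESTS = 64

with open(IMAGE_PATH, "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# OpenAI-style vision request: the image travels as a base64 data URL.
payload = {
    "model": MODEL,
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    "max_tokens": 128,
}

tasks = Queue()
for _ in range(NUM_REQUESTS):
    tasks.put(payload)
latencies = []  # list.append is atomic under the GIL


def worker():
    # Pull requests until the queue is drained, recording per-request latency.
    while True:
        try:
            body = tasks.get_nowait()
        except Empty:
            return
        start = time.time()
        requests.post(API_URL, json=body, timeout=300)
        latencies.append(time.time() - start)


threads = [Thread(target=worker) for _ in range(CONCURRENCY)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
latencies.sort()
print(f"requests/s: {NUM_REQUESTS / elapsed:.2f}, "
      f"p50 latency: {latencies[len(latencies) // 2]:.2f}s")
```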
@irexyc After adding `--vision-max-batch-size`, it feels the same as running without it:

```shell
lmdeploy serve api_server --cache-max-entry-count 0.6 /home/notebook/data/personal/W9088934/InternVL2-26B/ --server-port 23333 --vision-max-batch-size 8
python profile_restful_api_image.py http://127.0.0.1:23333 /home/notebook/data/personal/W9088934/InternVL2-26B...
```
@irexyc Following your comments, I modified the test script and started the service with `--log-level INFO`.

```python
import asyncio
import base64
import csv
import json
import random
...
```
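Again for reference, a minimal sketch of an asyncio variant of such a benchmark (not the actual modified script; the endpoint, model name, image path, and concurrency settings are assumptions):

```python
import asyncio
import base64
import time

import aiohttp

API_URL = "http://127.0.0.1:23333/v1/chat/completions"  # assumed server address
MODEL = "InternVL2-26B"                                  # assumed served model name
IMAGE_PATH = "sample.jpg"                                # hypothetical test image
CONCURRENCY = 8
NUM_REQUESTS = 64


def build_payload():
    # Same OpenAI-style vision message as in the threaded sketch above.
    with open(IMAGE_PATH, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
        "max_tokens": 128,
    }


async def send(session, sem, payload, latencies):
    # Limit in-flight requests with a semaphore instead of worker threads.
    async with sem:
        start = time.time()
        async with session.post(API_URL, json=payload) as resp:
            await resp.read()
        latencies.append(time.time() - start)


async def main():
    payload = build_payload()
    sem = asyncio.Semaphore(CONCURRENCY)
    latencies = []
    async with aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=600)) as session:
        start = time.time()
        await asyncio.gather(*(send(session, sem, payload, latencies)
                               for _ in range(NUM_REQUESTS)))
        elapsed = time.time() - start
    latencies.sort()
    print(f"requests/s: {NUM_REQUESTS / elapsed:.2f}, "
          f"p50 latency: {latencies[len(latencies) // 2]:.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```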
@irexyc Do you have plans to optimize this issue in the future?
@irexyc Okay, thanks for your reply. I'll try deploying the vision model with TensorRT.
@irexyc If I separate the vision model from the language model, how should I feed the vision model's output into the language model's prompt? Looking at the original code, `input_embeddings` and `input_embedding_ranges` carry the image features. How can I include this information in a request through `openai.client`?
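For what it's worth, here is a purely hypothetical sketch of the kind of request the question is asking about, using the `openai` client's `extra_body` escape hatch. The field names come from the thread, but as the next comment notes, the server currently has no entry point for them, so this would not be accepted as-is; the model name, feature shape, and range values are made up.

```python
# Hypothetical only: illustrates the shape of the data the question refers to.
# `input_embeddings` is a list of per-image feature matrices from the vision
# encoder, and `input_embedding_ranges` gives the [start, end) token positions
# where each matrix would be spliced into the prompt's embedding sequence.
import numpy as np
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:23333/v1", api_key="none")

# Features from a separately deployed vision encoder; the shape here is made up.
image_features = np.random.rand(256, 6144).astype(np.float32)

resp = client.chat.completions.create(
    model="InternVL2-26B",  # assumed served model name
    messages=[{"role": "user", "content": "Describe the image."}],
    extra_body={  # hypothetical fields -- not part of the real REST API
        "input_embeddings": [image_features.tolist()],
        "input_embedding_ranges": [[1, 1 + image_features.shape[0]]],
    },
)
print(resp.choices[0].message.content)
```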
The conversion isn't done yet; I'm still testing, and this question came to mind during testing. I looked at the relevant lmdeploy source code, and there is no entry point for passing the two parameters `input_embeddings` and `input_embedding_ranges`. I'll finish the conversion first and benchmark the vision model's performance.
@nv-guomingz Wow, that looks great