李振斌
@NiYueLiuFeng Can you share your `modeling_internvl_chat.py`? Thank you very much
@irexyc Thank you for your reply. This is `profile_restful_api_image.py`; I'll try `--vision-max-batch-size` later.

```python
import csv
import json
import random
import time
from queue import Queue
from threading import ...
```
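For reference, a minimal sketch of what such a threaded image benchmark can look like (this is not the original `profile_restful_api_image.py`; the endpoint, model name, image path, and request counts below are assumptions):

```python
import base64
import time
from queue import Empty, Queue
from threading import Thread

import requests

API_URL = "http://127.0.0.1:23333/v1/chat/completions"  # assumed server address
MODEL = "InternVL2-26B"                                  # assumed served model name
IMAGE_PATH = "sample.jpg"                                # hypothetical test image
CONCURRENCY = 8
NUM_REQUESTS = 64

with open(IMAGE_PATH, "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# OpenAI-style vision request: the image travels as a base64 data URL.
payload = {
    "model": MODEL,
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    "max_tokens": 128,
}

tasks = Queue()
for _ in range(NUM_REQUESTS):
    tasks.put(payload)
latencies = []  # list.append is atomic under the GIL


def worker():
    # Pull requests until the queue is drained, recording per-request latency.
    while True:
        try:
            body = tasks.get_nowait()
        except Empty:
            return
        start = time.time()
        requests.post(API_URL, json=body, timeout=300)
        latencies.append(time.time() - start)


threads = [Thread(target=worker) for _ in range(CONCURRENCY)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
latencies.sort()
print(f"requests/s: {NUM_REQUESTS / elapsed:.2f}, "
      f"p50 latency: {latencies[len(latencies) // 2]:.2f}s")
```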
@irexyc After adding `--vision-max-batch-size`, it feels the same as running without it:

```shell
lmdeploy serve api_server --cache-max-entry-count 0.6 /home/notebook/data/personal/W9088934/InternVL2-26B/ --server-port 23333 --vision-max-batch-size 8
python profile_restful_api_image.py http://127.0.0.1:23333 /home/notebook/data/personal/W9088934/InternVL2-26B...
```
@irexyc Following your comments, I modified the test script and started the service with `--log-level INFO`.

```python
import asyncio
import base64
import csv
import json
import random
...
```
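Again for reference, a minimal sketch of an asyncio variant of such a benchmark (not the actual modified script; the endpoint, model name, image path, and concurrency settings are assumptions):

```python
import asyncio
import base64
import time

import aiohttp

API_URL = "http://127.0.0.1:23333/v1/chat/completions"  # assumed server address
MODEL = "InternVL2-26B"                                  # assumed served model name
IMAGE_PATH = "sample.jpg"                                # hypothetical test image
CONCURRENCY = 8
NUM_REQUESTS = 64


def build_payload():
    # Same OpenAI-style vision message as in the threaded sketch above.
    with open(IMAGE_PATH, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
        "max_tokens": 128,
    }


async def send(session, sem, payload, latencies):
    # Limit in-flight requests with a semaphore instead of worker threads.
    async with sem:
        start = time.time()
        async with session.post(API_URL, json=payload) as resp:
            await resp.read()
        latencies.append(time.time() - start)


async def main():
    payload = build_payload()
    sem = asyncio.Semaphore(CONCURRENCY)
    latencies = []
    async with aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=600)) as session:
        start = time.time()
        await asyncio.gather(*(send(session, sem, payload, latencies)
                               for _ in range(NUM_REQUESTS)))
        elapsed = time.time() - start
    latencies.sort()
    print(f"requests/s: {NUM_REQUESTS / elapsed:.2f}, "
          f"p50 latency: {latencies[len(latencies) // 2]:.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```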
@irexyc Do you have plans to optimize this issue in the future?
@irexyc Okay, thanks for your reply. I'll try deploying the vision model with TensorRT.
@irexyc If I separate the vision model from the language model, how should I feed the vision model's output into the language model's prompt? Looking at the original code, `input_embeddings` and `input_embedding_ranges` carry the image features. How can I include this information in a request through `openai.client`?
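For what it's worth, here is a purely hypothetical sketch of the kind of request the question is asking about, using the `openai` client's `extra_body` escape hatch. The field names come from the thread, but as the next comment notes, the server currently has no entry point for them, so this would not be accepted as-is; the model name, feature shape, and range values are made up.

```python
# Hypothetical only: illustrates the shape of the data the question refers to.
# `input_embeddings` is a list of per-image feature matrices from the vision
# encoder, and `input_embedding_ranges` gives the [start, end) token positions
# where each matrix would be spliced into the prompt's embedding sequence.
import numpy as np
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:23333/v1", api_key="none")

# Features from a separately deployed vision encoder; the shape here is made up.
image_features = np.random.rand(256, 6144).astype(np.float32)

resp = client.chat.completions.create(
    model="InternVL2-26B",  # assumed served model name
    messages=[{"role": "user", "content": "Describe the image."}],
    extra_body={  # hypothetical fields -- not part of the real REST API
        "input_embeddings": [image_features.tolist()],
        "input_embedding_ranges": [[1, 1 + image_features.shape[0]]],
    },
)
print(resp.choices[0].message.content)
```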
The conversion isn't done yet; I'm still testing, and this question came to mind during testing. I looked at the relevant lmdeploy source code, and there is no entry point for passing the two parameters `input_embeddings` and `input_embedding_ranges`. I'll finish the conversion first and benchmark the vision model's performance.
@nv-guomingz Wow, that looks great