Image transformation for InternVL-1.5

Open ruifengma opened this issue 1 year ago • 1 comment

I found that the load_image function in the example on the Hugging Face repo (transformers-based) applies an image transformation step, but the gradio_web_server example does not seem to do any image processing (and the performance is still fine). So my question is: is there any research comparing performance with and without this image processing? Thanks in advance.

May 13 '24 01:05 ruifengma

Hello, gradio_web_server does in fact apply an image transformation as well, but the code is a little different. There, the image transformation is performed by CLIPImageProcessor, which is aligned with the transformation in load_image.

See this line of code for more detail: https://github.com/OpenGVLab/InternVL/blob/main/internvl_chat/internvl/serve/model_worker.py#L80

May 16 '24 05:05 czczup
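
To illustrate the reply above, here is a minimal pure-Python sketch of why the two paths can give the same per-pixel values. It is not the code from either example. It assumes that load_image ends with a ToTensor-style scaling followed by normalization with the ImageNet mean and std, and that the CLIPImageProcessor is configured with the same mean and std and a rescale factor of 1/255. The function names `torchvision_style`, `clip_processor_style`, and `max_abs_diff` are hypothetical helpers, and resizing, cropping, and tiling are not covered.

```python
# Illustrative sketch only: pure-Python stand-ins, not the InternVL code.
# Assumption: both pipelines use the ImageNet mean/std below.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def torchvision_style(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Mimic ToTensor() then Normalize(mean, std) on one RGB pixel (0-255)."""
    scaled = [c / 255.0 for c in pixel]  # ToTensor-style scaling to [0, 1]
    return [(s - m) / d for s, m, d in zip(scaled, mean, std)]


def clip_processor_style(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD,
                         rescale_factor=1 / 255):
    """Mimic a processor that rescales and then normalizes one RGB pixel."""
    rescaled = [c * rescale_factor for c in pixel]  # rescale step
    return [(r - m) / d for r, m, d in zip(rescaled, mean, std)]  # normalize step


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


pixel = (128, 64, 255)  # arbitrary example pixel
diff = max_abs_diff(torchvision_style(pixel), clip_processor_style(pixel))
print(diff < 1e-6)  # prints True: the two styles agree up to float rounding
```

If the processor were configured with a different mean or std, the outputs would differ, so checking the processor's configuration against the values used in load_image is the relevant comparison.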