Update streamlit implementation for MiniCPM-V 2.6
Compared with the streamlit implementation of 2.5, this code implementation can better play the new multi-modal capabilities of 2.6:
-
The application supports the upload and processing of text, single image, multiple images and videos, and can process different types of input according to the mode selected by the user.
-
Video frame extraction and encoding: In video mode, frames are extracted from the uploaded video through the decord library and uniformly sampled so that the model can process and generate responses. More detailed and clear variables and annotations. Convenient for learning and use
-
File upload and processing: Support users to upload pictures and videos, and perform corresponding processing according to different modes, such as displaying pictures in single picture mode, displaying multiple pictures in multi-picture mode, and processing video frames in video mode. You can switch back and forth between different media.
-
Tip: You can use the command
streamlit run ./web_demo_streamlit-minicpmv2_6.py --server.maxUploadSize 1024to adjust the maximum upload size to 1024MB or larger files. The default 200MB limit of Streamlit's file_uploader component might be insufficient for video-based interactions. Adjust the size based on your GPU memory usage.