arcadia
Support chatting about images with multimodal APIs/models
- When a user uploads an image, call a multimodal service (qwen-vl) to identify it (maybe with a documentloader) and embed the description message into the vectorstore with extra image info.
- Based on the user's question in chat, we should return a special image reference
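
The two steps above could be sketched roughly as follows. This is only a minimal illustration, not arcadia's implementation: `describe_image` is a hypothetical stand-in for the qwen-vl call and returns canned text, `embed` is a toy bag-of-words embedding rather than a real embedding model, `ImageVectorStore` is an assumed in-memory store, and the `image://` reference format is invented for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def describe_image(image_path: str) -> str:
    """Hypothetical placeholder for the multimodal call (e.g. qwen-vl)."""
    canned = {
        "cat.png": "a small orange cat sleeping on a sofa",
        "chart.png": "a bar chart of monthly sales revenue",
    }
    return canned.get(image_path, "an unidentified image")


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class ImageVectorStore:
    """Assumed in-memory store: description vector plus extra image info."""
    entries: list = field(default_factory=list)

    def add_image(self, image_path: str) -> None:
        # Step 1: describe the uploaded image, then embed the description
        # and keep the image info alongside it as metadata.
        description = describe_image(image_path)
        self.entries.append({
            "vector": embed(description),
            "description": description,
            "metadata": {"type": "image", "path": image_path},
        })

    def search(self, question: str):
        # Step 2: match the chat question against stored descriptions and
        # return a special image reference (format is an assumption).
        query = embed(question)
        best = max(self.entries, key=lambda e: cosine(query, e["vector"]), default=None)
        if best is None or cosine(query, best["vector"]) == 0.0:
            return None
        return {
            "image_ref": f"image://{best['metadata']['path']}",
            "description": best["description"],
        }


store = ImageVectorStore()
store.add_image("cat.png")
store.add_image("chart.png")
result = store.search("show me the cat on the sofa")
```

Keeping the image path in the entry's metadata is what lets the chat layer return a reference to the original image instead of only the generated description; a real vectorstore would carry the same information in its document metadata.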