Does qwen2_vl_vae work?

Open • lian700 opened this issue 5 months ago • 0 comments

Great work! It's a very impressive and capable multimodal model.

I was looking through the model files and noticed an implementation of qwen2_vl_vae. However, I couldn't find any corresponding experimental results or any mention of this VAE component in the Mantis paper.

I'm very interested in this aspect of the work. Could you please clarify whether qwen2_vl_vae is part of ongoing or planned exploration? Any information you can provide on its purpose and performance would be greatly appreciated.

Thank you for your time and for sharing your excellent research with the community.

Aug 06 '25 11:08 lian700