Mantis
Does qwen2_vl_vae work?
Great work! It's a very impressive and capable multimodal model.
I was looking through the model files and noticed an implementation of qwen2_vl_vae. However, I couldn't find any corresponding experimental results or any mention of this VAE component in the Mantis paper.
I'm very interested in this aspect of the work. Could you please clarify whether qwen2_vl_vae is part of an ongoing or planned exploration? Any information you can provide on its purpose and performance would be greatly appreciated.
Thank you for your time and for sharing your excellent research with the community.