papasani\mohan/srinivas
+1. I also need to convert the model to ONNX so I can use it with OpenCV. @zylo117, could you kindly help here? It would be a huge help!
You need to install the CPU build of torch and set the device map to CPU when loading the model, @wenli135.
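A minimal sketch of that suggestion, assuming the model is loaded through Hugging Face `transformers` (the comment does not say which loader is used). The helper names `cpu_load_kwargs` and `load_on_cpu` and the `model_id` argument are hypothetical; `device_map="cpu"` also requires the `accelerate` package.

```python
# CPU-only PyTorch wheels can be installed with:
#   pip install torch --index-url https://download.pytorch.org/whl/cpu


def cpu_load_kwargs():
    """Keyword arguments asking the loader to place every weight on the CPU."""
    return {"device_map": "cpu"}


def load_on_cpu(model_id):
    """Load a causal LM on the CPU.

    model_id is a placeholder: pass whatever checkpoint name or local path
    you are actually using. Some checkpoints may need extra arguments.
    """
    # Imported lazily so cpu_load_kwargs() works without transformers installed.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_id, **cpu_load_kwargs())
    return model.eval()  # inference mode: disables dropout and similar layers
```

Calling `load_on_cpu("<your-checkpoint>")` would then keep the whole model off the GPU; inputs passed to it should be left on the CPU as well.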
Please make this a priority, @simonJJJ.
@simonJJJ, otherwise, can you tell us how to increase throughput on qwen-vl-chat-int4? Any optimization techniques, please?
Same here. @code10086web, kindly open-source the inference code. @iFighting @wjf5203
Hi @code10086web, I ran your script on my own inputs and compared the output with the Gradio demo on those same inputs. I am finding slightly different results. @iFighting @wjf5203