Muzhi Zhu
Hello, our method is training-free; we rely on the pre-trained models SAM and DINOv2.
In forward matching, the points in Pr are 2D. That is, each point has x and y coordinates representing its position in the image space and corresponds...
Hi, thanks for your interest. We are cleaning up the code and will release it as soon as it is ready.
+1
It seems to work; I tried it on a 4090.
python -m bitsandbytes XGPU-lite: L-229:Client configuration: use_uma:1, compute_schedule_mode:4, need_launch_kernel_admission:0, time_slice_spin_or_cv:1, enable_heart_beat:0, enable_monitor:0. XGPU-lite: L-163:func: cuInit, pid: 22366, tid: 22366, flags: 0 XGPU-lite: L-163:func: cuInit, pid: 22366, tid: 22366, flags: 0...
Thanks for your reply. BTW, does this demo support box input?
@Kuangdd01 Thanks again! One follow-up question: Is adding the fake audio input strictly necessary? It seems like, based on the official inference code, it's also possible to not insert an...
Thanks for the explanation! I’ll check the outputs with and without fake inputs if I have time and share the results here.