huihui1999 comments


CUDA Out Of Memory Error on consecutive inferences

I set `os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:4096'` following https://blog.csdn.net/MirageTanker/article/details/127998036, and it works. A smaller value, such as 1024, is better.
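A minimal sketch of that setting, assuming the variable is set at the very top of the script before `torch` is imported, so the CUDA caching allocator sees it when it initializes. The helper name `set_max_split_size` is only illustrative, and the best value depends on the model and GPU.

```python
import os


def set_max_split_size(mb: int) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF so cached blocks larger than `mb` MB are not split.

    Hypothetical helper for illustration; call it before the first CUDA allocation.
    """
    value = f"max_split_size_mb:{mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


# 1024 is the smaller value suggested in the comment above; 4096 also worked.
set_max_split_size(1024)

# Import torch only after the variable is set, e.g.:
# import torch
```

Setting the variable in the shell (`PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:1024 python infer.py`, where `infer.py` is a placeholder script name) should have the same effect.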

Poor performance of Mask2Former based on coco-stuff-164k

I also encountered the same problem with Mask2Former + coco164k + BEiTv2-L + 80k pretraining. Over the first 8000 iterations, mIoU/mAcc is 37.69 | 49.31. I am now trying setting samples-per-gpu to 2.

freeze the non-temporal parameters

The ckpt only provides DiT weights. Was this ckpt trained with text/VAE frozen? When will the fully trained weights be released?

Data

Hi, thanks for the code you provided. The idea of diffusion to multiple experts is very enlightening to me. I want to replicate the results; could I have a copy of your train & test images...

Inference

For RVOS/VIS, does inference need the ground-truth mask of the first frame?

RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8

You can skip causal_conv1d (comment out the import and set causal_conv1d to None). The code will then fall back to nn.Conv1d. Finally, you need to make the transposed tensor contiguous by calling .contiguous() on it.
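A rough sketch of what that fallback path can look like, assuming a depthwise `nn.Conv1d` with left-context padding. The class name `CausalConvFallback`, the tensor shapes, and the kernel size are made up for illustration and are not taken from the repository.

```python
import torch
import torch.nn as nn


class CausalConvFallback(nn.Module):
    """Depthwise causal convolution using plain nn.Conv1d (no fused kernel)."""

    def __init__(self, channels: int, kernel_size: int = 4):
        super().__init__()
        # padding=kernel_size-1 pads both ends; the extra trailing outputs are trimmed in forward().
        self.conv1d = nn.Conv1d(
            channels, channels, kernel_size,
            groups=channels, padding=kernel_size - 1,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, channels)
        seq_len = x.shape[1]
        # transpose() only returns a strided view, so make the memory contiguous explicitly.
        x = x.transpose(1, 2).contiguous()  # (batch, channels, seq_len)
        y = self.conv1d(x)[..., :seq_len]   # keep the first seq_len outputs (causal part)
        return y.transpose(1, 2)            # back to (batch, seq_len, channels)


block = CausalConvFallback(channels=8)
out = block(torch.randn(2, 5, 8))  # output shape: (2, 5, 8)
```

This path is slower than the fused kernel but avoids the stride requirement from the error message.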

Initial Capital: the drop-cap letter at the start of the first paragraph is recognized incorrectly

https://ieeexplore.ieee.org/document/6296526

Initial Capital: the drop-cap letter at the start of the first paragraph is recognized incorrectly

[Geoffrey_Hinton-Deep_Neural_Networks_for_Acoustic_Modeling_in_Speech_Recognition_The_Shared_Views_of_Four_Research_Groups.pdf](https://github.com/user-attachments/files/18253232/Geoffrey_Hinton-Deep_Neural_Networks_for_Acoustic_Modeling_in_Speech_Recognition_The_Shared_Views_of_Four_Research_Groups.pdf)

Initial Capital: the drop-cap letter at the start of the first paragraph is recognized incorrectly

OK, so does this also happen with OCR? This was in auto mode.

RVOS inference code

Which similarity measure between object queries did you use when evaluating RVOS in your appendix? Can I use simple cosine distance?
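To make the question concrete, this is the kind of plain cosine similarity I have in mind, written in pure Python. The function names and the toy query vectors are placeholders, and this is not necessarily the measure used in the appendix.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float], eps: float = 1e-8) -> float:
    """Cosine similarity between two query embeddings of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_a * norm_b, eps)


def similarity_matrix(
    queries_a: Sequence[Sequence[float]],
    queries_b: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Pairwise similarities: entry [i][j] compares queries_a[i] with queries_b[j]."""
    return [[cosine_similarity(a, b) for b in queries_b] for a in queries_a]


# Toy example: two queries from one frame against two queries from the next frame.
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_t1 = [[0.0, 2.0], [3.0, 0.0]]
sim = similarity_matrix(frame_t, frame_t1)
```

Cosine distance would then be `1 - sim[i][j]`, and queries could be matched across frames by picking the highest similarity per row.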