Hi, thanks for this wonderful work. I am confused about the CrossAttention module. In the code of XBERT, when layer_num >= 6 the text_encoder switches to cross-attention; however, it will do...
Thanks for this great work and the open-sourced repo. I want to know how to visualize the results after inference, like those you show at the end of the repo.
Thanks for this great work. I am wondering whether you simply randomly initialized the adaption prompts, since you used zero-init attention in the L layers. I also think the multi-modal...
Thanks for this wonderful work. I have a question about the attached picture. When sr_ratio > 1, you apply the conv first, then add the original v, and apply the attention function last. But...
Thanks for this wonderful work. It is very inspiring. I am confused about how to generate the heat-map shown in your paper. Looking forward to your reply at...
Thanks for this inspiring work. I am confused about why you chose Group Normalization for normalization. Did you try other methods like Batch Normalization? I think Batch Normalization...
Thanks for this great work. I'm wondering how to run inference on a single 8 GB GPU, like the example shown in the README. I tried it on my RTX 2080 Ti with...
Thanks for this great work and the open-sourced repo. I haven't worked on segmentation tasks before, and I would like to know how to visualize the dense segmentation image like you show...
Has anyone successfully downloaded the Flickr30k dataset provided in this article? I used azcopy, but it didn't work. I really need this dataset for some research. If you can, please...