About ViT encoder output consistency during inference?

Open xiao2mo opened this issue 3 years ago • 4 comments

Hi, has the ViT implementation been fully tested for consistency? I've found that the encoder output is totally different.

Mar 16 '23 02:03 xiao2mo

We have tested the consistency of ViT. You can refer to the inference example to check that your usage is correct.
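For anyone comparing outputs themselves, here is a minimal sketch of a numerical consistency check. It assumes both encoder outputs have already been dumped as flat lists of floats; the helper names `max_abs_diff` and `outputs_match` and the tolerance values are hypothetical and not part of LightSeq or HuggingFace. Half-precision inference usually needs looser tolerances than fp32.

```python
import math
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flattened outputs."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths; check shapes and padding")
    return max((abs(x - y) for x, y in zip(reference, candidate)), default=0.0)


def outputs_match(
    reference: Sequence[float],
    candidate: Sequence[float],
    rtol: float = 1e-3,  # assumed tolerance, tune for your precision
    atol: float = 1e-3,  # assumed tolerance, tune for your precision
) -> bool:
    """True if every element agrees within the relative or absolute tolerance."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(x, y, rel_tol=rtol, abs_tol=atol)
        for x, y in zip(reference, candidate)
    )


if __name__ == "__main__":
    # Made-up numbers standing in for two flattened encoder outputs.
    ref = [0.5, -1.25, 3.0]
    close = [0.5004, -1.2502, 3.001]
    print(max_abs_diff(ref, close), outputs_match(ref, close))
```

If the maximum difference is large rather than at rounding level, the cause is more likely preprocessing or weight export than kernel precision.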

Mar 16 '23 11:03 zjersey

Can I have your WeChat? I've run into some problems with the OpenAI ViT transform. Is the ViT you mentioned the modeling_vit in HuggingFace? It seems that the encoder implementation is the same as BERT's in LightSeq. In other words, why are self_attention and ffn_add_norm in vit_encoder.cc.cu and bert_encoder.cc.cu identical?

Mar 16 '23 11:03 xiao2mo

Yes, it's HuggingFace's modeling_vit.

ViT and BERT have the same structure except for the embedding layer.
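A toy sketch of that point, not the LightSeq code: two different embedding front-ends feed one shared encoder function. All names here are hypothetical. The encoder body is a placeholder that only applies layer normalization, and the ViT-style embedding omits the linear projection, class token, and position embeddings that a real model has.

```python
from typing import Dict, List

Vector = List[float]


def layer_norm(vec: Vector, eps: float = 1e-5) -> Vector:
    """Normalize one vector to zero mean and (roughly) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / (var + eps) ** 0.5 for x in vec]


def shared_encoder(hidden: List[Vector]) -> List[Vector]:
    # Placeholder for the shared stack. A real layer would apply
    # self-attention and a feed-forward block, each with add & norm.
    return [layer_norm(v) for v in hidden]


def bert_style_embedding(token_ids: List[int], table: Dict[int, Vector]) -> List[Vector]:
    """Look up one vector per token id."""
    return [list(table[t]) for t in token_ids]


def vit_style_embedding(image: List[List[float]], patch: int) -> List[Vector]:
    """Cut a 2D image into non-overlapping patches and flatten each one."""
    patches = []
    for r in range(0, len(image), patch):
        for c in range(0, len(image[0]), patch):
            patches.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return patches


if __name__ == "__main__":
    table = {0: [1.0, 2.0, 3.0, 4.0], 1: [0.0, 1.0, 0.0, 1.0]}
    image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
    text_out = shared_encoder(bert_style_embedding([0, 1], table))
    image_out = shared_encoder(vit_style_embedding(image, patch=2))
    print(len(text_out), len(image_out))
```

Once the inputs are sequences of hidden vectors, the same encoder code can serve both models, which is consistent with the two encoder files sharing self_attention and ffn_add_norm.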

Mar 28 '23 16:03 zjersey

Hello,

Mar 28 '23 16:03 Anychnn