TensorRT-LLM
Is there any feature related to GPT-like models that can be applied to BERT-like models?
They share some common optimization ideas, such as fusing the multi-head attention kernel and quantizing the model to INT8 or FP8.
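To make the quantization idea concrete, here is a minimal, generic sketch of symmetric per-tensor INT8 weight quantization. This is illustrative only and does not reflect TensorRT-LLM's actual kernels or calibration pipeline; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Symmetric per-tensor INT8 quantization: the scale maps the
    # largest absolute weight to 127; values are rounded and clipped.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original FP32 weights.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
# Per-element reconstruction error is bounded by scale / 2.
```

Because the same linear layers appear in both encoder (BERT-like) and decoder (GPT-like) architectures, this kind of weight quantization applies to either model family.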
Hi @byshiue, a related question: does BertAttentionPlugin also use FlashAttention2, as GptAttention does?
Yes.