Fix random attention for PyTorch's BigBird/Pegasus BigBird
Fixes https://github.com/huggingface/transformers/issues/23055
What does this PR do?
Adds control over the usage of BigBird's random attention based on the current mode (training/eval).
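The idea can be sketched with a toy module: random block indices are only sampled while the module is in training mode, and a fixed, deterministic selection is used in eval mode. This is a minimal illustration, not the actual `transformers` implementation; `ToyBigBirdAttention` and its method names are hypothetical.

```python
import torch
import torch.nn as nn


class ToyBigBirdAttention(nn.Module):
    """Hypothetical sketch: gate random block selection on self.training."""

    def __init__(self, num_blocks: int = 8, num_rand_blocks: int = 3):
        super().__init__()
        self.num_blocks = num_blocks
        self.num_rand_blocks = num_rand_blocks

    def rand_block_indices(self) -> torch.Tensor:
        if self.training:
            # Training: sample a fresh set of random blocks each call.
            return torch.randperm(self.num_blocks)[: self.num_rand_blocks]
        # Eval: deterministic fallback so inference is reproducible.
        return torch.arange(self.num_rand_blocks)


attn = ToyBigBirdAttention()
attn.eval()
print(attn.rand_block_indices().tolist())  # deterministic in eval mode
```

Calling `.train()` or `.eval()` on the module (as `nn.Module` already provides) is then enough to switch behavior, without any extra configuration flag.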
Who can review?
@sanchit-gandhi @ydshieh
Hi @Bearnardd Thank you for the PR.
I have one question: why is `_bigbird_block_rand_mask_with_head` not modified in this PyTorch BigBird file?
Hi @sanchit-gandhi! I have removed the static method as I think it is the best approach.
> Hi @Bearnardd Thank you for the PR. I have one question: why is `_bigbird_block_rand_mask_with_head` not modified in this PyTorch BigBird file?
Thanks for the comment! To be honest, I am not sure if I understand you correctly, since from what I can see this function is updated. Could you elaborate on what exactly is missing?
Sorry, my bad. You are right :-)
cc @sgugger
I have pushed the changes @sgugger :)