Fix random attention for PyTorch's BigBird/Pegasus BigBird
Fixes https://github.com/huggingface/transformers/issues/23055
What does this PR do?
Adds control over the usage of BigBird's random attention based on the current mode (training/eval).
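The idea can be sketched with a toy module: random block indices are only sampled while the module is in training mode, and a fixed, deterministic selection is used in eval mode. This is a minimal illustration, not the actual `transformers` implementation; `ToyBigBirdAttention` and its method names are hypothetical.

```python
import torch
import torch.nn as nn


class ToyBigBirdAttention(nn.Module):
    """Hypothetical sketch: gate random block selection on self.training."""

    def __init__(self, num_blocks: int = 8, num_rand_blocks: int = 3):
        super().__init__()
        self.num_blocks = num_blocks
        self.num_rand_blocks = num_rand_blocks

    def rand_block_indices(self) -> torch.Tensor:
        if self.training:
            # Training: sample a fresh set of random blocks each call.
            return torch.randperm(self.num_blocks)[: self.num_rand_blocks]
        # Eval: deterministic fallback so inference is reproducible.
        return torch.arange(self.num_rand_blocks)


attn = ToyBigBirdAttention()
attn.eval()
print(attn.rand_block_indices().tolist())  # deterministic in eval mode
```

Calling `.train()` or `.eval()` on the module (as `nn.Module` already provides) is then enough to switch behavior, without any extra configuration flag.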
Who can review?
@sanchit-gandhi @ydshieh
Hi @Bearnardd Thank you for the PR.
I have one question: why is `_bigbird_block_rand_mask_with_head` not modified in this PyTorch BigBird file?
Hi @sanchit-gandhi! I have removed the static method as I think it is the best approach.
> Hi @Bearnardd Thank you for the PR. I have one question: why is `_bigbird_block_rand_mask_with_head` not modified in this PyTorch BigBird file?
Thanks for the comment! To be honest, I am not sure if I understand you correctly, since from what I can see this function is updated. Could you elaborate on what exactly is missing?
Sorry, my bad. You are right :-)
cc @sgugger
I have pushed the changes @sgugger :)