optimum-graphcore
BART: disable returning kv states since there exists an on device cache
What does this PR do?
Given that the kv cache is on device, there is no need to return past_key_values. However, disabling the return requires overriding the forward methods of BartDecoder and BartDecoderLayer.
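A minimal sketch of the idea, using toy stand-in classes rather than the real BartDecoder or BartDecoderLayer: the names `ToyDecoderLayer`, `CachedToyDecoderLayer`, and `_kv_cache` are hypothetical, and the arithmetic is a placeholder for attention. It only illustrates how an overridden forward can keep the kv state internally and return hidden states alone.

```python
# Toy stand-ins for illustration; these are NOT the transformers or
# optimum-graphcore classes, and the "attention" here is placeholder arithmetic.
class ToyDecoderLayer:
    """Mimics a layer whose forward returns (hidden_states, present_key_value)."""

    def forward(self, hidden_states, past_key_value=None):
        past = past_key_value if past_key_value is not None else []
        # Pretend the kv entry produced at this step is the input itself.
        present_key_value = past + [hidden_states]
        new_hidden = hidden_states + len(present_key_value)
        return new_hidden, present_key_value


class CachedToyDecoderLayer(ToyDecoderLayer):
    """Keeps the kv state internally, so forward returns hidden states only."""

    def __init__(self):
        self._kv_cache = []  # hypothetical stand-in for the on-device cache

    def forward(self, hidden_states):
        # Reuse the parent computation, but store the kv state instead of returning it.
        new_hidden, self._kv_cache = super().forward(hidden_states, self._kv_cache)
        return new_hidden


layer = CachedToyDecoderLayer()
first = layer.forward(10)
second = layer.forward(20)
```

The caller never sees the kv state, so nothing has to be copied back between decoding steps. The cost is that the subclass must track the parent's forward logic.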
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you make sure to update the documentation with your changes?
- [ ] Did you write any new necessary tests?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.