How can we get attention weights from example sequence and structure?

Open what-is-what opened this issue 1 year ago • 2 comments

How can we get attention weights from an example sequence and structure? Unlike ESM2, the transformer blocks have no argument for returning attention weights.

what-is-what avatar Oct 08 '24 01:10 what-is-what

also interested in this feature, if available!

gelnesr avatar Oct 20 '24 05:10 gelnesr

Unfortunately, PyTorch flash attention doesn't let you do this. You'll have to hack it in; we'll look into supporting it officially. Here's where the attention is computed. You'll just have to use a PyTorch implementation of attention to expose the attention matrix.

https://github.com/evolutionaryscale/esm/blob/39a3a6cb1e722347947dc375e3f8e2ba80ed8b59/esm/layers/attention.py#L62-L75
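A minimal sketch of the suggested hack: replace the `F.scaled_dot_product_attention` call at the linked line with an equivalent manual implementation that materializes the softmax matrix. The function name `attention_with_weights` and the tensor layout are assumptions for illustration, not part of the esm codebase.

```python
import torch
import torch.nn.functional as F

def attention_with_weights(q, k, v):
    """Manual (non-flash) scaled dot-product attention that also
    returns the attention matrix.

    q, k, v: (batch, heads, seq_len, head_dim). Hypothetical drop-in
    for F.scaled_dot_product_attention (no mask/dropout handled here).
    """
    scale = q.size(-1) ** -0.5
    # (batch, heads, seq_len, seq_len) attention scores
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale
    attn = torch.softmax(scores, dim=-1)
    out = torch.matmul(attn, v)
    return out, attn
```

Note that materializing `attn` costs O(seq_len^2) memory per head, which is exactly what flash attention avoids, so expect higher memory use on long sequences.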

ebetica avatar Nov 11 '24 21:11 ebetica