
Which layer's activation is used?

Open • iyupan opened this issue 1 year ago • 1 comment

Hello,

This is great work! Which layer do the analyzed activations come from? Is it the last layer?

iyupan, Feb 28 '24 06:02

This is covered in Section 2.1 of the paper, "Which Layers?":

In LLaMA2-7B, massive activations first appear in layer 2 and remain nearly constant values until layer 30. Intriguingly, for LLaMA2-7B and 13B, massive activations emerge very rapidly from one layer of computation, e.g., layer 2 and layer 4 respectively. This means that they do not emerge as a result of gradual accumulation through many layers, and are caused by a rather different mechanism.

QiaoranC, Feb 28 '24 07:02
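
For readers who want to check the "which layer" question on their own model, here is a minimal sketch of the idea in the quoted passage: scan the hidden state of each layer and report the first one where a single activation dwarfs the rest. This is not the repository's code. The function names, the toy data, and the two thresholds (`min_magnitude`, `min_ratio`) are assumptions chosen for illustration, and layers are indexed from 0 here.

```python
from statistics import median


def layer_stats(hidden_states):
    """Return (max |activation|, median |activation|) for each layer.

    hidden_states: list over layers; each entry is a flat list of floats
    (hypothetical input format, e.g. one token's hidden state per layer).
    """
    stats = []
    for layer in hidden_states:
        mags = [abs(x) for x in layer]
        stats.append((max(mags), median(mags)))
    return stats


def first_massive_layer(hidden_states, min_magnitude=100.0, min_ratio=1000.0):
    """Index of the first layer whose largest activation is both large in
    absolute terms and far above that layer's median magnitude.

    Both thresholds are illustrative assumptions. Returns None if no
    layer qualifies.
    """
    for idx, (top, med) in enumerate(layer_stats(hidden_states)):
        if top >= min_magnitude and top >= min_ratio * med:
            return idx
    return None


# Synthetic hidden states: a large value appears abruptly at index 2
# rather than building up gradually across layers.
toy = [
    [0.5, -0.3, 0.8, -0.1, 0.2],
    [0.4, -0.6, 1.2, -0.2, 0.3],
    [0.5, -0.4, 2500.0, -0.3, 0.6],
    [0.6, -0.5, 2480.0, -0.2, 0.4],
]
print(first_massive_layer(toy))
```

With a real model, the per-layer lists would come from the intermediate hidden states (for example, captured with forward hooks), flattened per layer before being passed in.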