
A question regarding the rotation matching pairs

Open Menace-Dragon opened this issue 1 year ago • 0 comments

SpinQuant is a follow-up work to QuaRot, but we have noticed that the two papers define the rotation matrix pairings differently. In QuaRot, first, an online Hadamard operation (of dimension head_dim) is applied before o_proj. Second, o_weight is fused with a Hadamard matrix (H) of the entire tensor dimension. Both are highlighted in the red box in the figure below.

image

In SpinQuant (figure below), the online Hadamard operation before o_proj is removed. In addition, o_weight is fused with a Hadamard matrix (H, now written as [image]) of dimension head_dim.

image

Why is this the case? Are the rotation matrix pairings in the two papers equivalent?
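
One way to probe the question numerically is a small NumPy sketch like the one below. It is not taken from either codebase: the names (`hadamard`, `W_o`, `R_head`, `H_heads`), the toy sizes, and the `y = x @ W_o` layout with heads concatenated along the last axis are all assumptions made for illustration. It checks that a full-dimension rotation and a per-head (block-diagonal) rotation each leave the o_proj output unchanged when the inverse is fused into the weight, and that the full Hadamard factors into a per-head part and an across-heads part.

```python
import numpy as np

def hadamard(n):
    """Normalized Sylvester Hadamard matrix (hypothetical helper); n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    assert H.shape[0] == n, "n must be a power of two"
    return H / np.sqrt(n)

rng = np.random.default_rng(0)
n_heads, head_dim, tokens = 4, 8, 5          # toy sizes, assumed
d = n_heads * head_dim

x = rng.standard_normal((tokens, d))         # input to o_proj, heads concatenated
W_o = rng.standard_normal((d, d))            # assumed convention: y = x @ W_o
y_ref = x @ W_o

# Pairing A: rotate the activation with a full-dimension Hadamard,
# and fuse its transpose into the o_proj weight.
H_full = hadamard(d)
y_a = (x @ H_full) @ (H_full.T @ W_o)

# Pairing B: rotate each head separately (block-diagonal I ⊗ H_head_dim),
# and fuse the matching transpose into the o_proj weight.
R_head = np.kron(np.eye(n_heads), hadamard(head_dim))
y_b = (x @ R_head) @ (R_head.T @ W_o)

ok_a = np.allclose(y_a, y_ref)
ok_b = np.allclose(y_b, y_ref)

# The full Sylvester Hadamard splits into a per-head factor and an
# across-heads factor: (I ⊗ H_head_dim)(H_n_heads ⊗ I) = H_d.
H_heads = np.kron(hadamard(n_heads), np.eye(head_dim))
kron_ok = np.allclose(R_head @ H_heads, H_full)

print(ok_a, ok_b, kron_ok)
```

In this toy setting both pairings reproduce the unrotated output in exact arithmetic, so they are equivalent as far as the full-precision function is concerned. The sketch says nothing about how the two choices behave once activations and weights are quantized, nor about which factor each paper applies online versus fuses into weights.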

Menace-Dragon, Aug 08 '24 05:08