
Maybe wrong positional embedding?

Open • Felix-Zhenghao opened this issue 8 months ago • 1 comment

In the implementation of DAT++, the rpe_table is initialized as:

self.rpe_table = nn.Parameter(
    torch.zeros(self.n_heads, self.q_h * 2 - 1, self.q_w * 2 - 1)
)

Shouldn't the table be torch.zeros(self.n_heads, self.q_h * 2 + 1, self.q_w * 2 + 1)?

For instance, q_h and q_w can both be 56. Then the (x, y) displacement lies within the square $\{(x,y) \mid x \in [-1,1],\ y \in [-1,1]\}$. Each range of length 1 is divided into 56 cells, so each axis (length 2) has 2 x 56 + 1 = 113 vertices. Therefore, the total number of vertices on the square is 113 x 113 rather than 111 x 111.

If we use torch.zeros(self.n_heads, self.q_h * 2 - 1, self.q_w * 2 - 1), we are ignoring the boundary of the 2D square.
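
To make the counting concrete, here is a small plain-Python sketch. It is not taken from the DAT++ code; `grid_vertices` and `index_offsets` are hypothetical helper names used only for illustration. The first helper counts vertices under the assumption made in this issue (displacements in [-1, 1], with each unit-length range split into q_h cells). The second counts the distinct integer offsets i - j between two positions on a q_h-long grid, which is one possible reading under which a 2 * q_h - 1 table size would arise; whether that is the intended reading in DAT++ is an assumption on my side.

```python
from fractions import Fraction


def grid_vertices(n_cells_per_unit, lo=-1, hi=1):
    """Vertices of [lo, hi] when every unit-length range is split into n cells."""
    step = Fraction(1, n_cells_per_unit)
    count = (hi - lo) * n_cells_per_unit + 1  # both endpoints are included
    return [lo + i * step for i in range(count)]


def index_offsets(n):
    """Distinct differences i - j for grid indices i, j in range(n)."""
    return sorted({i - j for i in range(n) for j in range(n)})


q_h = 56  # example size from this issue

vertices = grid_vertices(q_h)
offsets = index_offsets(q_h)

print(len(vertices), vertices[0], vertices[-1])  # 113 -1 1
print(len(offsets), offsets[0], offsets[-1])     # 111 -55 55
```

So the two sizes differ exactly by whether the endpoints -1 and 1 of the displacement range are counted as table entries.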

@Vladimir2506 am I wrong anywhere? Thanks in advance for any answer!

Felix-Zhenghao • May 12 '25 04:05