
Possible bug when training on a 240×240 dataset with window_size 15

Open hujb48 opened this issue 2 years ago • 3 comments

Hi, thanks for the great work on solving the SR problem with the multi-scale QKV attention module! I am training the model on my own dataset, where the LR images are 60×60 and the HR images are 240×240, so I had to change 'window_size' in options/train/xxx.yml to 15. After this change I ran into a problem in class OCAB(), at about line 394 of hat_arch.py:

    def forward(self, x, x_size, rpi):
        ....
        kv_windows = self.unfold(kv)
        ....

kv_windows has shape [4, 174240, 9], so nw*b for k and v is 36, which does not match the 64 for q.
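The reported numbers can be reproduced with plain arithmetic. The sketch below uses the output-size formula documented for nn.Unfold (with dilation 1) rather than calling PyTorch. The helper name `unfold_positions` is hypothetical, the batch size of 4 is read off the leading dimension of the reported shape, and `embed_dim = 180` is an assumption chosen because it reproduces the 174240 in the reported shape.

```python
def unfold_positions(size, kernel, stride, padding):
    # Sliding-window positions along one axis (nn.Unfold formula, dilation=1).
    return (size + 2 * padding - kernel) // stride + 1

lr_size = 60          # LR image side, from the report
window_size = 15
overlap_ratio = 0.5
batch = 4             # assumed from the leading dimension of the reported shape
embed_dim = 180       # assumption: chosen because it reproduces 174240

# Original formulas from OCAB, as quoted in the report.
overlap_win_size = int(window_size * overlap_ratio) + window_size  # 7 + 15 = 22
padding = (overlap_win_size - window_size) // 2                     # 7 // 2 = 3

q_windows = (lr_size // window_size) ** 2                           # 4 * 4 = 16
kv_windows = unfold_positions(lr_size, overlap_win_size, window_size, padding) ** 2

print(q_windows * batch, kv_windows * batch)   # 64 36
print(2 * embed_dim * overlap_win_size ** 2)   # 174240 (k and v stacked)
```

Truncating 7.5 to 7 and then halving it to 3 leaves the padded image one pixel short of fitting a fourth overlapping window per axis, which is where the 9 versus 16 windows come from.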

I found that the problem comes from nn.Unfold(): when the window size is odd, the kernel size and padding can be changed as follows:

    class OCAB(nn.Module):
        ....
        def forward(self, x, x_size, rpi):
            ....
            # self.overlap_win_size = int(window_size * overlap_ratio) + window_size
            self.overlap_win_size = int(math.ceil(window_size * overlap_ratio)) + window_size
            ....
            # self.unfold = nn.Unfold(kernel_size=(self.overlap_win_size, self.overlap_win_size), stride=window_size, padding=(self.overlap_win_size-window_size)//2)
            self.unfold = nn.Unfold(kernel_size=(self.overlap_win_size, self.overlap_win_size), stride=window_size, padding=-(-(self.overlap_win_size-window_size)//2))
            ....

and

    class HAT(nn.Module):
        ....
        def calculate_rpi_oca(self):
            # window_size_ext = self.window_size + int(self.overlap_ratio * self.window_size)
            window_size_ext = self.window_size + int(math.ceil(self.overlap_ratio * self.window_size))
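As a rough check of the proposed change, the following sketch compares the original and the ceil-based formulas for a 60×60 input with window size 15. It only does the arithmetic, using the nn.Unfold output-size formula with dilation 1; `unfold_positions` and `ocab_geometry` are hypothetical helper names, not functions from hat_arch.py.

```python
import math

def unfold_positions(size, kernel, stride, padding):
    # Sliding-window positions along one axis (nn.Unfold formula, dilation=1).
    return (size + 2 * padding - kernel) // stride + 1

def ocab_geometry(window_size, overlap_ratio, use_ceil):
    # Returns (overlap_win_size, padding) for either variant of the formulas.
    ext = window_size * overlap_ratio
    if use_ceil:
        ext = int(math.ceil(ext))
        padding = -(-ext // 2)      # ceiling division, as in the proposed change
    else:
        ext = int(ext)
        padding = ext // 2          # floor division, as in the original code
    return window_size + ext, padding

lr_size, window_size = 60, 15
for use_ceil in (False, True):
    kernel, padding = ocab_geometry(window_size, 0.5, use_ceil)
    per_axis = unfold_positions(lr_size, kernel, window_size, padding)
    print(use_ceil, kernel, padding, per_axis ** 2)
# False 22 3 9
# True 23 4 16
```

With the ceil-based variant the unfold yields 16 windows per image, the same as the 16 non-overlapping query windows, so the shapes line up again.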

The model now trains normally, but I am not sure whether these changes are correct.

hujb48, Mar 25 '23 10:03

@hujb48 When overlap_ratio=0.5, the window size should ideally be divisible by 4; this is shown in the latest manuscript. Your modification does not seem to cause problems, but it slightly breaks the symmetry of the cross-attention, because the padding/cropping size is no longer the same on every side of the window.
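One way to read the divisibility advice: with the original formulas, the total extension is int(window_size * overlap_ratio) and the unfold padding is half of that, rounded down, so the extension only splits evenly when window_size * 0.5 is an even integer. The sketch below illustrates this reading; `extension_split` is a hypothetical helper, and the left/right interpretation assumes stride equal to window_size, as in the quoted nn.Unfold call.

```python
def extension_split(window_size, overlap_ratio):
    # Pixels added before and after a window by the original formulas.
    ext = int(window_size * overlap_ratio)   # total extension (truncated)
    pad = ext // 2                           # padding passed to nn.Unfold
    return pad, ext - pad                    # (before, after)

for ws in (12, 14, 15, 16):
    before, after = extension_split(ws, 0.5)
    print(ws, before, after, before == after)
# 12 3 3 True
# 14 3 4 False
# 15 3 4 False
# 16 4 4 True
```

Window sizes 12 and 16 are divisible by 4 and split evenly, while 14 and 15 leave one extra pixel on one side.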

chxy95, Mar 26 '23 10:03

@chxy95 Thanks for your prompt reply! So, to avoid breaking the symmetry of the cross-attention with a window size of 15, would it be better to use overlap_ratio = 1/3 or 2/3? Thanks again, and I look forward to your future work!

hujb48, Mar 26 '23 11:03

@hujb48 Using overlap_ratio=2/3 is OK. It extends the window by 10 pixels in total (15 * 2/3 = 10), i.e. 5 pixels on each side.
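The same arithmetic can be applied to the 2/3 setting with the unmodified formulas. This is only a sketch: it uses `Fraction(2, 3)` to keep the ratio exact, which is an assumption for illustration (a config file would hold a float), and it applies the nn.Unfold output-size formula with dilation 1 to the 60×60 input from the original report.

```python
from fractions import Fraction

lr_size = 60
window_size = 15
overlap_ratio = Fraction(2, 3)   # exact ratio, used here only for illustration

ext = int(window_size * overlap_ratio)        # total extension: 10
padding = ext // 2                            # 5 pixels on each side
overlap_win_size = window_size + ext          # 25

# nn.Unfold output-size formula along one axis (dilation=1).
kv_per_axis = (lr_size + 2 * padding - overlap_win_size) // window_size + 1
q_per_axis = lr_size // window_size

print(ext, padding, overlap_win_size)   # 10 5 25
print(kv_per_axis ** 2, q_per_axis ** 2)  # 16 16
```

Because the extension of 10 is even, the padding is the same on both sides and the number of key/value windows matches the number of query windows without changing the code.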

chxy95, Mar 26 '23 12:03