AudioMAE-pytorch
decoder's input parameters
Thank you for this great repo! I noticed in the decoder block that the third parameter is calculated by dividing the decoder embedding dimension by the number of heads in the encoder. Shouldn't it be divided by the number of heads in the decoder instead? Here’s the code:
self.decoder_blocks = nn.ModuleList([ SwinBlock(decoder_embed_dim, decoder_num_heads, decoder_embed_dim // **num_heads**, mlp_ratio * decoder_embed_dim, shifted=True, window_size=4, relative_pos_embedding=True)
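To illustrate why this might matter, here is a minimal sketch. It assumes the third argument is the per-head dimension and that the attention layer's inner width is `heads * head_dim`; I have not confirmed this against the `SwinBlock` implementation. The helper `attention_inner_dim` and the numeric values are hypothetical, not the repo's defaults.

```python
def attention_inner_dim(embed_dim: int, block_heads: int, divisor_heads: int) -> int:
    """Inner attention width, assuming inner_dim = heads * head_dim (an assumption)."""
    head_dim = embed_dim // divisor_heads  # the third argument passed to the block
    return block_heads * head_dim


# Hypothetical values, chosen only so that encoder and decoder head counts differ.
decoder_embed_dim = 512
num_heads = 12          # encoder heads
decoder_num_heads = 16  # decoder heads

# As written in the snippet: divide by the encoder's head count.
as_written = attention_inner_dim(decoder_embed_dim, decoder_num_heads, num_heads)

# Proposed: divide by the decoder's head count.
proposed = attention_inner_dim(decoder_embed_dim, decoder_num_heads, decoder_num_heads)

print(as_written, proposed)  # 672 512
```

Under these assumptions, the two versions agree only when `num_heads == decoder_num_heads`; otherwise the decoder's attention width no longer matches `decoder_embed_dim`.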
Thank you!