ESRT
What does `common.Scale(1)` mean?
```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self, init_value=1e-3):
        super().__init__()
        # Learnable scalar multiplier, initialized to init_value
        self.scale = nn.Parameter(torch.FloatTensor([init_value]))

    def forward(self, input):
        return input * self.scale
```
If `self.scale` is 1, does this layer do nothing? Why do we need this layer?
Is `self.scale` the learnable parameter λ_x in the paper?
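To make the question concrete, here is a minimal sketch (the `Scale` class is copied from above; tensor shapes are arbitrary). With `init_value=1` the forward pass is an identity at initialization, but `self.scale` is still an `nn.Parameter`, so it receives gradients and can move away from 1 during training:

```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self, init_value=1e-3):
        super().__init__()
        self.scale = nn.Parameter(torch.FloatTensor([init_value]))

    def forward(self, input):
        return input * self.scale

# With init_value=1 the layer is an identity at initialization
layer = Scale(init_value=1.0)
x = torch.ones(2, 3)
out = layer(x)
print(torch.allclose(out, x))  # output equals input while scale == 1

# But the scale is trainable: it accumulates gradients like any other parameter,
# so the optimizer can adjust it away from 1 during training.
out.sum().backward()
print(layer.scale.grad is not None)
```

So `Scale(1)` is a no-op only at the moment of initialization; its purpose is to let the network learn how strongly to weight that branch's output.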