
Did not multiply embedding weights by sqrt(d_model)

Open orena1 opened this issue 6 years ago • 4 comments

Hi, In this line: https://github.com/SamLynnEvans/Transformer/blob/37bf49224ccc0ab5a2c8cdb2c330ccd76628e57a/Embed.py#L12

I think you need to multiply the embedding by sqrt(d_model). (Screenshot from the paper attached.)
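A minimal sketch of what I mean, assuming PyTorch (the class and variable names here are illustrative, not the repo's exact code):

```python
import math
import torch
import torch.nn as nn

class Embedder(nn.Module):
    """Token embedding scaled by sqrt(d_model)."""
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.d_model = d_model
        self.embed = nn.Embedding(vocab_size, d_model)

    def forward(self, x):
        # Scale the embedding up so its magnitude is comparable to the
        # positional encoding that gets added to it afterwards.
        return self.embed(x) * math.sqrt(self.d_model)
```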

orena1 avatar Jul 23 '19 08:07 orena1

@orena1 Hi, the implementation also didn't share the embedding weights, right?

fabrahman avatar Aug 05 '19 02:08 fabrahman

@orena1 The code actually does have `* math.sqrt(self.d_model)` — it is in the forward method of the positional embedding class.

fabrahman avatar Aug 05 '19 21:08 fabrahman

Does anybody know the reason for multiplying the embedding weights by sqrt(d_model)?

zhangxixi0904 avatar Jan 27 '21 06:01 zhangxixi0904
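One explanation commonly given (hedged — the paper itself does not state the motivation) is that embedding weights are often initialized with standard deviation around 1/sqrt(d_model) (e.g. Xavier-style init), so multiplying by sqrt(d_model) restores roughly unit variance and keeps the embeddings on the same scale as the sinusoidal positional encodings added to them. A quick numeric check, assuming NumPy:

```python
import numpy as np

d_model = 512
rng = np.random.default_rng(0)

# Embedding weights initialized with std 1/sqrt(d_model),
# as Xavier-style initialization would give.
emb = rng.normal(0.0, 1.0 / np.sqrt(d_model), size=(1000, d_model))

print(emb.std())                       # small: roughly 1/sqrt(512)
print((emb * np.sqrt(d_model)).std())  # roughly 1.0, i.e. the same O(1)
                                       # range as the sinusoidal
                                       # positional encodings
```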

> @orena1 Hi, the implementation also didn't share the embedding weights, right?

Yes, the implementation didn't share the embedding weights.

wangzelin-em avatar Apr 14 '22 09:04 wangzelin-em