Akim Tsvigun
@peregilk Good afternoon, and thank you so much for your comprehensive responses. I would like to ask a small question. You say: _"Bert will learn an embedding for ("good"-"##ness")...
This does not seem to affect anything: the following code runs successfully. ``` log_probs_N_K_C = torch.Tensor([ [[0.1, 0.2, 0.3, 0.4], [0.15, 0.15, 0.3, 0.4]], [[0.1, 0.2, 0.3, 0.4], [0.15,...
I see this code is damaged. Here is the image (A.5 in the paper):
A similar question concerns dropout in the FeedForward layer. You apply it twice, while in the paper they apply it only at the end:
@ksolo May I kindly ask you to review it before it diverges too much from main? Thanks!
@ksolo Could you please approve? I have addressed all your suggestions.
Hi @ksolo, thank you! Sure, will do.
@dcbartlett I have addressed your suggestions. Kindly merge when you are available!
@dcbartlett Kind ping here: I have addressed your suggestions. Could you please approve?
@dcbartlett Sorry for tagging you again. You approved the changes but didn't merge. May I ask you to approve again and merge?