
How to get the word representations

Open datar001 opened this issue 4 years ago • 3 comments

Hi, I have a simple question: how can I obtain glove_embed.npy and vocab.pkl for a new dataset? To get glove_embed.npy, do we need to train new word vectors on the vocabulary we build ourselves? And if possible, could you release the NLP pre-processing code? Thanks very much.

datar001 avatar Jun 18 '21 08:06 datar001

Hi, please refer to build_vocab.py and word2vec.py.
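
For readers without the repo at hand, here is a minimal sketch of the general idea: build a word-to-index vocabulary, then fill an embedding matrix by looking up pretrained GloVe vectors. It is not the content of build_vocab.py or word2vec.py. The function names, the `<pad>`/`<unk>` tokens, the random initialization for words missing from GloVe, and the toy data are all assumptions made for illustration.

```python
import pickle
from collections import Counter

import numpy as np


def build_vocab(sentences, min_count=1):
    """Map each sufficiently frequent token to an integer index."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    # Indices 0 and 1 are reserved for padding and unknown words (an assumption).
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, n in sorted(counts.items()):
        if n >= min_count:
            vocab[word] = len(vocab)
    return vocab


def build_embedding_matrix(vocab, glove_lines, dim, seed=0):
    """Fill a (len(vocab), dim) matrix from 'word v1 ... vd' text lines."""
    rng = np.random.default_rng(seed)
    # Words that do not appear in the pretrained file keep a small random vector.
    matrix = rng.normal(0.0, 0.1, size=(len(vocab), dim)).astype(np.float32)
    matrix[vocab["<pad>"]] = 0.0
    for line in glove_lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if word in vocab and len(values) == dim:
            matrix[vocab[word]] = np.array([float(v) for v in values], dtype=np.float32)
    return matrix


def save_artifacts(vocab, matrix, embed_path="glove_embed.npy", vocab_path="vocab.pkl"):
    """Write the two files under the names used in this thread."""
    np.save(embed_path, matrix)
    with open(vocab_path, "wb") as f:
        pickle.dump(vocab, f)


# Toy data standing in for a real dataset and a real GloVe text file.
questions = ["what is the dog doing", "why is the boy running"]
glove_lines = ["dog 0.1 0.2 0.3", "boy 0.4 0.5 0.6", "the 0.7 0.8 0.9"]

vocab = build_vocab(questions)
embed = build_embedding_matrix(vocab, glove_lines, dim=3)
print(len(vocab), embed.shape)
```

With a real GloVe file, `glove_lines` would be the open file object and `dim` its vector size. Under this approach no new word vectors are trained; only the lookup table is rebuilt for the new vocabulary.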

doc-doc avatar Jun 18 '21 08:06 doc-doc

Wow, I must have been blind... Also, if dim=-2 is used instead of dim=-1 at line 45 of networks/VQAModel/HGA.py, performance improves slightly. This looks like an implementation mistake in HGA: the last dimension holds only a single value, so with a softmax over dim=-1 every weight is exactly 1.
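
A small NumPy sketch of the effect (NumPy stands in for PyTorch here, and the shapes are hypothetical, not taken from HGA.py): when the score tensor has a trailing dimension of size 1, a softmax over that axis returns all ones, so the weighted sum reduces to a plain sum over items.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
# Hypothetical shapes: batch of 2, 4 items, one attention score per item.
scores = rng.normal(size=(2, 4, 1))
features = rng.normal(size=(2, 4, 8))

w_last = softmax(scores, axis=-1)   # normalizes over the size-1 axis
w_items = softmax(scores, axis=-2)  # normalizes over the 4 items

# Pool the item features with each set of weights.
pooled_last = (w_last * features).sum(axis=1)
pooled_items = (w_items * features).sum(axis=1)

print(np.allclose(w_last, 1.0))                            # weights are all ones
print(np.allclose(pooled_last, features.sum(axis=1)))      # identical to sum pooling
print(np.allclose(w_items.sum(axis=-2), 1.0))              # proper attention weights
```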

datar001 avatar Jun 18 '21 09:06 datar001

Thanks for pointing it out. The code follows the official repo, and yes, the attention pooling effectively becomes sum pooling here. I will fix it soon.

doc-doc avatar Jun 18 '21 09:06 doc-doc