StackOverflow: using pretrained word vectors in rawText folder

How exactly do I convert the vectors to a NumPy array? The number of word vectors is 19639 (the number of rows in vocab_emb_Word2vec_48.vec), while the number of tokens is 22956 (the number of words in vocab_withIdx.dic).

Feb 22 '17 04:02 vinayakathavale

Sorry for the lack of description. I suggest you read the following code: https://github.com/jacoxu/STC2/blob/master/software/benchmarks/Classification_ACC.m

Set Line 2 to dataset='StackOverflow'; and Line 5 to Weighting = 'AE'; %TF, TFIDF or AE(Average Embedding).

Then run Classification_ACC.m; you can see the processing at Line 31 - Line 37.
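For readers who want the NumPy side of this, here is a rough sketch of the average-embedding (AE) idea. It is not a port of Classification_ACC.m. The file layout is an assumption: one whitespace-separated vector per line in the .vec file, plus a separate list giving the 1-based token id for each row. The names load_embeddings, average_embedding, index_file and vocab_size are hypothetical. Under that assumption, the gap between 19639 vectors and 22956 tokens would simply mean that some tokens have no pretrained vector, so they are left as zero rows and skipped when averaging.

```python
import io

import numpy as np


def load_embeddings(vec_file, index_file, vocab_size):
    """Return a (vocab_size + 1, dim) array; row i is the vector of token id i.

    Assumed layout (not confirmed by the repository): vec_file holds one
    whitespace-separated vector per line, and index_file holds the 1-based
    token id that each of those rows belongs to.
    """
    vectors = np.loadtxt(vec_file, dtype=np.float32, ndmin=2)
    ids = np.loadtxt(index_file, dtype=np.int64, ndmin=1)
    # Row 0 is unused padding; tokens without a vector stay all-zero.
    table = np.zeros((vocab_size + 1, vectors.shape[1]), dtype=np.float32)
    table[ids] = vectors
    return table


def average_embedding(token_ids, table):
    """Average the vectors of the tokens that have a pretrained embedding."""
    rows = table[np.asarray(token_ids, dtype=np.int64)]
    # An all-zero row is treated as "no pretrained vector" in this sketch.
    known = np.any(rows != 0, axis=1)
    if not known.any():
        return np.zeros(table.shape[1], dtype=table.dtype)
    return rows[known].mean(axis=0)


# Tiny in-memory stand-ins for the two files (made-up values).
vec_text = "0.1 0.2\n0.3 0.4\n"
idx_text = "1\n3\n"
table = load_embeddings(io.StringIO(vec_text), io.StringIO(idx_text), vocab_size=4)
doc_vector = average_embedding([1, 2, 3], table)  # token 2 has no vector
```

With the real data, the two StringIO objects would be replaced by the paths of the vector file and its index list, and vocab_size by the number of entries in vocab_withIdx.dic.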

Best regards.

Feb 22 '17 04:02 jacoxu

Thanks! Can you share the link from where you got these vectors? Or did you train them yourself?

Feb 22 '17 05:02 vinayakathavale

You can find details of the pre-trained word vectors in Section 4.2 of our paper: https://arxiv.org/abs/1701.00185

Feb 22 '17 05:02 jacoxu

Thanks

Feb 22 '17 05:02 vinayakathavale