
Producing file formats of my data set

Open ans92 opened this issue 5 years ago • 0 comments

Hi @jacoxu, thank you for the great code. First of all, do you have any Python code I can use to prepare the following two files from my own data set?

  1. vocab_withIdx.dic
  2. vocab_emb_Word2vec_48.vec

Looking at your raw title text files and vocab_withIdx.dic, I do not understand how the latter was prepared. Did you perform any text preprocessing before converting the text into a vocabulary with indexes? I would be very thankful for your help.
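To make the question concrete, here is a rough sketch of the kind of script I have in mind for the first file. Everything in it is an assumption rather than the repository's actual method: the preprocessing (lowercasing, keeping only alphanumeric runs), the 1-based indexing ordered by frequency, the tab-separated `word<TAB>index` line layout, and the helper names `tokenize`, `build_vocab`, and `write_vocab` are all hypothetical. The embedding file (vocab_emb_Word2vec_48.vec) would additionally need a word2vec trainer and is not covered here.

```python
import re
from collections import Counter


def tokenize(title):
    # Assumed preprocessing: lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", title.lower())


def build_vocab(titles, min_count=1):
    # Count tokens across all titles.
    counts = Counter(tok for title in titles for tok in tokenize(title))
    # Most frequent first; ties broken alphabetically so output is stable.
    words = sorted(
        (w for w, c in counts.items() if c >= min_count),
        key=lambda w: (-counts[w], w),
    )
    # Assumed convention: indexes start at 1.
    return {w: i for i, w in enumerate(words, start=1)}


def write_vocab(vocab, path):
    # Assumed layout: one "word<TAB>index" pair per line.
    with open(path, "w", encoding="utf-8") as f:
        for word, idx in vocab.items():
            f.write(f"{word}\t{idx}\n")
```

With titles read one per line from a raw text file, `write_vocab(build_vocab(titles), "vocab_withIdx.dic")` would produce the index file under these assumptions; if the real format differs, only `tokenize` and `write_vocab` should need changing.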

ans92, Jan 24 '21 11:01