nodevectors

Has node2vec implementation been updated to use skip-gram as default?

Open amjass12 opened this issue 4 years ago • 3 comments

Hi!

Related to https://github.com/VHRanger/nodevectors/issues/40

I was wondering whether node2vec now uses skip-gram by default (I cannot see it anywhere in the source code, but I am sure I am missing it!)

If it doesn't, does passing w2vparams as in the following code set sg=1?

from nodevectors import Node2Vec

n2v = Node2Vec(n_components=32, walklen=80, epochs=100, keep_walks=True, w2vparams={'sg': 1})
n2v.fit(nx_graph)  # nx_graph is a networkx graph built beforehand

I want to be sure this is correct: when I set {'sg': 50} (a deliberately silly value meant to trigger an error), no error is thrown. So I wonder whether w2vparams={'sg':1} is actually selecting skip-gram instead of CBOW, or whether I am doing something incorrectly. Any advice (or the right way to do it) is appreciated :)

Secondly: instead of saving embeddings and then loading them as keyedvectors with word2vec - is there a way of converting the fitted object (n2v above) directly to a Word2Vec gensim object?

Thank you!

amjass12 • Nov 05 '21 13:11

The Node2Vec implementation here is a fairly transparent wrapper around gensim's Word2Vec, which it calls right here:

https://github.com/VHRanger/nodevectors/blob/master/nodevectors/node2vec.py#L130

So the w2vparams you pass should be forwarded to the underlying gensim Word2Vec model. If you see odd behavior underneath (like sg=50 being accepted), it could be related to your gensim version or similar, which I'd have a hard time debugging for you.
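To illustrate what "passed along" means, here is a minimal sketch of keyword forwarding. FakeWord2Vec and FakeNode2Vec are hypothetical stand-ins written for this example, not the actual nodevectors or gensim classes, and storing sg as a plain int without a range check is an assumption about how the wrapped model treats it.

```python
class FakeWord2Vec:
    """Hypothetical stand-in for gensim's Word2Vec; it only records its arguments."""

    def __init__(self, sentences=None, **kwargs):
        self.sentences = sentences
        # Assumption: sg is stored as an int and not checked against {0, 1}.
        self.sg = int(kwargs.get("sg", 0))
        self.kwargs = kwargs


class FakeNode2Vec:
    """Hypothetical stand-in for Node2Vec, reduced to the parameter forwarding."""

    def __init__(self, n_components=32, w2vparams=None):
        self.n_components = n_components
        self.w2vparams = dict(w2vparams or {})
        self.model = None

    def fit(self, walks):
        # Every key in w2vparams becomes a keyword argument of the wrapped model.
        self.model = FakeWord2Vec(sentences=walks, **self.w2vparams)
        return self


n2v = FakeNode2Vec(w2vparams={"sg": 1}).fit([["a", "b"], ["b", "c"]])
print(n2v.model.sg)
```

With forwarding like this, the wrapper never inspects the value itself, so an out-of-range value such as 50 would only be rejected if the wrapped model validates it.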

> Secondly: instead of saving embeddings and then loading them as keyedvectors with word2vec - is there a way of converting the fitted object (n2v above) directly to a Word2Vec gensim object?

You could extract the gensim Word2Vec object from the fitted Node2Vec object using its .model attribute, unless I am misunderstanding the question.
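As a rough sketch of that access pattern: get_keyed_vectors is a hypothetical helper name, and the stub below only mimics the attribute layout (a .model attribute holding a gensim Word2Vec, whose .wv attribute holds the vectors) so the snippet runs without a trained model.

```python
from types import SimpleNamespace


def get_keyed_vectors(fitted):
    """Return the vectors held by the Word2Vec model inside a fitted Node2Vec-like object."""
    model = getattr(fitted, "model", None)
    if model is None:
        raise ValueError("no underlying model found; call fit() first")
    # gensim's Word2Vec exposes its vectors through the .wv attribute.
    return model.wv


# Stub with the same attribute layout as a fitted object (illustration only).
stub = SimpleNamespace(model=SimpleNamespace(wv={"node_a": [0.1, 0.2]}))
vectors = get_keyed_vectors(stub)
print(vectors["node_a"])
```

On a real fitted object the same call would be get_keyed_vectors(n2v), with no save and reload step in between.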

VHRanger • Nov 05 '21 15:11

Hi @VHRanger ,

Thank you for the quick response.

To answer your second point first - yes, .model retrieves the full gensim model (thank you, I clearly missed that).

I am still confused about point number 1 and the proper way to implement it. Indeed, it might be a gensim version issue (3.8.0 for me). When I add the following:

n2v = Node2Vec(n_components=32, walklen=80, epochs=100, keep_walks=True, w2vparams={'sg':50})

(again, on purpose, to try to trigger an error, which it doesn't), the model trains, and if I call n2v.model.sg it shows 50. This confuses me, because the model can only train CBOW or SG, so I don't know which one is being trained. By extension, when I pass w2vparams={'sg':1}, I don't know whether it is actually training the SG model.

Could you please clarify the correct way to specify w2vparams in Node2Vec? Is there another way to verify that it has used skip-gram and not CBOW?
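One possible check, sketched under an assumption: if gensim picks the training routine with a plain truthiness test on sg (along the lines of `if self.sg:`), then any nonzero value, including 50, would select skip-gram and only 0 would select CBOW. training_architecture is a hypothetical helper, and the stubs stand in for n2v.model.

```python
from types import SimpleNamespace


def training_architecture(word2vec_model):
    """Interpret the stored sg value the way a truthiness check would (assumption)."""
    return "skip-gram" if word2vec_model.sg else "CBOW"


# Stubs standing in for n2v.model with different sg values (illustration only).
for sg_value in (0, 1, 50):
    stub_model = SimpleNamespace(sg=sg_value)
    print(sg_value, training_architecture(stub_model))
```

On a fitted object this would be training_architecture(n2v.model). Whether the installed gensim version really dispatches this way is worth confirming in its word2vec source.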

I also get this warning when running on a MacBook: "WARNING: gensim word2vec version is unoptimizedTry version 3.6 if on windows". Is 3.6 the version that is optimized on the Mac as well? Thank you!

amjass12 • Nov 06 '21 11:11

Hi @VHRanger ,

I was wondering if you could take a look at the comment above? I am worried about the skip-gram parameter and want to make sure that the model is indeed training skip-gram and not CBOW. Is this the correct way to specify the skip-gram parameter?

n2v = Node2Vec(n_components=32, walklen=80, epochs=100, keep_walks=True, w2vparams={'sg':1})

thank you!

amjass12 • Nov 23 '21 09:11