hyper-parameter learning
Hi Kipf, I would like to know how you did the hyper-parameter search. It would be helpful for applying this code to other datasets.
I did a very small-scale grid search around typical values for the learning rate, dropout, and hidden layer size on a validation set.
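Such a small grid search can be sketched as follows. This is only an illustration, not the repo's actual tuning script; `train_and_eval` is a hypothetical stand-in for training the model with given flags and returning a validation score (e.g. AUC), and the grid values are typical defaults, not the ones actually used.

```python
import itertools

# Illustrative grid of typical values (assumption, not the repo's actual search space).
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "dropout": [0.0, 0.2, 0.5],
    "hidden1": [16, 32, 64],
}

def train_and_eval(params):
    # Hypothetical: train the model with these hyper-parameters
    # (e.g. by invoking train.py with the matching flags) and
    # return a validation metric such as link-prediction AUC.
    ...

def grid_search(grid, eval_fn):
    """Exhaustively try every combination and keep the best-scoring one."""
    keys = list(grid)
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = eval_fn(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

For a handful of hyper-parameters with three or four candidate values each, this exhaustive loop is cheap enough that nothing fancier (random or Bayesian search) is needed.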
Thanks, Kipf, for the answer. I have another problem. I have a wiki dataset, and the gcn_ae model works perfectly fine on it. But when I use the gcn_vae model on the same dataset, it throws an error, and I don't understand what causes it. Can you please help? The (truncated) traceback starts with:
File "train.py", line 223, in
Looks like you have a nan or inf value somewhere in the model, e.g. in the variable sample_weight. You should be able to find the problem with a debugger.
I checked the dataset; there is no problem with it. As I mentioned earlier, the GCN_AE model works fine on this dataset, while the GCN_VAE model throws this error. In fact, the generated embedding is full of NaN values. What could possibly be going wrong?
Looks like the loss might become inf or nan at some point, in which case you get nan-valued gradients. This might come from the KL term in the VAE loss. You can try setting this term to zero (or removing it explicitly) to see whether it is the cause.
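To make the failure mode concrete, here is a NumPy sketch of the standard KL term for a diagonal Gaussian posterior against N(0, I), as used in VAE losses generally (the repo's actual loss lives in its optimizer code and may differ in scaling; the variable names here are illustrative):

```python
import numpy as np

def kl_term(z_mean, z_log_std):
    """Per-node KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * np.sum(
        1 + 2 * z_log_std - np.square(z_mean) - np.square(np.exp(z_log_std)),
        axis=1)

# With moderate log-std values the penalty is finite, but once z_log_std
# grows large, np.exp(z_log_std)**2 overflows to inf, the loss becomes
# inf, and nan then propagates back through the gradients.
```

This is why zeroing out the KL term is a useful diagnostic: if training becomes stable without it, the divergence is almost certainly entering through `exp(z_log_std)`.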
Thanks, Kipf, for the suggestion and your prompt responses. Removing only the KL term does not solve it; I also have to remove self.z = self.z_mean + tf.random_normal([self.n_samples, FLAGS.hidden2]) * tf.exp(self.z_log_std) from the model file. The common factor between the KL divergence and this term is tf.exp(self.z_log_std); after removing it from both, the model works. Correct me if I am wrong, but by removing the KL term, doesn't the VAE model reduce to the GAE model? At this point I can't figure out what is going wrong. Please provide some suggestions on how to handle this.
Thanks again. More info: on the wiki dataset, the model works best with these parameters: --model gcn_vae --learning_rate 0.0001 --epochs 50 --hidden1 500 --hidden2 128
Looks like the variance (tf.exp(self.z_log_std)) is diverging for some reason. Not sure why.
You can try our follow-up model, which should be more stable: https://nicola-decao.github.io/s-vae
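As a stopgap before switching models, a common workaround (not something this repo does, just a general stabilization trick) is to clamp `z_log_std` before exponentiating, so the variance can never overflow. A NumPy sketch, with `sample_z` and the clip bound chosen for illustration; in the TensorFlow model the equivalent would be wrapping `self.z_log_std` in `tf.clip_by_value`:

```python
import numpy as np

def sample_z(z_mean, z_log_std, clip=10.0, rng=None):
    """Reparameterization-trick sample with the log-std clamped for stability."""
    rng = rng or np.random.default_rng(0)
    # Clamp before exponentiating: exp(10) ~ 2.2e4, so the std can
    # no longer overflow to inf even if z_log_std diverges upward.
    z_log_std = np.clip(z_log_std, -clip, clip)
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + eps * np.exp(z_log_std)
```

This only masks the symptom (a diverging variance) rather than fixing its cause, but it can keep training alive long enough to inspect where the divergence starts.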
Thanks, Kipf for your help. I will use the follow-up model.