Questions about HSN tutorial

Open • georg-bn opened this issue 2 years ago • 1 comment

I have a couple of questions about the HSN tutorial and found some possible bugs in it (which might also affect other tutorials in the simplicial domain).

1. https://github.com/pyt-team/TopoModelX/blob/18956deb499c58062b0d435df8ddc85fc13b6634/tutorials/simplicial/hsn_train.ipynb#L326 should be self.layers = torch.nn.ModuleList(layers), so that the layers' parameters get properly registered with the model.
2. https://github.com/pyt-team/TopoModelX/blob/18956deb499c58062b0d435df8ddc85fc13b6634/tutorials/simplicial/hsn_train.ipynb#L355 should probably not apply softmax, since binary cross-entropy on logits is used later: https://github.com/pyt-team/TopoModelX/blob/18956deb499c58062b0d435df8ddc85fc13b6634/tutorials/simplicial/hsn_train.ipynb#L415
3. https://github.com/pyt-team/TopoModelX/blob/18956deb499c58062b0d435df8ddc85fc13b6634/tutorials/simplicial/hsn_train.ipynb#L120 states: "Here, we have in_channels = channels_nodes $ = 34$. This is because the Karate dataset encodes the identity of each of the 34 nodes as a one hot encoder." This seems to be incorrect, as we get 2-dimensional features: https://github.com/pyt-team/TopoModelX/blob/18956deb499c58062b0d435df8ddc85fc13b6634/tutorials/simplicial/hsn_train.ipynb#L145 and they are eigenvectors computed from the graph, as defined in https://github.com/pyt-team/TopoNetX/blob/4c47ec24047a7af83d5a249a79c1945e7043ceea/toponetx/datasets/graph.py#L38 .
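
To illustrate the first two points, here is a minimal sketch. It is not the tutorial's code: torch.nn.Linear stands in for the HSN layers, and the class names PlainListModel and ModuleListModel are made up for this example. The first part shows that layers stored in a plain Python list are invisible to parameters() (and therefore to the optimizer). The second part shows what happens when softmax outputs are fed into a loss that applies a sigmoid itself: the resulting values are confined to roughly [0.5, 0.73].

```python
import torch


class PlainListModel(torch.nn.Module):
    """Layers kept in a plain Python list: their parameters are NOT registered."""

    def __init__(self, channels: int, n_layers: int) -> None:
        super().__init__()
        self.layers = [torch.nn.Linear(channels, channels) for _ in range(n_layers)]


class ModuleListModel(torch.nn.Module):
    """Layers wrapped in torch.nn.ModuleList: their parameters ARE registered."""

    def __init__(self, channels: int, n_layers: int) -> None:
        super().__init__()
        layers = [torch.nn.Linear(channels, channels) for _ in range(n_layers)]
        self.layers = torch.nn.ModuleList(layers)


# Count the parameter tensors each model exposes to an optimizer.
n_plain = len(list(PlainListModel(4, 2).parameters()))
n_registered = len(list(ModuleListModel(4, 2).parameters()))
print(n_plain, n_registered)  # 0 vs. 4 (weight and bias for each of 2 layers)

# Softmax followed by the sigmoid inside a "with logits" loss squashes twice:
# softmax outputs lie in [0, 1], so their sigmoid lies in [0.5, sigmoid(1)].
raw_logits = torch.tensor([[2.0, -1.0], [-3.0, 0.5]])
probs = torch.softmax(raw_logits, dim=1)
double_squashed = torch.sigmoid(probs)
print(double_squashed.min().item(), double_squashed.max().item())
```

With the plain list, an optimizer built from model.parameters() would have nothing to update for those layers, which is why the ModuleList wrapper matters.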

Jun 25 '23 07:06 georg-bn

Some more things about the training loop.

1. If the softmax is removed, as suggested in point 2 above, then the checks y_hat > 0.5 need to be replaced by y_hat > 0. Also, when using binary_cross_entropy_with_logits, it is probably most convenient to have the model output a 1D vector of logits instead of the 2D output it produces currently.
2. There's a crucial typo: y_pred[-len(y_train) :] should instead be y_pred[:len(y_train)] here: https://github.com/pyt-team/TopoModelX/blob/ed4bd966a2cb3969466f60aabd74fa7a08247ba8/tutorials/simplicial/hsn_train.ipynb#L423
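
Both points can be checked with a small pure-Python sketch. The numbers are made up for illustration, and it assumes, as the proposed fix implies, that the training labels correspond to the first entries of the prediction vector. The first part shows that thresholding a logit at 0 gives the same decision as thresholding its sigmoid at 0.5. The second part shows that the two slices select different entries whenever there are fewer training labels than predictions.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Thresholding probabilities at 0.5 is equivalent to thresholding logits at 0.
logits = [-3.0, -0.2, 0.0, 0.4, 2.5]
by_probability = [sigmoid(z) > 0.5 for z in logits]
by_logit = [z > 0 for z in logits]
print(by_probability == by_logit)

# Hypothetical setup: 10 predictions, of which the first 6 belong to training nodes.
y_pred = list(range(10))  # stand-in values that double as their own indices
y_train = [0, 1, 0, 1, 1, 0]  # 6 made-up training labels

first_entries = y_pred[: len(y_train)]  # proposed fix: indices 0..5
last_entries = y_pred[-len(y_train):]  # current code: indices 4..9
print(first_entries, last_entries)
```

With the negative slice, the training loss would be computed against predictions for mostly the wrong nodes, which is why the typo is easy to miss but has a large effect.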

Jul 11 '23 16:07 georg-bn