Details in simple graph training and embedding
Hi, could you please kindly offer more details on training embedding on simple static graphs with no features? Thanks
Hi, in this case we can use one-hot indicator vectors as the features for each node. I.e., we can assign node $i$ a vector of size $|V|$ where all values are zero except for the entry corresponding to node $i$.
However, this leads to very sparse features, so for implementation we have a special flag called "identity_dim" for this case. If you have no node features then you can set "identity_dim" to a positive value. This flag will associate each node with a dense embedding vector, and we backpropagate gradients through these embeddings. (Mathematically, this is equivalent to representing each node by a one-hot indicator vector and then multiplying these vectors by a learned feature matrix of size $|V| \times \text{identity\_dim}$.)
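To see that equivalence concretely, here's a small NumPy sketch (the node count and embedding dimension are arbitrary, chosen just for illustration): multiplying a one-hot vector by the embedding matrix selects the same row that a direct embedding lookup returns.

```python
import numpy as np

num_nodes = 5     # |V|, hypothetical
identity_dim = 3  # dense embedding size, hypothetical

rng = np.random.default_rng(0)
# Trainable embedding matrix: one row per node, shape |V| x identity_dim.
embeddings = rng.standard_normal((num_nodes, identity_dim))

# One-hot indicator vector for node i.
i = 2
one_hot = np.zeros(num_nodes)
one_hot[i] = 1.0

# Multiplying the one-hot vector by the matrix picks out row i...
via_matmul = one_hot @ embeddings
# ...which is exactly what an embedding lookup does.
via_lookup = embeddings[i]

assert np.allclose(via_matmul, via_lookup)
```

In practice the lookup form is what gets implemented, since it avoids materializing the sparse one-hot vectors while gradients still flow only into the selected rows.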
Hope that makes some sense...
@williamleif Wow, this comment perfectly answers my question! When I first read this TF code, I wondered why you used trainable variables to initialize the embeddings of nodes that have no features. So you are using this method to model the multiplication of a one-hot encoding with a trainable weight matrix?