Error when using unsupervised GNN models with pipeline
❓ Questions & Help
Hello, I ran into a problem when using cogdl's pipeline function for graph embedding. When I use the emb_models such as "prone", everything works fine, but when I use the gnn_models such as "dgi", something goes wrong. The code is as follows:

```python
import numpy as np
import torch
from cogdl import pipeline

# generate embedding from an unweighted graph
edge_index = np.array([[0, 1], [0, 2], [0, 3], [1, 2], [2, 3]])
train_mask = torch.zeros(4).bool()
generator = pipeline("generate-emb", model="unsup_graphsage", num_features=8, hidden_size=4)
outputs = generator(edge_index, x=np.random.randn(4, 8))
print(outputs)
```
When model="dgi", "grace", or "unsup_graphsage", the following error is reported:
```
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_2068/2115343664.py in <module>

~/.local/lib/python3.7/site-packages/cogdl/pipelines.py in __call__(self, edge_index, x, edge_weight)
    202         dataset = NodeDataset(path=self.data_path, scale_feat=False, metric="accuracy")
    203         self.args.dataset = dataset
--> 204         model = train(self.args)
    205         embeddings = model.embed(data.to(model.device))
    206         embeddings = embeddings.detach().cpu().numpy()

~/.local/lib/python3.7/site-packages/cogdl/experiments.py in train(args)
    211
    212     # Go!!!
--> 213     result = trainer.run(model_wrapper, dataset_wrapper)
    214
    215     return result

~/.local/lib/python3.7/site-packages/cogdl/trainer/trainer.py in run(self, model_w, dataset_w)
    187             return best_model_w.model
    188
--> 189         final_test = self.evaluate(best_model_w, dataset_w)
    190
    191         # clear the GPU memory

~/.local/lib/python3.7/site-packages/cogdl/trainer/trainer.py in evaluate(self, model_w, dataset_w, cpu)
    204         dataset_w.prepare_test_data()
    205         final_val = self.validate(model_w, dataset_w, self.devices[0])
--> 206         final_test = self.test(model_w, dataset_w, self.devices[0])
    207
    208         if final_val is not None and "val_metric" in final_val:

~/.local/lib/python3.7/site-packages/cogdl/trainer/trainer.py in test(self, model_w, dataset_w, device)
    417         test_loader = dataset_w.on_test_wrapper()
    418         if model_w.training_type == "unsupervised":
--> 419             result = self.test_step(model_w, test_loader, _device)
    420         else:
    421             with torch.no_grad():

~/.local/lib/python3.7/site-packages/cogdl/trainer/trainer.py in test_step(self, model_w, test_loader, device)
    490         for batch in test_loader:
    491             batch = move_to_device(batch, device)
--> 492             model_w.on_test_step(batch)
    493             if self.eval_data_back_to_cpu:
    494                 move_to_device(batch, "cpu")

~/.local/lib/python3.7/site-packages/cogdl/wrappers/model_wrapper/base_model_wrapper.py in on_test_step(self, *args, **kwargs)
     78
     79     def on_test_step(self, *args, **kwargs):
---> 80         out = self.test_step(*args, **kwargs)
     81         self.set_notes(out, "test")
     82

~/.local/lib/python3.7/site-packages/cogdl/wrappers/model_wrapper/node_classification/unsup_graphsage_mw.py in test_step(self, graph)
     54         pred = self.model(graph)
     55         y = graph.y
---> 56         result = evaluate_node_embeddings_using_logreg(pred, y, graph.train_mask, graph.test_mask)
     57         self.note("test_acc", result)
     58

AttributeError: 'Graph' object has no attribute 'train_mask'
```
It looks like the embedding itself is computed, but the test function then fails, so I currently cannot obtain the embedding result. How can I solve this problem? (PS: Due to restrictions on my experiment platform I cannot modify the cogdl source code, i.e. I cannot comment out the test function to skip this step.)
Hi @Saberfish,
You can manually pass no_test=True to pipeline to skip the test step.
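A minimal sketch of the adjusted call from the snippet above, assuming no_test is simply forwarded as a keyword argument to pipeline:

```python
# Same call as before, with the test step skipped via no_test=True
# (per the suggestion above; the exact keyword handling is assumed).
generator = pipeline("generate-emb", model="unsup_graphsage",
                     num_features=8, hidden_size=4, no_test=True)
outputs = generator(edge_index, x=np.random.randn(4, 8))
```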
Hi @cenyk1230, thank you very much! Your suggestion was very helpful, and I can now run the "mvgrl" and "unsup_graphsage" models normally. However, when I switch to "dgi" or "grace", a new problem appears:
```
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_498/4269874280.py in <module>

~/.local/lib/python3.7/site-packages/cogdl/pipelines.py in __call__(self, edge_index, x, edge_weight)
    203         self.args.dataset = dataset
    204         model = train(self.args)
--> 205         embeddings = model.embed(data.to(model.device))
    206         embeddings = embeddings.detach().cpu().numpy()
    207

~/.local/lib/python3.7/site-packages/cogdl/models/nn/dgi.py in embed(self, data)
     74     # Detach the return variables
     75     def embed(self, data):
---> 76         h_1 = self.gcn(data, data.x, self.sparse)
     77         return h_1.detach()

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/.local/lib/python3.7/site-packages/cogdl/models/nn/dgi.py in forward(self, graph, seq, sparse)
     39         else:
     40             if sparse:
---> 41                 out = spmm(graph, torch.squeeze(seq_fts, 0))
     42             else:
     43                 out = torch.mm(graph, seq_fts)

~/.local/lib/python3.7/site-packages/cogdl/utils/spmm_utils.py in spmm(graph, x, actnn, fast_spmm, fast_spmm_cpu)
     94         x = graph.out_norm * x
     95
---> 96         row_ptr, col_indices = graph.row_indptr, graph.col_indices
     97         csr_data = graph.raw_edge_weight
     98         x = fast_spmm(row_ptr.int(), col_indices.int(), x, csr_data, graph.is_symmetric(), actnn=actnn)

~/.local/lib/python3.7/site-packages/cogdl/data/data.py in row_indptr(self)
    651     @property
    652     def row_indptr(self):
--> 653         return self._adj.row_indptr
    654
    655     @property

~/.local/lib/python3.7/site-packages/cogdl/data/data.py in row_indptr(self)
    321     def row_indptr(self):
    322         if self.row_ptr is None:
--> 323             self._to_csr()
    324         return self.row_ptr
    325

~/.local/lib/python3.7/site-packages/cogdl/data/data.py in _to_csr(self)
    285             self.weight = torch.ones(self.row.shape[0]).to(self.row.device)
    286             if self[key] is not None:
--> 287                 self[key] = self[key][reindex]
    288
    289     def is_symmetric(self):

/opt/conda/lib/python3.7/site-packages/torch/tensor.py in __array__(self, dtype)
    628             return handle_torch_function(Tensor.__array__, relevant_args, self, dtype=dtype)
    629         if dtype is None:
--> 630             return self.numpy()
    631         else:
    632             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
```
This problem seems to occur when data is copied from GPU memory to the CPU/host memory, and I do not know how to resolve it at the moment. How can this be fixed?
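For context, the final TypeError is standard PyTorch behaviour rather than anything specific to cogdl: a tensor living on a CUDA device cannot be converted to a NumPy array directly, whether the conversion is explicit or triggered implicitly by NumPy indexing. A minimal, self-contained illustration, unrelated to the cogdl internals above:

```python
import torch

if torch.cuda.is_available():
    t = torch.arange(4, device="cuda")
    # t.numpy() would raise:
    #   TypeError: can't convert cuda:0 device type tensor to numpy.
    #   Use Tensor.cpu() to copy the tensor to host memory first.
    arr = t.cpu().numpy()  # moving the tensor to host memory first works
    print(arr)
```

Since the conversion in the traceback happens inside cogdl's Graph._to_csr, it does not look like something that can be worked around from the calling code.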
Hi @Saberfish,
It runs fine on my side. Could you list the cogdl version you are using?
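In case it helps, a quick way to check the installed version (assuming a standard pip install; `pip show cogdl` from a shell works as well):

```python
import cogdl

# Assumes cogdl exposes __version__, as most packages do.
print(cogdl.__version__)
```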
Hi @cenyk1230, my cogdl version is 0.5.2.
I came across the same error when using cogdl==0.6.0 with the unsup_graphsage algorithm.
I also ran into the same error, in my case during the train step.