
CUDA out of memory

Cyrus9721 opened this issue 3 years ago · 6 comments

Hi there, I'm using the latest version on Linux and ran into this issue while running Mettack on a graph, attacking the structure only. The dataset is Cora with the public split, and I set the number of perturbations to 300.
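For reference, my setup roughly follows DeepRobust's standard Metattack example, sketched below (simplified; constructor arguments may differ slightly across versions, and `num_remove` here is the 300-perturbation budget):

```python
import numpy as np
import torch
from deeprobust.graph.data import Dataset
from deeprobust.graph.defense import GCN
from deeprobust.graph.global_attack import Metattack

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load Cora and build the unlabeled index set.
data = Dataset(root='/tmp/', name='cora')
adj, features, labels = data.adj, data.features, data.labels
idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
idx_unlabeled = np.union1d(idx_val, idx_test)

# Train the surrogate GCN used by the attacker.
surrogate = GCN(nfeat=features.shape[1], nclass=labels.max().item() + 1,
                nhid=16, with_relu=False, device=device).to(device)
surrogate.fit(features, adj, labels, idx_train)

# Structure-only Mettack.
model = Metattack(model=surrogate, nnodes=adj.shape[0],
                  feature_shape=features.shape, attack_structure=True,
                  attack_features=False, device=device, lambda_=0).to(device)

num_remove = 300
model.attack(features, adj, labels, idx_train, idx_unlabeled,
             n_perturbations=num_remove, ll_constraint=False)
modified_adj = model.modified_adj
```

Running this, the attack call fails with the following error: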

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_50192/1901287071.py in <module>
      1 # Attack
----> 2 model.attack(features, adj, labels, idx_train, idx_unlabeled, n_perturbations=num_remove, ll_constraint=False)
      3 modified_adj = model.modified_adj

/tmp/ipykernel_50192/1281517747.py in attack(self, ori_features, ori_adj, labels, idx_train, idx_unlabeled, n_perturbations, ll_constraint, ll_cutoff)
    301         modified_features = ori_features + self.feature_changes
    302
--> 303         adj_norm = utils.normalize_adj_tensor(modified_adj)
    304         self.inner_train(modified_features, adj_norm, idx_train, idx_unlabeled, labels)
    305

~/anaconda3/lib/python3.8/site-packages/deeprobust/graph/utils.py in normalize_adj_tensor(adj, sparse)
    219     r_mat_inv = torch.diag(r_inv)
    220     mx = r_mat_inv @ mx
--> 221     mx = mx @ r_mat_inv
    222     return mx
    223

RuntimeError: CUDA out of memory. Tried to allocate 44.00 MiB (GPU 0; 23.70 GiB total capacity; 20.78 GiB already allocated; 36.00 MiB free; 21.49 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
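Side note: the `max_split_size_mb` option mentioned at the end of the message is set through an environment variable that has to be in place before the first CUDA allocation. A minimal sketch, in case it helps others (the 128 MiB value is just an example, and it only mitigates fragmentation, not a workload that genuinely exceeds the card):

```python
import os

# Must be set before torch makes its first CUDA allocation.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'

import torch  # imported (and used on the GPU) only after the variable is set
```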

Please help, thanks!

Cyrus9721 · Jun 20 '22

Hmm, that's rather strange. It should not take that much memory on the Cora dataset. Have you freed other programs that are taking up GPU memory?
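You can check what is actually resident on the card with `nvidia-smi`, or from Python, for example:

```python
import torch

# What device 0 offers vs. what PyTorch is currently holding.
gib = 1024 ** 3
print(f"total:     {torch.cuda.get_device_properties(0).total_memory / gib:.2f} GiB")
print(f"allocated: {torch.cuda.memory_allocated() / gib:.2f} GiB (live tensors)")
print(f"reserved:  {torch.cuda.memory_reserved() / gib:.2f} GiB (caching allocator)")
```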

ChandlerBang · Jun 23 '22

I encountered the same issue when using the Actor dataset from PyG: CUDA out of memory. I then tried running on CPU with 256 GB of RAM, but the RAM was exhausted after 8 hours and the process was killed. I set the attack budget to 5% of the total edges; Actor has 7,600 nodes and 26,752 edges.
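That budget works out to roughly 1,300 edge perturbations:

```python
# Attack budget: 5% of Actor's edge count.
n_perturbations = int(0.05 * 26752)  # = 1337
```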

Sikun-Skyler-Guo · Jul 17 '22

Running on CPU would be very slow. How much GPU memory do you have? 32 GB should be enough for Pubmed (20k nodes).
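As a rough sanity check (assuming dense float32 tensors, which is what the `normalize_adj_tensor` path in the traceback operates on), memory grows quadratically with the number of nodes:

```python
# Back-of-the-envelope estimate for one dense n x n float32 matrix.
n = 20_000  # Pubmed-scale node count
print(f"{n * n * 4 / 1024**3:.2f} GiB per matrix")  # ~1.49 GiB
# Mettack keeps several matrices of this shape alive for the meta-gradients,
# so peak usage is a multiple of this figure.
```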

ChandlerBang · Jul 20 '22

I tried a GPU with 40 GB of memory, but it still ran out.

Sikun-Skyler-Guo · Jul 21 '22

Are you using a 3090 with CUDA 11? I switched my environment to a 1080 Ti with CUDA 10 and that solved it for me.

P.S. CUDA 10 cannot run on a 3090, so you would have to change your GPU as well.
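You can confirm which CUDA build and card you are running with:

```python
import torch

print(torch.version.cuda)             # CUDA version this PyTorch build uses
print(torch.cuda.get_device_name(0))  # e.g. 'NVIDIA GeForce RTX 3090'
```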

Gmrider13 · Aug 06 '22

I encountered the same issue. Running on CPU/RAM is too slow, but a GPU with 48 GB runs out of memory. My dataset has 1,222 nodes and 16,714 edges. Are there any ways to accelerate the attack or reduce its memory usage?

yhzhu66 · Oct 09 '23