
MemoryError: Unable to allocate 168. GiB for an array with shape (76821, 542, 542) and data type float64

Open Al-Dailami opened this issue 5 years ago • 6 comments

```
loading training set 100%|████████████████████████████████████████████████████████| 76821/76821 [02:21<00:00, 543.88it/s]
Traceback (most recent call last):
  File "train.py", line 39, in <module>
    train_adj, train_mask = preprocess_adj(train_adj)
  File "~/TextING/utils.py", line 153, in preprocess_adj
    return np.array(list(adj)), mask # coo_to_tuple(sparse.COO(np.array(list(adj)))), mask
MemoryError: Unable to allocate 168. GiB for an array with shape (76821, 542, 542) and data type float64
```

Al-Dailami avatar Feb 22 '21 00:02 Al-Dailami

Hello

Can you help me fix this problem?

Al-Dailami avatar Feb 22 '21 00:02 Al-Dailami

Hi @Al-Dailami

Which dataset are you using? You may try processing training samples in batches and concatenate them with NumPy.
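The suggestion above could be sketched roughly as follows. This is a minimal sketch, not code from the repo: the helper name `preprocess_in_batches` and the batch size are my assumptions, and it assumes `preprocess_adj` returns a dense array plus a mask for a list of samples. Note the concatenated result must itself still fit in memory; if it does not, the preprocessing has to move inside the training loop instead.

```python
import numpy as np

def preprocess_in_batches(adj_list, preprocess_fn, batch_size=1024):
    """Apply preprocess_fn (e.g. preprocess_adj) to slices of adj_list
    and stitch the per-slice results together with NumPy.

    NOTE: hypothetical helper; names and batch size are assumptions,
    not part of the TextING codebase.
    """
    adjs, masks = [], []
    for start in range(0, len(adj_list), batch_size):
        # Each call only materialises one slice at a time, capping
        # peak temporary memory during preprocessing.
        b_adj, b_mask = preprocess_fn(adj_list[start:start + batch_size])
        adjs.append(b_adj)
        masks.append(b_mask)
    return np.concatenate(adjs, axis=0), np.concatenate(masks, axis=0)
```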

Magicat128 avatar Feb 22 '21 01:02 Magicat128

Thanks a lot for your reply.

I'm working with a dataset that contains around 500,000 records of short texts. Can you please help me modify the code so it can process the data in batches?

Thanks a lot in advance for your valuable help.

Al-Dailami avatar Feb 23 '21 04:02 Al-Dailami

Hello, I have modified the trainer to process the data batch by batch. Is this the right way? https://github.com/CRIPAC-DIG/TextING/blob/c2492c276a6b59ca88337e582dfd2f3616f3988d/train.py#L124

```python
b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
b_train_feature = preprocess_features(train_feature[idx])
feed_dict = construct_feed_dict(b_train_feature, b_train_adj, b_train_mask, train_y[idx], placeholders)
feed_dict.update({placeholders['dropout']: FLAGS.dropout})
```
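The per-mini-batch pattern above can be completed with an index generator that covers one epoch; each yielded `idx` would be used to slice `train_adj[idx]`, `train_feature[idx]`, and `train_y[idx]` as in the snippet. This is a hedged sketch: `epoch_batches`, the batch size, and the shuffling are my assumptions, not code from the repo.

```python
import numpy as np

def epoch_batches(n_samples, batch_size=64, shuffle=True, rng=None):
    """Yield index arrays that cover all n_samples once per epoch.

    Hypothetical helper: preprocessing each slice on the fly means the full
    (76821, 542, 542) dense tensor is never materialised at once.
    """
    rng = rng or np.random.default_rng()
    order = rng.permutation(n_samples) if shuffle else np.arange(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]
```

Each `idx` from `epoch_batches(len(train_adj))` would then feed the `preprocess_adj(train_adj[idx])` / `construct_feed_dict(...)` calls shown above.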

Al-Dailami avatar Feb 24 '21 11:02 Al-Dailami

> Hello, I have modified the trainer to process the data batch by batch. Is this the right way? https://github.com/CRIPAC-DIG/TextING/blob/c2492c276a6b59ca88337e582dfd2f3616f3988d/train.py#L124
>
> ```python
> b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
> b_train_feature = preprocess_features(train_feature[idx])
> feed_dict = construct_feed_dict(b_train_feature, b_train_adj, train_mask, train_y[idx], placeholders)
> feed_dict.update({placeholders['dropout']: FLAGS.dropout})
> ```

@Al-Dailami Yes, that works. Note that it should be `b_train_mask` in `feed_dict` rather than `train_mask` :)

Magicat128 avatar Feb 25 '21 02:02 Magicat128

> Hello, I have modified the trainer to process the data batch by batch. Is this the right way? https://github.com/CRIPAC-DIG/TextING/blob/c2492c276a6b59ca88337e582dfd2f3616f3988d/train.py#L124
>
> ```python
> b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
> b_train_feature = preprocess_features(train_feature[idx])
> feed_dict = construct_feed_dict(b_train_feature, b_train_adj, b_train_mask, train_y[idx], placeholders)
> feed_dict.update({placeholders['dropout']: FLAGS.dropout})
> ```

Hello, I would like to ask: I still get a MemoryError even after applying the code change you suggested. Have you ever run into this situation?

bp20200202 avatar Apr 23 '21 01:04 bp20200202