LocNet with 11G memory still fails with "Check failed: error == cudaSuccess (2 vs. 0) out of memory"

I have a K40 with >11G memory, but when I run demo_LocNet_object_detection_pipeline, it fails with "Check failed: error == cudaSuccess (2 vs. 0) out of memory". I thought 11G would be enough because the README only requires 6G. Why is that?

Sep 22 '16 07:09 litingfeng

Did you build Caffe with the cuDNN library? I think that without it, Caffe uses much more GPU memory when applying the convolutional layers. That is probably why you run out of GPU memory. Could you check?

Sep 23 '16 16:09 gidariss

@gidariss Yes, I did compile with cuDNN. I noticed that after running the first network (rec), 6G of memory was in use, and the error showed up when running the second network. Do I need to free GPU memory after the first network? How?

Sep 23 '16 16:09 litingfeng

@litingfeng No, you do not need to free GPU memory after the first network. What you can do is change lines 90 and 91 of the demo_LocNet_object_detection_pipeline.m script from:

    model_obj_rec_max_rois_num_in_gpu = 500;
    model_obj_loc_max_rois_num_in_gpu = 400;

to:

    model_obj_rec_max_rois_num_in_gpu = 200;
    model_obj_loc_max_rois_num_in_gpu = 200;
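As a rough illustration of what the two max_rois_num_in_gpu limits appear to control, here is a minimal sketch. It is not the LocNet code: it assumes, from the variable names only, that the candidate boxes are pushed through a network in chunks of at most that many at a time. The names num_rois, score_chunk and num_chunks are hypothetical, and score_chunk is a stand-in for a real forward pass.

```matlab
% Hypothetical sketch, not the LocNet implementation: score candidate boxes
% in chunks so that at most max_rois_num_in_gpu of them are processed at once.
num_rois            = 1000;                        % illustrative number of candidate boxes
max_rois_num_in_gpu = 200;                         % chunk size (cf. the *_max_rois_num_in_gpu settings)
rois                = rand(num_rois, 4, 'single'); % stand-in box coordinates
scores              = zeros(num_rois, 1, 'single');

% Stand-in for a network forward pass on one chunk of boxes.
score_chunk = @(chunk) sum(chunk, 2);

num_chunks = 0;
for first = 1:max_rois_num_in_gpu:num_rois
    last = min(first + max_rois_num_in_gpu - 1, num_rois);
    scores(first:last) = score_chunk(rois(first:last, :));
    num_chunks = num_chunks + 1;
end

fprintf('%d boxes processed in %d chunks of at most %d\n', ...
        num_rois, num_chunks, max_rois_num_in_gpu);
```

Under that assumption, a smaller limit lowers the peak memory needed for the per-box data at the cost of more forward passes. It would not reduce the memory taken by the network weights themselves.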

I just tried it and managed to run the demo on a 6 GB GPU. Could you try it as well and let me know?

Spyros

Sep 23 '16 17:09 gidariss

@gidariss I checked out a fresh copy, but it still doesn't work. I even tried 50 and 100; both ran out of memory. While it was running, I checked GPU usage with nvidia-smi, which showed that 11408MB was indeed in use. Thank you for your patience.

Sep 24 '16 01:09 litingfeng

Later today, I tried script_test_object_detection_pipeline_PASCAL.m. It works without any modification, which is strange.

Sep 24 '16 10:09 litingfeng

It seems that the demo does not call caffe.reset_all(); after each network.

Sep 26 '16 06:09 litingfeng

@litingfeng Regarding the script_test_object_detection_pipeline_PASCAL.m, it uses a single model (either the recognition or the localization model) at a time and that is why you do not have any problem running it. However, the LocNet_object_detection() function, which is used in the demo, uses both models simultaneously. Do you mean that you placed caffe.reset_all() calls inside the LocNet_object_detection() function?

As I said, I did not have any problem running the demo on a 6 GB GPU, so it is strange that in your case it cannot run on an 11 GB GPU.

Sep 27 '16 09:09 gidariss