DA-RNN

python: free(): invalid pointer when setting is_kfusion=True

Open JackHenry1992 opened this issue 8 years ago • 18 comments

I have successfully run your code with is_kfusion=false. Now I want to run your kinect_fusion.cpp with this flag set to true, but I get this error: (image). Have you encountered the same error? Could you give some suggestions? To avoid the Pangolin error, I have commented out all Pangolin code in kinect_fusion.cpp.

Supplement: I also get python: free(): invalid next size (fast) when I run test_kinect_fusion.sh on my native notebook. Using std::cout output, I found that the code crashes in initMarchingCubesTables(), inside create_tensors(). (image)

Can you suggest other ways to test the kinect_fusion code (for example, a dataset like the Video_$1.pango one used in kinect_fusion/run.sh)?

Another try: I modified the image input interface in main() of kinect_fusion.cpp to use cv2.imread instead of VideoInput, as follows: (image). Then I ran main() directly from the command line, and it reports a CUDA error in initMarchingCubesTables(): (image)

It seems that this is the same error as when running test_kinect_fusion.py. So are all these errors caused by CUDA? The CUDA version I installed is cuda-8.0.

JackHenry1992 avatar Jul 24 '17 09:07 JackHenry1992

Have you tried it with Python > 3 ?

kevinkit avatar Jul 24 '17 10:07 kevinkit

Hi @kevinkit, thank you very much, I will try it later. Have you run DA-RNN successfully with kinect_fusion? Another error I have encountered is that LD_PRELOAD cannot find libtcmalloc.so.4. Could you also give some suggestions?

JackHenry1992 avatar Jul 24 '17 14:07 JackHenry1992

I had this issue when running under Ubuntu 14.04. Are you running it on Ubuntu 16.04?

patrickESM avatar Jul 24 '17 16:07 patrickESM

@JackHenry1992 Like @D0nBilb0 said, this was our case, and no, we are still stuck on #7, even on a native machine.

kevinkit avatar Jul 24 '17 16:07 kevinkit

@D0nBilb0, I am running in an Ubuntu 16.04 Docker container and encountered this error. I also tried the official TensorFlow Docker image (Ubuntu 16.04, Python 2) and got the same free() invalid pointer error. Then I ran test_kinect_fusion.sh on my native notebook (Ubuntu 14.04, Python 2). The native notebook builds kinect_fusion fine, but running the test_kinect_fusion.sh script fails with a similar error, python free(): invalid next size. I have not tested on a native Ubuntu 16.04 notebook.

@kevinkit, after configuring Python 3 and trying to build this setup.py, I got errors showing that this code is written in Python 2 style. Another thing: I can build #7 successfully on my native computer, but I can't run DA-RNN training because of OOM, so if you have enough GPU memory on your native machine, I think you can run it.

@yuxng, we would be very grateful for your advice. Is free() invalid pointer caused by tcmalloc? Can you suggest other ways to test the kinect_fusion code (for example, a dataset like the Video_$1.pango one used in kinect_fusion/run.sh)?

JackHenry1992 avatar Jul 25 '17 01:07 JackHenry1992

@JackHenry1992 regarding your free() problem, have you checked the TensorFlow version (#2)?

Can you maybe give all the steps needed to get it to run on Ubuntu 14.04? We tried that, but we came across many things that needed to be changed. I opened another issue for that: #10

kevinkit avatar Jul 25 '17 07:07 kevinkit

@kevinkit, I just ran test_kinect_fusion.py (without running TensorFlow) and also got the error. Running the executable directly (build/kinectFusion) also gives an error; it seems to be a CUDA runtime error. (image)
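To tell a glibc heap abort apart from an ordinary error exit, the failing command can be wrapped in a small launcher. This is only a sketch, written as standalone Python 3 (not the Python 2 style of the repo), and the helper names `describe_exit` and `run_and_describe` are made up for illustration. glibc prints messages such as free(): invalid pointer and then calls abort(), so the child normally dies from SIGABRT, which `subprocess` reports as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description (POSIX)."""
    if returncode < 0:
        # On POSIX, a negative return code means "terminated by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run a command, let its output pass through, and describe how it ended."""
    proc = subprocess.run(argv)
    return describe_exit(proc.returncode)


# Hypothetical usage; adjust the paths to your checkout:
#   print(run_and_describe([sys.executable, "test_kinect_fusion.py"]))
#   print(run_and_describe(["build/kinectFusion"]))
```

A result of "killed by signal 6 (SIGABRT)" points at heap corruption detected by glibc, while a plain nonzero exit status would be more consistent with the program itself reporting a CUDA error and quitting.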

JackHenry1992 avatar Jul 25 '17 09:07 JackHenry1992

What is the compute capability of your GPU? I read that in some cases textures may not work on lower compute capabilities.

kevinkit avatar Jul 25 '17 09:07 kevinkit

These are my notebook GPU parameters: (image)

JackHenry1992 avatar Jul 25 '17 10:07 JackHenry1992

You can look up the details, e.g. the compute capability, at https://developer.nvidia.com/cuda-gpus. Your GPU (GeForce GTX 960M) has a compute capability of 5.0; a good overview of what this GPU supports is given here: https://en.wikipedia.org/wiki/CUDA. There are some drawbacks regarding textures with this compute capability (cache working set per multiprocessor for texture memory, ...) that may not occur at a higher compute capability (@yuxng used a Titan 1080, which has compute capability 6). However, this may not be the source of the error.
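The compute capability can also be read from the driver instead of a lookup table. The following is a minimal sketch, assuming a Linux machine where the NVIDIA driver has installed libcuda.so.1; the function name `compute_capability` is made up for illustration, and the two attribute numbers are the values of the corresponding enum entries in the CUDA driver API header (cuda.h). It returns None when no driver or device is available.

```python
import ctypes

# Values of CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR / _MINOR in cuda.h.
CC_MAJOR_ATTR = 75
CC_MINOR_ATTR = 76


def compute_capability(device_index=0):
    """Return (major, minor) for a CUDA device, or None if it cannot be queried."""
    try:
        cuda = ctypes.CDLL("libcuda.so.1")  # assumption: Linux driver library name
    except OSError:
        return None
    if cuda.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None
    device = ctypes.c_int()
    if cuda.cuDeviceGet(ctypes.byref(device), device_index) != 0:
        return None
    major, minor = ctypes.c_int(), ctypes.c_int()
    if cuda.cuDeviceGetAttribute(ctypes.byref(major), CC_MAJOR_ATTR, device) != 0:
        return None
    if cuda.cuDeviceGetAttribute(ctypes.byref(minor), CC_MINOR_ATTR, device) != 0:
        return None
    return (major.value, minor.value)


def meets_minimum(capability, minimum):
    """Compare two (major, minor) tuples; an unknown capability never qualifies."""
    return capability is not None and capability >= minimum
```

For example, `meets_minimum((5, 0), (6, 0))` is False for the GTX 960M discussed here, and `print(compute_capability())` shows what the driver reports on the current machine.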

kevinkit avatar Jul 25 '17 10:07 kevinkit

@JackHenry1992 referring to your problem with LD_PRELOAD: we get the same error, but only as a warning. However, when we tried to start the scripts, there were other dependencies that needed to be installed too (opencv, scipy, Pillow, yaml):

pip install scipy
pip install opencv-python
pip install Pillow
pip install pyyaml
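Before starting the scripts, a quick check can confirm that these four packages are importable. This is only a sketch: the helper name `missing_modules` is made up, and the mapping reflects the usual import names for these pip packages (opencv-python is imported as cv2, Pillow as PIL, pyyaml as yaml). It uses Python 3's importlib and does not import the packages themselves.

```python
import importlib.util

# pip package name -> module name used in "import ..."
REQUIRED = {
    "scipy": "scipy",
    "opencv-python": "cv2",
    "Pillow": "PIL",
    "pyyaml": "yaml",
}


def missing_modules(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    missing = []
    for pip_name, module_name in required.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            missing.append(pip_name)
    return missing


# Usage: print("pip install " + " ".join(missing_modules()))
```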

kevinkit avatar Jul 25 '17 17:07 kevinkit

Using tcmalloc speeds up TensorFlow training. Otherwise, I saw TensorFlow slow down after some iterations. However, I also saw that using tcmalloc in testing crashed Pangolin. So you can disable tcmalloc when you run kinect fusion in testing.

yuxng avatar Jul 25 '17 18:07 yuxng

Thank you for your reply. We ran into similar errors when trying to run the test script. Can you tell us how to disable tcmalloc when running your test scripts?

We ran into the same error with kinect_fusion enabled.

kevinkit avatar Jul 25 '17 19:07 kevinkit

Do NOT issue the command "export LD_PRELOAD=/usr/lib/libtcmalloc.so.4" when you run the script.

yuxng avatar Jul 25 '17 21:07 yuxng

So basically, if my LD_PRELOAD is empty, I should be good to go?

If I simply run

./experiments/scripts/rgbd_scene_multi_rgbd_test.sh 0

which does NOT issue the command...

In a new terminal (so LD_PRELOAD was NOT set by anything before), I still get free() invalid pointer error.
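One way to double-check this from inside the Python process is sketched below, assuming Linux with a /proc filesystem; the helper names `preload_entries` and `mapped_libraries` are made up for illustration. The first function shows what LD_PRELOAD contained when the process started, and the second shows whether a tcmalloc library is actually mapped into the process, which could also happen through normal linking rather than preloading.

```python
import os


def preload_entries(environ):
    """Split LD_PRELOAD into its entries (separated by spaces or colons)."""
    raw = environ.get("LD_PRELOAD", "")
    return [entry for entry in raw.replace(":", " ").split() if entry]


def mapped_libraries(maps_text, needle):
    """Return mapped file paths from /proc/<pid>/maps whose name contains needle."""
    found = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in os.path.basename(path):
                found.add(path)
    return sorted(found)


# Usage at the top of a test script:
#   print(preload_entries(os.environ))
#   with open("/proc/self/maps") as fh:
#       print(mapped_libraries(fh.read(), "tcmalloc"))
```

If both results are empty and the crash still occurs, tcmalloc is probably not the cause in that run.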

kevinkit avatar Jul 26 '17 07:07 kevinkit

I have tried DA-RNN on a GeForce 1050 (Ubuntu 16.04), which has compute capability > 6, and still cannot run kinectfusion...

@yuxng, a TITAN X gets the same error (running test_kinect_fusion.py). Can you give a detailed method for using the kinect_fusion code, or the VideoInput dataset?

JackHenry1992 avatar Jul 26 '17 11:07 JackHenry1992

Hi @JackHenry1992, I am wondering if you have solved the free() issue that you mentioned. I am trying to reproduce the framework and encountered the same problem. Any comments are appreciated.

Wei2624 avatar Jun 17 '18 19:06 Wei2624

I encountered the same problem too. My GPU is an RTX 2080 Ti with compute capability 7.5.

Dinghow avatar May 09 '19 00:05 Dinghow