Problem running train.py
Hi,
I'm trying to run your examples on my machine by following your guide, with a command like python train.py fcn_rffc4 brats_fold0 brats_fold0 600 -ch False, and I get the following errors:
Traceback (most recent call last):
File "/home/liuyan/CNNbasedMedicalSegmentation/train.py", line 305, in
I have all the dependencies installed, except that my Theano version is 0.8.2, whereas the author recommends 0.9.0. I stayed on 0.8.2 because with 0.9.0 I get a different error: ImportError: cannot import name downsample
What can I do to fix the problem and get the example running? Any suggestion or clue would be appreciated.
Thank you.
It seems to me that the problem is related to bilinear upsampling. You can try the following to work around this issue: In model_defs.py, lines 421-428:
{'i':55, 'type': 'skip', 'src': 33},
{'i':56, 'type': 'conv', 'fs': (1, 1, 1), 'nkerns': 5},
{'i':57, 'type': 'bint', 'up': 2},
{'i':58, 'type': 'skip', 'src': 43},
{'i':59, 'type': 'conv', 'fs': (1, 1, 1), 'nkerns': 5},
{'i':60, 'type': 'shortcut', 'src': 57, 'dst': 59},
{'i':61, 'type': 'bint', 'up': 2},
{'i':62, 'type': 'shortcut', 'src': 54, 'dst': 61}
replace every {'i':x, 'type': 'bint', 'up': 2} with {'i':x, 'type': 'deconv', 'fs': (3, 3, 3), 'nkerns': 5, 'up': (2, 2, 2)}. This swaps each bilinear interpolation layer for a deconvolution layer, which in theory should work just as well.
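If you would rather not edit the entries by hand, the substitution can be sketched in a few lines of Python. This is only an illustration: the helper name replace_bint_with_deconv is hypothetical and not part of the repository, and it assumes the layer definitions are a plain list of dicts like the one quoted above.

```python
# Hypothetical helper (not part of the repository): swaps every
# bilinear-interpolation ('bint') layer for a deconvolution layer.
def replace_bint_with_deconv(layers, fs=(3, 3, 3), nkerns=5):
    patched = []
    for layer in layers:
        if layer.get('type') == 'bint':
            up = layer['up']
            # Keep the layer index, expand the scalar factor to 3 axes.
            patched.append({'i': layer['i'], 'type': 'deconv',
                            'fs': fs, 'nkerns': nkerns,
                            'up': (up, up, up)})
        else:
            # Copy untouched layers so the input list is not mutated.
            patched.append(dict(layer))
    return patched


# The entries quoted from model_defs.py, lines 421-428.
layers = [
    {'i': 55, 'type': 'skip', 'src': 33},
    {'i': 56, 'type': 'conv', 'fs': (1, 1, 1), 'nkerns': 5},
    {'i': 57, 'type': 'bint', 'up': 2},
    {'i': 58, 'type': 'skip', 'src': 43},
    {'i': 59, 'type': 'conv', 'fs': (1, 1, 1), 'nkerns': 5},
    {'i': 60, 'type': 'shortcut', 'src': 57, 'dst': 59},
    {'i': 61, 'type': 'bint', 'up': 2},
    {'i': 62, 'type': 'shortcut', 'src': 54, 'dst': 61},
]

patched = replace_bint_with_deconv(layers)
for layer in patched:
    print(layer)
```

The layer indices and the shortcut 'src'/'dst' references stay unchanged, so the rest of the definition should not need to be touched.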
Thank you. Training starts now, but a new problem has come up.
Building model, coach...
input data dimensions: h: 160 w: 144 d: 128
set stats: train: 200, valid: 37, test: 37
No checkpoint available, using random initialization instead.
Starting training...
ERROR (theano.gof.opt): Optimization failure due to: local_useless_inc_subtensor
ERROR (theano.gof.opt): node: IncSubtensor{Inc;int64:int64:}(Elemwise{add,no_inplace}.0, Reshape{1}.0, Constant{409936}, Constant{409952})
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "/home/liuyan/anaconda2/lib/python2.7/site-packages/theano/gof/opt.py", line 1772, in process_node
    replacements = lopt.transform(node)
  File "/home/liuyan/anaconda2/lib/python2.7/site-packages/theano/tensor/opt.py", line 2313, in local_useless_inc_subtensor
    c = get_scalar_constant_value(node.inputs[0])
  File "/home/liuyan/anaconda2/lib/python2.7/site-packages/theano/tensor/basic.py", line 662, in get_scalar_constant_value
    v.owner.op.perform(v.owner, const, ret)
  File "/home/liuyan/anaconda2/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 839, in perform
    super(Elemwise, self).perform(node, inputs, output_storage)
  File "/home/liuyan/anaconda2/lib/python2.7/site-packages/theano/gof/op.py", line 769, in perform
    "Did you used Theano flags mode=FAST_COMPILE?"
MethodNotDefined: ('perform', <class 'theano.tensor.elemwise.Elemwise'>, 'Elemwise', 'Did you used Theano flags mode=FAST_COMPILE? You can use optimizer=fast_compile instead.')
/home/liuyan/climin/climin/util.py:151: UserWarning: Argument named f is not expected by <class 'climin.adam.Adam'>
% (i, klass))
/home/liuyan/breze/breze/learn/base.py:39: UserWarning: Implicilty converting numpy.ndarray to gnumpy.garray
warnings.warn('Implicilty converting numpy.ndarray to gnumpy.garray')
Error allocating 47185920 bytes of device memory (out of memory). Driver report 1638400 bytes free and 4238540800 bytes total
Traceback (most recent call last):
File "/home/liuyan/CNNbasedMedicalSegmentation/train.py", line 305, in
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Exception TypeError: TypeError("'NoneType' object is not callable",) in <bound method CUDAMatrix.__del__ of <cudamat.cudamat.CUDAMatrix object at 0x7f6702bc0d90>> ignored
Exception TypeError: TypeError("'NoneType' object is not callable",) in <bound method CUDAMatrix.__del__ of <cudamat.cudamat.CUDAMatrix object at 0x7f66d1d53f10>> ignored
Exception TypeError: TypeError("'NoneType' object is not callable",) in <bound method CUDAMatrix.__del__ of <cudamat.cudamat.CUDAMatrix object at 0x7f6702bc0f50>> ignored
Exception TypeError: TypeError("'NoneType' object is not callable",) in <bound method CUDAMatrix.__del__ of <cudamat.cudamat.CUDAMatrix object at 0x7f6701a01150>> ignored
Process finished with exit code 1
I'm not sure whether this is caused by running out of GPU memory; I currently have only 4 GB available. I'm also not sure about the TypeError. It looks like cudamat isn't working. Any suggestions?
I'm not sure 4 GB is enough for that CNN with that input size. To confirm that it's a memory problem, you can try running the code with fewer features or on inputs with a smaller spatial size.
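As a rough back-of-the-envelope check, the size of a single dense feature map can be estimated as below. This is only a sketch: the function name feature_map_bytes is made up for illustration, and float32 storage, a batch size of 1 and 4 channels are all assumptions. With those assumptions the full-resolution figure happens to equal the 47185920 bytes from the failed allocation in your log, though that alone does not prove which tensor was being allocated.

```python
# Hypothetical estimate (not taken from the repository): memory needed
# by one dense feature map, assuming float32 values and batch size 1.
def feature_map_bytes(h, w, d, channels, bytes_per_value=4):
    return h * w * d * channels * bytes_per_value


# Input size reported in the log: h=160, w=144, d=128.
# The channel count of 4 is an assumption.
full = feature_map_bytes(160, 144, 128, channels=4)

# Halving every spatial dimension shrinks the map by a factor of 8.
half = feature_map_bytes(80, 72, 64, channels=4)

print(full)         # 47185920
print(half)         # 5898240
print(full // half) # 8
```

A network keeps many such maps alive at once during training, so the total scales with both the number of features and the input volume, which is why reducing either one is a quick way to test the out-of-memory hypothesis.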
I'm working on decreasing the input dimensions to see what happens. Thank you.