Interrupting model evaluation
It would be really useful if I could safely abort the fitting of a model with Ctrl-C. When training on the GPU, I have to kill the whole process using kill, which makes me lose all other data in memory.
Are you proposing that Ctrl-C should drop into a Julia REPL with the whole environment still in memory, like in a debugger? I agree that this would be super useful, but I think it is a very complicated project. Maybe you could check whether there is a good existing debugger for Julia.
No, I meant just interrupting:
    julia> for i in 1:10000000
               sleep(1)
           end
    C-c C-cERROR: InterruptException:
     in process_events at ./stream.jl:713
     in wait at ./task.jl:360
     in wait at ./task.jl:286
     in sleep at stream.jl:678
     [inlined code] from none:2
     in anonymous at no file:0
This does not work if I run something on the GPU. I think it generally does not work when Julia is calling a longer computation in an external library.
But I agree, a real debugger would be extremely useful. There is the Debug package (http://github.com/toivoh/Debug.jl), which implements some macros for an interactive debugger. It is quite useful if you write pure Julia code.
I get this error when running with a batch number - but it outlines a debugging method:
julia cnn.jl
(128,128,1,10)
(128,128,1,10)
INFO: Start training on [GPU0]
INFO: Initializing parameters...
INFO: Creating KVStore...
INFO: Saved checkpoint to './eval/G3DBmodel-0000.params'
INFO: Start training...
[08:08:01] .julia/v0.4/MXNet/deps/src/mxnet/dmlc-core/include/dmlc/././logging.h:241: [08:08:01] src/storage/./gpu_device_storage.h:39: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: out of memory
[08:08:01] .julia/v0.4/MXNet/deps/src/mxnet/dmlc-core/include/dmlc/logging.h:241: [08:08:01] src/engine/./threaded_engine.h:295: [08:08:01] src/storage/./gpu_device_storage.h:39: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: out of memory
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPEto NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
what(): [08:08:01] src/engine/./threaded_engine.h:295: [08:08:01] src/storage/./gpu_device_storage.h:39: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: out of memory
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPEto NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
[08:08:01] .julia/v0.4/MXNet/deps/src/mxnet/dmlc-core/include/dmlc/././logging.h: signal (6): Aborted 241: [08:08:01] src/storage/./gpu_device_storage.h:39: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: out of memory
[08:08:01] .julia/v0.4/MXNet/deps/src/mxnet/dmlc-core/include/dmlc/logging.h:241: [08:08:01] src/engine/./threaded_engine.h:295: [08:08:01] src/storage/./gpu_device_storage.h:39: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: out of memory
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPEto NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called recursively
Also, I find that Ctrl-C works when I run either from Vim with :!julia % or from the terminal.
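For anyone who wants to try the debugging method that the error message describes, here is a rough sketch of how it might look from the shell. The helper name run_with_naive_engine is made up for illustration, and cnn.jl is just the script from the log above. The only thing taken from MXNet itself is the MXNET_ENGINE_TYPE=NaiveEngine setting quoted in the message. Going through env sets the variable for that one command only, so there is nothing to reset afterwards.

```shell
#!/bin/sh
# Hypothetical helper (not part of MXNet): run a single command with
# MXNET_ENGINE_TYPE=NaiveEngine, as suggested by the error message.
# `env` sets the variable only for the child process, so the calling
# shell's environment is left untouched.
run_with_naive_engine() {
    env MXNET_ENGINE_TYPE=NaiveEngine "$@"
}

# Example usage, assuming the training script is cnn.jl:
#   run_with_naive_engine julia cnn.jl
#   run_with_naive_engine gdb --args julia cnn.jl
```

With the synchronous engine, the failing operation should show up directly in the backtrace instead of on a worker thread, which is what the message means by "the series of calls that lead to this error".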