magic282

Results 8 comments of magic282

+1. I simply replaced '-' with '/' again. That works, but the perplexity seems to be wrong after resuming training. I guess that there may be something wrong with...

@rizar I tried replacing the checkpoint code with the latest code in saveload.py. The checkpoint loads, but the training state or something else seems to be messed up. INFO:blocks.algorithms:Initializing the training...

@orhanf I guess so. I was using AdaGrad. So will the dump contain the adaptive algorithms' accumulators?
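To illustrate why the accumulators matter when resuming, here is a minimal sketch assuming a plain scalar AdaGrad update. It is not the blocks implementation, and `adagrad_step` is a hypothetical helper written only for this example. It compares the first step after resuming with the accumulator restored against the same step with the accumulator reset to zero.

```python
import math


def adagrad_step(param, grad, accumulator, lr=0.1, eps=1e-8):
    """One scalar AdaGrad update (hypothetical helper, not blocks' code).

    Returns the updated parameter and the updated accumulator.
    """
    accumulator += grad * grad
    param -= lr * grad / (math.sqrt(accumulator) + eps)
    return param, accumulator


# Train for a while so the squared-gradient accumulator grows.
param, acc = 1.0, 0.0
for _ in range(100):
    param, acc = adagrad_step(param, grad=1.0, accumulator=acc)

# Resume A: the accumulator was saved in the checkpoint and restored.
resumed_param, _ = adagrad_step(param, 1.0, acc)
step_with_state = abs(resumed_param - param)

# Resume B: only the parameters were saved, so the accumulator restarts at 0.
reset_param, _ = adagrad_step(param, 1.0, 0.0)
step_without_state = abs(reset_param - param)

print(step_with_state, step_without_state)
```

If the accumulator is lost, the first updates after resuming are much larger than they were just before the checkpoint, which could plausibly show up as a jump in perplexity. Whether that is what happened here is only a guess.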

@Thrandis I tried retraining with the blocks 0.1.1 and 0.2.0 releases, and found that both of them have this problem. (I didn't load a model saved by a different version of blocks.) @rizar I...

It seems that mxnet 0.9.4 recently fixed a bug and now gets better memory performance for bucket models, especially for RNNs.

Great! Also, I am very curious about pytorch and torch. Is their performance comparable or not?

Feel free to do what you like. The next branch does not have the inference part. The master branch does not work with the latest mxnet.

mxnet slices the batch across devices for data parallelism, which causes this error.
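A minimal sketch of what that slicing means in practice, in plain Python. `slice_batch` is a hypothetical helper, and mxnet's real slicing policy may differ in its details. The point is only that each device sees a fraction of the batch, so code that assumes the full batch size on every device can break.

```python
def slice_batch(batch, num_devices):
    """Split a batch into contiguous, near-equal slices, one per device.

    Hypothetical helper for illustration; not mxnet's actual slicing code.
    """
    base, extra = divmod(len(batch), num_devices)
    slices, start = [], 0
    for i in range(num_devices):
        # The first `extra` devices take one additional example each.
        size = base + (1 if i < extra else 0)
        slices.append(batch[start:start + size])
        start += size
    return slices


batch = list(range(32))            # a batch of 32 examples
per_device = slice_batch(batch, 4)  # data parallelism over 4 devices
sizes = [len(s) for s in per_device]
print(sizes)  # each device gets 8 examples, not 32
```

Anything that hard-codes the full batch size, such as a reshape or an initial RNN state of fixed shape, would then see a mismatched shape on each device. That is an assumption about the likely cause, not something confirmed in this thread.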