Gavia Gray
What version of PyTorch are you running? The speed of grouped convolutions increased a lot in the most recent versions.
The entire script takes about 400ms for me to run, and the actual inference step `y = net(x)` takes about 70ms. The `infer.py` script never calls `.cuda()` so everything is...
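One way to separate whole-script time from the time of a single call such as `y = net(x)` is a small wall-clock helper. This is only a sketch: `time_call` is a hypothetical helper, not part of `infer.py`, and the warm-up and repeat counts are arbitrary choices.

```python
import time


def time_call(fn, *args, repeats=10, warmup=2):
    """Return (best_ms, mean_ms) wall-clock time for fn(*args).

    Hypothetical helper for illustration; not taken from infer.py.
    """
    # Warm-up calls so one-off setup cost is not counted in the timings.
    for _ in range(warmup):
        fn(*args)
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return min(timings_ms), sum(timings_ms) / len(timings_ms)
```

Assuming `net` and `x` are built as in the script, `best_ms, mean_ms = time_call(net, x)` would time only the forward pass. For a model on the GPU, the device would also need to be synchronized before reading the clock, which this sketch does not do.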
For completeness, I was running with PyTorch version 0.4.0, and `pip freeze` gave [this](https://gist.github.com/gngdb/6ea76f104d466a2763e643ce3cdc286c). I installed the conda env following the instructions to build PyTorch from source. Also, here are...
Sorry, didn't keep logs the first time. I'm running it now trying to match the settings from the paper and I'll write the full training logs to a file. Have...
Can you fit this implementation with batch size 1024 on 4 GPUs with groups=8? I've tried, and it's too big. I was going to run it with a batch size...
I've now run `groups=8`, following the training settings described in the paper as closely as I could. Unfortunately, it didn't match the performance reported in the paper. Top 1 is 63.372%, Top...
Yes, it is a linear learning rate decay; the comment in this function is just wrong: https://github.com/gngdb/ShuffleNet/blob/master/imagenet/train.py#L297-L301
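For reference, a linear decay schedule can be sketched as below. This is not the code from `train.py`: the function name, the choice to decay all the way to zero, and the per-step (rather than per-epoch) granularity are assumptions made for illustration.

```python
def linear_decay_lr(base_lr, step, total_steps):
    """Linearly anneal the learning rate from base_lr (step 0) to 0 (total_steps).

    Illustrative sketch; the name and the decay-to-zero endpoint are assumptions.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so out-of-range steps do not produce negative or inflated rates.
    progress = min(max(step, 0), total_steps) / total_steps
    return base_lr * (1.0 - progress)
```

With an illustrative `base_lr` of 0.5, the rate halfway through training is 0.25, and it reaches 0 at the final step.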
That's odd. Sorry, I don't have much time to look into it at the moment. What code are you using to test the accuracy? Could be something small, like the...
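When comparing accuracy numbers, it can help to check them against a plain top-k computation. The sketch below uses only plain Python lists. `topk_accuracy` is a hypothetical helper, not the evaluation code from this repository, and it assumes `targets` holds integer class indices.

```python
def topk_accuracy(scores, targets, k=1):
    """Percentage of samples whose true class is among the k highest scores.

    scores:  list of per-class score lists, one per sample
    targets: list of integer class indices, one per sample
    Hypothetical helper for illustration only.
    """
    if not scores or len(scores) != len(targets):
        raise ValueError("scores and targets must be non-empty and equal in length")
    correct = 0
    for row, target in zip(scores, targets):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            correct += 1
    return 100.0 * correct / len(targets)
```

Calling it with `k=1` and `k=5` on the same model outputs gives Top 1 and Top 5 figures that can be compared against whatever the test script reports.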
Sorry, don't know what could have gone wrong in that case. I'll get back to you if I find a moment to check.
Do you have any more info on how to replicate this problem? For example, using the example ImageNet training script I've added?