Improve Pascal VOC ref example: model weights

Open vfdev-5 opened this issue 5 years ago • 10 comments

🚀 Feature

When we run the baseline configuration:

python -m torch.distributed.launch --nproc 2 --use_env -m py_config_runner ./code/scripts/training.py ./configs/train/baseline_resnet101.py

When the model is instantiated, it downloads ImageNet pretrained weights for the ResNet backbone. In the case above, with two processes, the same file is downloaded twice. Let's set up the model so that training runs from scratch.

For Hacktoberfest/PyDataGlobal contributors: feel free to ask for details if you have any questions, and say so if you would like to tackle the issue. Please take a look at the CONTRIBUTING guide.

vfdev-5 avatar Sep 16 '20 12:09 vfdev-5

Just a question, would the issue amount to toggling the pre-trained parameter for the model to off, or is this too much of a reduction?

Either way, I am interested in taking up this issue as part of Hacktoberfest. If you can assign it to me, that would be great!

Edit: Nowhere does the code mention any pre-training, and the default value for pre-training in torchvision models is False. Also, I guess the weights for segmentation tasks come from COCO, not ImageNet. Correct me if I am wrong.

hershd23 avatar Sep 18 '20 16:09 hershd23

Hi @hershd23 , thanks for your interest to work on that.

would the issue amount to toggling the pre-trained parameter for the model to off, or is this too much of a reduction?

Sometimes it is interesting to see model performance without ImageNet pretrained weights. So I'd say no, it is not too much of a reduction.

Edit: Nowhere does the code mention any pre-training, and the default value for pre-training in torchvision models is False. Also, I guess the weights for segmentation tasks come from COCO, not ImageNet. Correct me if I am wrong.

There are two options: COCO pretrained weights and ImageNet pretrained weights. See here: https://github.com/pytorch/vision/blob/master/torchvision/models/segmentation/segmentation.py#L19 (pretrained_backbone=True)
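To make the two flags concrete, here is a minimal sketch. It assumes the baseline config builds torchvision's `deeplabv3_resnet101`, and it uses the `pretrained` / `pretrained_backbone` argument names from the linked torchvision code (newer torchvision releases use `weights` / `weights_backbone` instead). The helper names `segmentation_kwargs` and `build_model` are hypothetical, for illustration only.

```python
def segmentation_kwargs(from_scratch):
    """Keyword arguments for a torchvision segmentation model (hypothetical helper).

    pretrained=False          -> no COCO weights for the whole model
    pretrained_backbone=False -> no ImageNet weights for the ResNet backbone
    """
    return {
        "pretrained": False,
        # The backbone flag is the one that triggers the ImageNet download.
        "pretrained_backbone": not from_scratch,
    }


def build_model(from_scratch, num_classes=21):
    # Imported lazily so the helper above can be used without torchvision installed.
    from torchvision.models.segmentation import deeplabv3_resnet101

    return deeplabv3_resnet101(num_classes=num_classes, **segmentation_kwargs(from_scratch))
```

With `from_scratch=True` nothing should be downloaded. With `from_scratch=False` only the ImageNet backbone weights are fetched, which is the download that currently happens once per process.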

Thinking more about the issue, maybe we can provide two configs:

  • from scratch
  • with ImageNet pretrained weights, but model instantiation should download the weights on the rank 0 process only.
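A rough sketch of the second option, not taken from the ignite code base: rank 0 builds the model first (which downloads and caches the weights), while the other ranks wait at a barrier and then load from the cache. The function name `build_on_rank_zero_first` is hypothetical; the `barrier` callable is passed in by the caller, and the rank is read from the `RANK` environment variable that `torch.distributed.launch --use_env` sets.

```python
import os


def build_on_rank_zero_first(build_fn, barrier, rank=None):
    """Call build_fn on rank 0 before any other rank calls it (hypothetical helper)."""
    if rank is None:
        # torch.distributed.launch --use_env exports RANK for each process.
        rank = int(os.environ.get("RANK", "0"))

    if rank > 0:
        # Wait here until rank 0 has finished building (and downloading).
        barrier()

    model = build_fn()

    if rank == 0:
        # Release the other ranks; they now find the weights in the local cache.
        barrier()

    return model
```

In a real training script the `barrier` argument could be `torch.distributed.barrier` or `ignite.distributed.barrier`, called after the process group is initialized. On several machines the check would probably need `LOCAL_RANK` instead of `RANK`, since the download cache is per machine.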

PS: Notes for Hacktoberfest

September is Preptember – a full month for maintainers to groom your repositories and for contributors to learn about making quality pull requests.

Contributions during September don't count toward Hacktoberfest. Only pull requests submitted between October 1st to 31st will count.

vfdev-5 avatar Sep 18 '20 20:09 vfdev-5

Thanks for assigning me this task. I'll look into the suggestions!

hershd23 avatar Sep 20 '20 10:09 hershd23

@hershd23 any updates on this issue from your side ?

sdesrozis avatar Oct 02 '20 08:10 sdesrozis

Yeah, getting to it, sorry for the delay.

hershd23 avatar Oct 02 '20 10:10 hershd23

Hi, I was trying to install apex on my system, but according to https://github.com/NVIDIA/apex their support for Windows is experimental. (My Linux laptop is at my college :( ). If there is any workaround to run the Pascal VOC example on my laptop, I'd really appreciate the help!

hershd23 avatar Oct 02 '20 13:10 hershd23

@hershd23 can you try it inside Docker (for example, with our pytorchignite/apex-vision:latest image)? You could also try installing apex without compiling the C++ extensions, Python only. The last resort is to use Google Colab.

vfdev-5 avatar Oct 02 '20 13:10 vfdev-5

@hershd23 have you managed to install apex? Feel free to ask here if you need help getting started with this issue.

vfdev-5 avatar Oct 06 '20 21:10 vfdev-5

Hi, I have been trying Docker for some time now. I pulled the image from Docker Hub and even built it from the Dockerfile, but I am still facing issues. If you could tell me how to proceed, either with Docker or some other way, that would be great. Also, about running the code: I have a 6GB GPU available. Would that be enough to get around the issue?

hershd23 avatar Oct 07 '20 15:10 hershd23

It is difficult to say without seeing exactly what the issue is. Could you please say which OS and which prebuilt Docker image you are using, and what exactly the problem is?

Also, about running the code: I have a 6GB GPU available. Would that be enough to get around the issue?

You can set the batch_size to 4 or 6.

https://github.com/pytorch/ignite/issues/1297#issuecomment-695064086

Thinking more about the issue, maybe we can provide two configs

  • from scratch
  • with ImageNet pretrained weights, but model instantiation should download the weights on the rank 0 process only.

For the "from scratch" option, a single GPU is OK. The "with ImageNet pretrained weights" option, where only the rank 0 process downloads the weights, will probably be a bit difficult to reproduce on 1 GPU. Do you have any experience with distributed computation?

vfdev-5 avatar Oct 07 '20 15:10 vfdev-5