JieQin comments

Results: 21 comments of JieQin

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 2762685)

sys.platform: linux
Python: 3.7.3 (default, Jan 22 2021, 20:04:44) [GCC 8.3.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.3, V11.3.109
GCC: x86_64-linux-gnu-gcc (Debian...

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 2762685)

1. Running command: bash scripts/dist_train.sh 0,1,3,4,5,6,7 configs/base_nir/deeplabv3plus_r101.py
2. The training config is too complex.

ERROR:torch.distributed.elastic.multiprocessing.api:failed

Yes, I started it with tools/dist_train.sh, and the number of GPUs is 8.

Error in proposal generation.

Logically, there should be no predefined-size error. Please provide the specific code location where the error is reported so that the problem can be located.

ValueError: could not convert string '2007_000032' to int32 at row 0, column 1.

You should check whether the value of "args.train_list" is correct. It should normally be "voc12/train.txt" or "voc12/val.txt".
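The error message suggests the list file is being parsed as int32, so any entry such as '2007_000032' will fail. A quick way to check a list file before training is sketched below. The helper name `find_non_integer_tokens` is hypothetical and not part of the repository, and it assumes the loader expects whitespace-separated integer IDs. Note that Python's own `int()` accepts underscores in digit strings, so the check uses a strict pattern instead.

```python
import re

# Strict integer pattern: optional sign followed by ASCII digits only.
# (Python's int("2007_000032") succeeds, so int() is not a reliable check here.)
_INTEGER = re.compile(r"[+-]?[0-9]+")


def find_non_integer_tokens(lines, limit=5):
    """Return up to `limit` (row, column, token) triples that are not plain integers.

    `lines` is any iterable of strings, e.g. an open file handle.
    Rows and columns are 0-based.
    """
    bad = []
    for row, line in enumerate(lines):
        for col, token in enumerate(line.split()):
            if _INTEGER.fullmatch(token) is None:
                bad.append((row, col, token))
                if len(bad) >= limit:
                    return bad
    return bad


if __name__ == "__main__":
    # In-memory sample standing in for a list file (hypothetical contents).
    sample = ["2007000032\n", "2007_000032\n"]
    print(find_non_integer_tokens(sample))
```

To check a real file, pass an open handle, for example `find_non_integer_tokens(open(args.train_list))`. An empty result means every entry is a plain integer.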

About checkpoint

You can download the model checkpoint for the demo here: [model](https://drive.google.com/file/d/1X0oWfcpZo5bDkyFw7xiGBk_Yqx5gxhj_/view?usp=sharing).

how to get cam image?

You can get the CAM images by setting "make_cam_pass" to True.
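For readers unfamiliar with this kind of switch, here is a minimal sketch of how a boolean pass flag can gate a pipeline stage. It assumes the flag is exposed as a command-line option parsed with argparse. Only the name "make_cam_pass" comes from this thread. `str2bool`, `build_parser`, `run`, and the "make_cam" stage label are hypothetical and may not match the actual script.

```python
import argparse


def str2bool(value):
    """Parse textual booleans so that `--make_cam_pass True` works on the command line."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumption: the flag defaults to False, so the CAM stage is skipped unless enabled.
    parser.add_argument("--make_cam_pass", type=str2bool, default=False)
    return parser


def run(args):
    """Return the names of the stages that would run (placeholders, no real work)."""
    stages = []
    if args.make_cam_pass:
        stages.append("make_cam")  # hypothetical stage that would write the CAM images
    return stages


if __name__ == "__main__":
    print(run(build_parser().parse_args(["--make_cam_pass", "True"])))
```

With a layout like this, the stage runs only when the flag is passed as True on the command line or its default is changed in the script.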

forward() got an unexpected keyword argument 'step'

I have updated the code. Please retry it.

segmentation code

Thanks for following our work. The segmentation code is based on [DeepLab v2](https://github.com/kazuto1011/deeplab-pytorch).