Questions on performance gap
Hi,
Thank you for your code and paper. I've tried your code, but I cannot achieve the performance reported in the paper. Would you please help me figure out where the problem is?
In my experiments, all hyper-parameters follow the default settings in rum_sample.py and DeepLabV2. The performance comparison is as follows:
| Model | Initial Seed | Pseudo Mask | DeeplabV2 on Val |
|---|---|---|---|
| my exp. | 55.6 | 69.2 | 62.9 (64.9 after CRF) |
| paper | 55.6 | 69.9 | 68.1 |
I suspect the main difference comes from the DeepLabV2 hyper-parameters. Could you please point out any differences between my setup and yours that might cause this gap, or provide your training script?
Thank you!
I also have a similar issue.
Hi @stickyfiner and @seyeeet, I'm sorry for the late reply.
- Please check whether you used an ImageNet pre-trained model to initialize the segmentation model.
- We used the following hyper-parameters:
  - batch size: 10, iter_max: 30000, lr: 2.5e-4, dataset scales for training: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
  - We used a balanced cross-entropy loss, similar to DSRG (https://github.com/speedinghzl/DSRG).

Please try these things. Thanks!
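For anyone replicating this, here are those reported settings collected in one place. This is just a sketch; the dict keys are my own naming, not identifiers from the repo:

```python
# Hyper-parameters quoted above; key names are my own, not from the repo.
train_cfg = {
    "batch_size": 10,
    "iter_max": 30000,                                  # total training iterations
    "lr": 2.5e-4,                                       # base learning rate
    "scales": [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0],   # multi-scale training
}
```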
Thanks for your reply. I will try it soon.
@stickyfiner Thanks for the information. Can you also clarify whether we should use SGDROptimizer or PolyOptimizer (or whether another optimizer, e.g. Adam, should be used)?
Please also let us know the step size for dropping the learning rate, and by how much it should drop each time.
Could you provide the code for the balanced cross-entropy loss you mentioned above? Thanks!
Hi @seyeeet, sorry for the late reply.
We used the same optimizer and learning rate scheduler as https://github.com/kazuto1011/deeplab-pytorch. Please refer to this repo.
Thanks.
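For context, that repo trains with SGD plus momentum and a polynomial learning-rate decay. A minimal sketch of that schedule is below; the power of 0.9 is the value commonly used there, but treat the exact constants as assumptions to verify against the repo:

```python
def poly_lr(base_lr, cur_iter, max_iter, power=0.9):
    # Polynomial decay: lr shrinks from base_lr toward 0 over max_iter steps.
    return base_lr * (1 - cur_iter / max_iter) ** power

# With the hyper-parameters quoted earlier in the thread:
lr_start = poly_lr(2.5e-4, 0, 30000)      # equals the base lr at step 0
lr_mid = poly_lr(2.5e-4, 15000, 30000)    # decayed value at the halfway point
```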
Hi @allenwu97, sorry for the late reply.
We used the balanced cross-entropy loss as below:
```python
import torch

def criterion_balance(logit, label):
    # Per-pixel cross entropy; ignore_index=255 zeroes out unlabeled pixels.
    loss_structure = torch.nn.functional.cross_entropy(
        logit, label, reduction='none', ignore_index=255)
    ignore_mask_bg = torch.zeros_like(label)
    ignore_mask_fg = torch.zeros_like(label)
    ignore_mask_bg[label == 0] = 1                      # background pixels
    ignore_mask_fg[(label != 0) & (label != 255)] = 1   # foreground pixels
    # Average the loss separately over background and foreground, then
    # combine with equal weight so the (larger) background does not dominate.
    loss_bg = (loss_structure * ignore_mask_bg).sum() / ignore_mask_bg.sum()
    loss_fg = (loss_structure * ignore_mask_fg).sum() / ignore_mask_fg.sum()
    return (loss_bg + loss_fg) / 2
```
Thank you.
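As a quick sanity check, the loss above can be exercised on random tensors. This is a self-contained sketch (the 21-class shape follows Pascal VOC; the tensor shapes are my own choice for illustration):

```python
import torch

def criterion_balance(logit, label):
    # Same balanced loss as above: mean CE over background and over
    # foreground pixels, averaged with equal weight.
    loss = torch.nn.functional.cross_entropy(
        logit, label, reduction='none', ignore_index=255)
    mask_bg = (label == 0).float()
    mask_fg = ((label != 0) & (label != 255)).float()
    loss_bg = (loss * mask_bg).sum() / mask_bg.sum()
    loss_fg = (loss * mask_fg).sum() / mask_fg.sum()
    return (loss_bg + loss_fg) / 2

torch.manual_seed(0)
logit = torch.randn(2, 21, 8, 8)           # (N, C, H, W), 21 VOC classes
label = torch.randint(1, 21, (2, 8, 8))    # (N, H, W) foreground labels
label[:, :4, :] = 0                        # force a background region
label[0, 0, 0] = 255                       # one ignored pixel
loss = criterion_balance(logit, label)
```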
It really works for me, thanks for your reply!