Questions on performance gap

Open ghost opened this issue 4 years ago • 8 comments

Hi,

Thank you for your code and paper. I've tried your code, but I cannot achieve the same performance as reported in the paper. Would you please help me figure out where the problem is?

In my experiments, all hyper-parameters follow the default settings in run_sample.py and DeepLab-V2. The performance comparison is as follows:

| Model   | Initial Seed | Pseudo Mask | DeepLab-V2 on Val     |
|---------|--------------|-------------|-----------------------|
| my exp. | 55.6         | 69.2        | 62.9 (64.9 after CRF) |
| paper   | 55.6         | 69.9        | 68.1                  |

I guess the main difference may come from the DeepLab-V2 hyper-parameters. Could you please point out what differs between my experiments and yours that may cause the gap, or provide your training script?

Thank you!

ghost avatar May 10 '21 07:05 ghost

I also have a similar issue.

seyeeet avatar May 15 '21 05:05 seyeeet

Hi @stickyfiner and @seyeeet, I'm sorry for the late reply.

  1. Please check that you used an ImageNet pre-trained model to initialize the segmentation model.

  2. We used the following hyper-parameters:

     • batch size: 10, iter_max: 30000, lr: 2.5e-4, dataset scales for training: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]

  3. We used a balanced cross entropy loss, similar to DSRG (https://github.com/speedinghzl/DSRG).

Please try these things. Thanks!
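
As an illustration only, here is a minimal sketch of the multi-scale training augmentation these scales imply (the helper random_rescale and the tensor shapes are assumptions for this example, not taken from our training script):

import random
import torch
import torch.nn.functional as F

# Hyper-parameters from the list above.
BATCH_SIZE = 10
ITER_MAX = 30000
BASE_LR = 2.5e-4
SCALES = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]

def random_rescale(image, label):
    # Pick one training scale at random; resize the image bilinearly and the
    # label map with nearest-neighbour so class indices stay discrete.
    s = random.choice(SCALES)
    image = F.interpolate(image[None], scale_factor=s, mode='bilinear', align_corners=False)[0]
    label = F.interpolate(label[None, None].float(), scale_factor=s, mode='nearest')[0, 0].long()
    return image, label

# Example: rescale a 3-channel 321x321 image and its 321x321 label map.
image, label = random_rescale(torch.randn(3, 321, 321), torch.randint(0, 21, (321, 321)))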

jbeomlee93 avatar May 17 '21 06:05 jbeomlee93

Thanks for your reply. I will try it soon.

ghost avatar May 17 '21 08:05 ghost

@jbeomlee93 Thanks for the information. Can you also clarify whether we should use SGDROptimizer or PolyOptimizer (or whether some other optimizer, e.g. Adam, should be used)?

Please also let us know what step size you used to drop the learning rate, and by how much it should drop each time.

seyeeet avatar May 20 '21 22:05 seyeeet

Could you provide the code for the balanced cross entropy loss you mentioned above? Thanks!

allenwu97 avatar Jul 20 '21 03:07 allenwu97

Hi @seyeeet, sorry for the late reply.

We used the same optimizer and learning rate scheduler as https://github.com/kazuto1011/deeplab-pytorch. Please refer to that repo.
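
For context, that repo trains with SGD and a polynomial ("poly") learning-rate decay rather than fixed step drops. A minimal sketch, assuming the commonly used power of 0.9 (the momentum and weight decay values here are typical DeepLab-V2 defaults, not confirmed numbers):

import torch

def poly_lr(base_lr, it, max_iter, power=0.9):
    # Poly schedule: the LR decays smoothly toward 0 over max_iter iterations.
    return base_lr * (1 - it / max_iter) ** power

model = torch.nn.Conv2d(3, 21, 1)  # stand-in for the real segmentation model
optimizer = torch.optim.SGD(model.parameters(), lr=2.5e-4, momentum=0.9, weight_decay=5e-4)

for it in range(30000):
    for group in optimizer.param_groups:
        group['lr'] = poly_lr(2.5e-4, it, 30000)
    # ... forward pass, loss, backward, optimizer.step() ...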

Thanks.

jbeomlee93 avatar Jul 23 '21 06:07 jbeomlee93

Hi @allenwu97, sorry for the late reply.

We used the balanced cross entropy loss as below.

import torch

def criterion_balance(logit, label):
    # Per-pixel cross entropy; pixels labeled 255 are ignored.
    loss_structure = torch.nn.functional.cross_entropy(logit, label, reduction='none', ignore_index=255)

    # Binary masks selecting background (class 0) and foreground
    # (every valid non-background class) pixels.
    ignore_mask_bg = torch.zeros_like(label)
    ignore_mask_fg = torch.zeros_like(label)

    ignore_mask_bg[label == 0] = 1
    ignore_mask_fg[(label != 0) & (label != 255)] = 1

    # Average the loss separately over background and foreground pixels,
    # then weight the two averages equally.
    loss_bg = (loss_structure * ignore_mask_bg).sum() / ignore_mask_bg.sum()
    loss_fg = (loss_structure * ignore_mask_fg).sum() / ignore_mask_fg.sum()

    return (loss_bg + loss_fg) / 2
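
For reference, a quick shape check with dummy tensors (21 classes as in Pascal VOC; the shapes are just an example):

logit = torch.randn(2, 21, 321, 321)         # [batch, classes, H, W] raw scores
label = torch.randint(0, 21, (2, 321, 321))  # [batch, H, W] class indices
label[:, :16, :16] = 255                     # mark some pixels as ignored
print(criterion_balance(logit, label))       # scalar loss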

Thank you.

jbeomlee93 avatar Jul 23 '21 06:07 jbeomlee93

It really works for me, thanks for your reply!

allenwu97 avatar Jul 26 '21 07:07 allenwu97