auxiliary output
"We employ a cross-entropy loss and add an auxiliary output head together with an auxiliary loss to the output of the penultimate fusion layer." According to your paper, the resolution of the auxiliary output is one-fourth of the input image. Do you upsample the auxiliary output to the input resolution? How do you design the auxiliary loss?
I am also wondering how to fine-tune DPT-Hybrid on the Pascal Context dataset. I could not reproduce the reported 60.46% on it.
Yes, the output of the auxiliary layer is upsampled to the original input size using bilinear upsampling. The auxlayer is specified in the inference code; it is applied to "path_2" in the DPT base model. For fine-tuning:
- Our code is based on PyTorch-Encoding (https://github.com/zhanghang1989/PyTorch-Encoding), which should be a good starting point.
- The hyper-parameters are listed in the paper. Please let us know if anything here is unclear, ambiguous, or missing.
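To make the auxiliary loss described above more concrete, here is a minimal sketch of how it could be combined with the main loss in PyTorch. It is not the actual training code: the function name `segmentation_loss_with_aux`, the `aux_weight` default, the `ignore_index` value, and the `align_corners` setting are all assumptions for illustration, so please take the real values from the paper and the training framework you use.

```python
import torch
import torch.nn.functional as F


def segmentation_loss_with_aux(main_logits, aux_logits, target,
                               aux_weight=0.2, ignore_index=-1):
    """Cross-entropy on the main head plus weighted cross-entropy on the aux head.

    aux_weight and ignore_index are placeholder values (assumptions);
    use the ones from the paper and your dataset loader.
    """
    # Bilinearly upsample the low-resolution auxiliary logits to the label size.
    # align_corners=False is an assumption, not a confirmed setting.
    aux_up = F.interpolate(aux_logits, size=target.shape[-2:],
                           mode="bilinear", align_corners=False)
    # main_logits are assumed to be at the label resolution already.
    main_loss = F.cross_entropy(main_logits, target, ignore_index=ignore_index)
    aux_loss = F.cross_entropy(aux_up, target, ignore_index=ignore_index)
    return main_loss + aux_weight * aux_loss


# Toy shapes: batch of 2, 5 classes, 64x64 labels, aux head at 1/4 resolution.
main_logits = torch.randn(2, 5, 64, 64)
aux_logits = torch.randn(2, 5, 16, 16)
target = torch.randint(0, 5, (2, 64, 64))
loss = segmentation_loss_with_aux(main_logits, aux_logits, target)
```

Upsampling the logits rather than downsampling the labels keeps the label map untouched, which avoids interpolating class indices.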
Thanks for your reply!