DPT auxiliary output

"We employ a cross-entropy loss and add an auxiliary output head together with an auxiliary loss to the output of the penultimate fusion layer." According to your paper, the resolution of the auxiliary output is one-fourth of the input image. Do you upsample the auxiliary output to the input resolution? How do you design the auxiliary loss?

Aug 28 '21 02:08 Nickiris

I also wonder how to fine-tune DPT-Hybrid on the Pascal Context dataset. I couldn't reproduce the reported result of 60.46% on it.

Sep 02 '21 03:09 Nickiris

Yes, the output of the auxiliary layer is upsampled to the original input size using bilinear upsampling. The auxlayer is specified in the inference code. It is applied to "path_2" in the DPT base model. For fine-tuning:

Our code is based on PyTorch-Encoding (https://github.com/zhanghang1989/PyTorch-Encoding), which should be a good starting point.
The hyper-parameters are listed in the paper. Please let us know if anything here is unclear, ambiguous, or missing.
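
As a rough illustration of the auxiliary loss described above, here is a minimal sketch, not the actual training code. It assumes the main and auxiliary heads both return per-class logits, upsamples each to the label resolution with bilinear interpolation, and sums two cross-entropy terms. The function name, the aux_weight value of 0.2, the ignore_index of -1, and the align_corners setting are all placeholders chosen for illustration; the values actually used should be taken from the paper.

```python
import torch
import torch.nn.functional as F


def upsample_to(logits, size):
    """Bilinearly resize (N, C, h, w) logits to `size` = (H, W) if needed."""
    if logits.shape[-2:] == tuple(size):
        return logits
    # align_corners=True is an assumption made for this sketch.
    return F.interpolate(logits, size=size, mode="bilinear", align_corners=True)


def segmentation_loss_with_aux(main_logits, aux_logits, target,
                               aux_weight=0.2, ignore_index=-1):
    """Cross-entropy on the main output plus a weighted auxiliary term.

    aux_weight and ignore_index are hypothetical defaults, not values
    confirmed by the authors.
    """
    size = target.shape[-2:]
    main_loss = F.cross_entropy(upsample_to(main_logits, size), target,
                                ignore_index=ignore_index)
    aux_loss = F.cross_entropy(upsample_to(aux_logits, size), target,
                               ignore_index=ignore_index)
    return main_loss + aux_weight * aux_loss


# Usage with random tensors: 5 classes, 32x32 labels, aux output at 1/4 resolution.
torch.manual_seed(0)
main = torch.randn(2, 5, 32, 32)
aux = torch.randn(2, 5, 8, 8)
target = torch.randint(0, 5, (2, 32, 32))
loss = segmentation_loss_with_aux(main, aux, target)
```

Upsampling the logits rather than downsampling the labels keeps the supervision at full label resolution, which matches the reply above.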

Sep 06 '21 08:09 ranftlr

Thanks for your reply!

Sep 11 '21 11:09 Nickiris