X-Decoder
Question about box prediction
In the line linked below, the code appears to compute the boxes from the predicted masks,
https://github.com/microsoft/X-Decoder/blob/165f8a6314ac84f5c36aaab7216f90dd97e38a43/modeling/architectures/xdecoder_model.py#L922
but in line 913, the predicted boxes are already obtained.
https://github.com/microsoft/X-Decoder/blob/165f8a6314ac84f5c36aaab7216f90dd97e38a43/modeling/architectures/xdecoder_model.py#L913
Why not use the predicted bboxes directly?
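
To be concrete about what I mean by getting a box from a mask, here is a minimal sketch in plain Python. It assumes a single binary mask given as a list of rows, and `mask_to_box` is a hypothetical helper written for this question, not a function from the X-Decoder repository; the code at the linked line may well do this differently.

```python
def mask_to_box(mask):
    """Return the tight (x0, y0, x1, y1) box around the nonzero pixels of a 2D mask.

    `mask` is a list of rows (hypothetical input format for illustration).
    An empty mask yields (0, 0, 0, 0).
    """
    # Row indices that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    # Column indices of every foreground pixel.
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return (0, 0, 0, 0)
    # The +1 makes the right and bottom edges exclusive.
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


example_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
example_box = mask_to_box(example_mask)
```

A box obtained this way depends only on the mask, so it can differ from the box produced by the box head, which is the source of my question.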