Question about the FSOD paper
Thank you very much for sharing the FSOD code. I have some questions after reading the FSOD paper, and I would appreciate your help:

- Does step 1 indicate the encoder features from the original image? Does step 2 indicate the ROI pooling features?
- What is the output of the Multi-Relation Head (step 3)? Could you please tell me the shape of the output?
- For application, we only need the output ROIs from step 2. Am I right?
- Is the output of step 1 a probability? If not, how can the "Attention RPN" calculate the attention? In my understanding, attention values should lie between 0 and 1.
I am looking forward to your reply.
Good question. I am waiting for answers too.
@fhong-jpg I still don't know the output of Multi-Relation Head (Question 2).
- Step 2 does indicate the ROI pooling features.
- For application, we need the output of step 4, because it keeps the true objects and removes the others.
- The output of step 1 is not a probability. The "Attention" actually means correlation.
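To illustrate why a correlation does not have to lie between 0 and 1, here is a minimal sketch. It assumes, as I read the paper, that the support feature is pooled to a small kernel and correlated with the query feature map channel by channel. The 1x1 kernel size, the function name `depthwise_correlation`, and the toy numbers are my own assumptions for illustration, not the repository's code.

```python
def depthwise_correlation(query, support_vec):
    """Correlate a query feature map with a 1x1 support kernel, per channel.

    query: list of C channels, each an H x W nested list of floats.
    support_vec: list of C floats (support feature pooled to 1x1 per channel).
    With a 1x1 kernel, the correlation reduces to scaling each channel.
    """
    if len(query) != len(support_vec):
        raise ValueError("query and support must have the same channel count")
    return [
        [[s * v for v in row] for row in channel]
        for channel, s in zip(query, support_vec)
    ]


# Toy query feature map: 2 channels, each 2 x 2 (hypothetical values).
query = [
    [[1.0, 2.0], [3.0, 4.0]],    # channel 0
    [[-1.0, 0.5], [0.0, 2.0]],   # channel 1
]
# Toy support vector: one value per channel (hypothetical values).
support_vec = [2.0, -3.0]

attention = depthwise_correlation(query, support_vec)

# Flatten to inspect the value range of the "attention" map.
flat = [v for channel in attention for row in channel for v in row]
print(min(flat), max(flat))
```

The resulting map has the same shape as the query feature map, and its values are unbounded (here they include negatives and values above 1), which is consistent with reading "attention" as correlation rather than as a probability.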
I hope someone can answer Question 2.
@NeuZhangQiang According to the paper, "The multi-relation detector then matches the query proposals and the support object". The outputs of RoI Pooling are small matrices, e.g. 4 by 4, and the multi-relation head then "matches" these matrices. For each pair, I guess the multi-relation head outputs a probability (whether the pair matches or not).
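To make that guess concrete, here is a toy sketch of what "one probability per pair" would look like in terms of shape. This is not the paper's head: cosine similarity followed by a sigmoid stands in for the learned relation heads, the RoI features are 2 by 2 instead of 4 by 4 to keep it short, and `match_probability` is a hypothetical name.

```python
import math


def match_probability(proposal, support):
    """Toy matching score: cosine similarity squashed into (0, 1) by a sigmoid.

    This is a stand-in for a learned relation head, used only to show
    that each (proposal, support) pair yields a single scalar.
    """
    p = [v for row in proposal for v in row]   # flatten proposal RoI feature
    s = [v for row in support for v in row]    # flatten support RoI feature
    dot = sum(a * b for a, b in zip(p, s))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in s))
    cosine = dot / norm if norm > 0 else 0.0
    return 1.0 / (1.0 + math.exp(-cosine))


# One support RoI feature and three query proposals (hypothetical values).
support = [[1.0, 0.0], [0.0, 1.0]]
proposals = [
    [[2.0, 0.0], [0.0, 2.0]],     # same pattern as the support
    [[0.0, 1.0], [1.0, 0.0]],     # orthogonal to the support
    [[-1.0, 0.0], [0.0, -1.0]],   # opposite of the support
]

# One score per proposal: a list of length N for N proposals.
scores = [match_probability(p, support) for p in proposals]
print(len(scores), scores)
```

Under this reading, the output for N proposals and one support object would be N scalars, each between 0 and 1, with higher values for proposals that resemble the support.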
@fhong-jpg However, according to the picture, the output of the multi-relation detector (step 3) is the input of "Match" (step 4). If the output of the multi-relation head is a probability, how can it be the input of "Match"?