Shimao Zhang
Results
2
issues of
Shimao Zhang
 Does this output mean the model can't find the corresponding box in the picture?
Thanks for your great work! And I notice that you utilized a non-linear layer with GELU and a LayerNorm operation and a linear layer called decoder as the voken classification...