Questions about calculation of mAP

It seems mAP is not on the benchmark. How do you calculate it? Thank you!
You can compute it offline on the validation set by modifying this Python file and adding a code section similar to the following at L307:
```python
super_metrics = ['bbox', 'segm']
if super_metrics:
    if 'bbox' in super_metrics and 'segm' in super_metrics:
        super_results = []
        for bbox, segm in zip(results['bbox_result'],
                              results['segm_result']):
            super_results.append((bbox, segm))
    else:
        super_results = results['bbox_result']
    super_eval_results = super().evaluate(
        results=super_results,
        metric=super_metrics,
        logger=logger,
        classwise=classwise,
        proposal_nums=proposal_nums,
        iou_thrs=iou_thr,
        metric_items=metric_items)
    eval_results.update(super_eval_results)
```
The online evaluation benchmark does not support mAP computation.
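For reference, the key step in the snippet above is pairing the per-frame bbox and mask outputs into `(bbox, segm)` tuples before handing them to the COCO-style evaluator. A minimal, self-contained sketch of that pairing (the `det_*` / `mask_*` placeholders are hypothetical stand-ins for real detection and mask results):

```python
# Hypothetical per-frame outputs; in PCAN these come from
# results['bbox_result'] and results['segm_result'].
bbox_results = [["det_a", "det_b"], ["det_c"]]
segm_results = [["mask_a", "mask_b"], ["mask_c"]]

super_metrics = ["bbox", "segm"]
if "bbox" in super_metrics and "segm" in super_metrics:
    # COCO-style segm evaluation expects (bbox, segm) tuples per image
    super_results = [(b, s) for b, s in zip(bbox_results, segm_results)]
else:
    super_results = bbox_results

print(super_results[0])  # (['det_a', 'det_b'], ['mask_a', 'mask_b'])
```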
Thank you, it works!
I have another question. As Table 3 shows, the memory in PCAN helps improve the segmentation scores on YouTube-VIS. If I want to validate it on BDD100K, which part of the code should I change?
When I comment out and change parts of `quasi_dense_pcan_seg_refine.py`, it seems the performance on BDD100K improves.
First, do not update `x` with memory.
Second, only update memory on the first frame.
Finally, the mAP score is even improved.
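The two changes described above could be sketched like this (this is a hypothetical illustration, not the actual PCAN module; `MemoryModule`, `refine_with_memory`, and `update_first_frame_only` are made-up names, and a scalar stands in for the real feature tensor):

```python
# Hypothetical sketch of the two modifications:
# 1) do not refine x with the memory read-out
# 2) only write to the memory on the first frame of a video
class MemoryModule:
    def __init__(self):
        self.memory = None

    def forward(self, x, frame_id, refine_with_memory=False,
                update_first_frame_only=True):
        # Change 1: skip the memory read-out (placeholder for the
        # real attention-based refinement)
        if refine_with_memory and self.memory is not None:
            x = x + self.memory
        # Change 2: write memory only on the first frame
        if frame_id == 0 or not update_first_frame_only:
            self.memory = x
        return x

m = MemoryModule()
out0 = m.forward(1.0, frame_id=0)  # first frame: memory is written
out1 = m.forward(2.0, frame_id=1)  # memory untouched, x not refined
print(m.memory, out1)  # 1.0 2.0
```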

Do I misunderstand something?
In MOTS, we weigh the MOTA metric more, because single-frame mask AP does not account for object tracking accuracy. The mask AP in YTVIS is also a tube-mask AP (which includes object tracking), not a single-frame mask AP. The MOTA in your new figure is only 2.2%; is something going wrong there?
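To make the distinction concrete, here is a sketch of the standard CLEAR-MOT definition of MOTA, which penalizes identity switches on top of per-frame detection errors (the example numbers are made up):

```python
# CLEAR-MOT MOTA: 1 - (FN + FP + IDSW) / number of ground-truth objects
def mota(fn, fp, idsw, num_gt):
    return 1.0 - (fn + fp + idsw) / num_gt

# A method with decent per-frame masks but many identity switches
# can still end up with a very low MOTA:
print(mota(fn=100, fp=50, idsw=300, num_gt=500))  # 0.1
```

This is why freezing the memory can leave single-frame mask AP intact, or even improve it, while MOTA collapses.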
It is in line with my expectation that MOTA drops significantly, because I stopped updating the memory and `x`, and the memory is very important for matching.
What confuses me is that the memory does not improve the image-level mAP, as shown in my third screenshot. Intuitively, with the help of memory, the image-level performance should also improve; in fact, it gets worse with temporal memory. In other tasks (e.g. VOS or VSS), memory usually improves image-level performance. This is the structure in Figure 2.

So I want to validate the impact of the memory on image-level scores (e.g. image-level instance-segmentation bbox_mAP or mask_mAP). I listed my changes above; did I get something wrong in the code? If so, could you explain how to validate the impact of the memory on image-level scores?
Thanks for your help.

Hi, I just got an mAP result exactly the same as the one posted by 9p15p, which however doesn't match the results in the paper. Could you tell me why this is happening? Thanks a lot!

In particular, the MOTA / FN / FP / IDs in our results differ a lot from the 'Scores-val' in this repository. I'm wondering why that is happening; could you please tell me the probable reason for it? That would help me a lot. Thanks!
> I have another question. As Table 3 shows, the memory in PCAN helps improve the segmentation scores on YouTube-VIS. If I want to validate it on BDD100K, which part of the code should I change?
> When I comment out and change parts of `quasi_dense_pcan_seg_refine.py`, it seems the performance on BDD100K improves. First, do not update `x` with memory. Second, only update memory on the first frame. Finally, the mAP score is even improved.
> Do I misunderstand something?
Hi, I'm also validating the impact of the memory on the per-frame mAP performance of PCAN, and I'm confused by exactly the same question you've pointed out.
So, did you figure out why the memory does not help the per-frame performance?