InterpretDL

Grad-CAM or Score-CAM visualization with a Mask R-CNN model

Open jessecanada opened this issue 5 years ago • 6 comments

Hi there. I am trying to implement CAM visualizations with a Mask R-CNN model. As you know, Mask R-CNN performs classification per ROI, but the backbone network (e.g., an FPN with ResNet-50 conv blocks) extracts features over the entire input image. Could you provide some guidance on how to use InterpretDL to generate CAMs with a Mask R-CNN model? Much appreciated!

jessecanada • Jan 16 '21 05:01

Hi, @jessecanada

We have done something similar for a YOLO model from PaddleDetection. Let me find the code and see if we can do the visualization directly for a Mask R-CNN model.

I'll get back to you on Monday or Tuesday ;)

holyseven • Jan 16 '21 07:01

Hi, @jessecanada

We've tried Grad-CAM on a Mask R-CNN model based on PaddleDetection. We are able to visualize/interpret a bounding box prediction and its confidence with respect to the ROI. One of the visualizations looks like this:

[image]

We are still trying to figure out how to incorporate object detection tasks into InterpretDL, but here is how we implemented it based on PaddleDetection:

  1. Modify the get_prediction function so that it also outputs cls_prob and bbox_pred.
  2. In the Mask R-CNN code, output roi_feat in single_scale_eval, and in build, compute and output the gradients of cls_prob or bbox_pred with respect to roi_feat.
  3. In tools/infer.py, comment out the test-mode program so that gradients can be computed without error, then save the gradients and roi_feat for visualization.
  4. Run tools/infer.py, specifying the architecture, weights, and image.
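
Once roi_feat and its gradients are saved, the heatmap itself is a short computation. Here is a minimal NumPy sketch of that last step, assuming the saved arrays for a single ROI both have shape (C, H, W). The function name grad_cam and that layout are assumptions for illustration; they are not part of PaddleDetection or InterpretDL.

```python
import numpy as np


def grad_cam(roi_feat, grads):
    """Compute a Grad-CAM heatmap for one ROI.

    roi_feat: array of shape (C, H, W), the saved ROI feature map (assumed layout).
    grads:    array of shape (C, H, W), gradient of the chosen score w.r.t. roi_feat.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    roi_feat = np.asarray(roi_feat, dtype=np.float64)
    grads = np.asarray(grads, dtype=np.float64)
    if roi_feat.ndim != 3 or roi_feat.shape != grads.shape:
        raise ValueError("roi_feat and grads must both have shape (C, H, W)")

    # Channel weights: average the gradients over the spatial dimensions.
    weights = grads.mean(axis=(1, 2))

    # Weighted sum of the feature channels, followed by ReLU.
    cam = np.maximum(np.tensordot(weights, roi_feat, axes=1), 0.0)

    # Scale to [0, 1]; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Usage with random stand-ins for the saved arrays (256 channels, 7x7 ROI).
rng = np.random.default_rng(0)
heatmap = grad_cam(rng.normal(size=(256, 7, 7)), rng.normal(size=(256, 7, 7)))
print(heatmap.shape)
```

The resulting H x W map would then be resized to the predicted box and overlaid on the input image.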

If you have any further questions, please let us know!

XuanyuWu123 • Jan 21 '21 09:01

Hi @XuanyuWu123

I'm looking for a way to visualize heatmaps for Mask R-CNN. Could you show me how to implement Grad-CAM on a Mask R-CNN model?

thanks

nomurakeiya • Jul 13 '21 22:07

Hi @nomurakeiya @XuanyuWu123 Did you figure out how to implement Grad-CAM on Mask R-CNN?

Kartiky246 • Jan 05 '22 09:01

Hello all,

Thanks for your interest in our repo.

For the implementation of Grad-CAM on Mask R-CNN, there are several points that need to be clarified:

  • A Mask R-CNN has three outputs: the bounding box coordinates, the cls prediction for that box, and the mask.
  • It is easy to get an explanation of the cls prediction for a given box: that is the heatmap computed by Grad-CAM or other algorithms. However, explanations of the bounding box coordinates or the mask are not well defined. Please tell us if you need other kinds of explanation results.
  • For the heatmap, one problem is that in eval mode Mask R-CNN applies an NMS step (bbox_post_process), which stops the computation of gradients, so the final outputs cannot be explained directly.
  • But we can still compute the gradients of the raw outputs (where bbox_head outputs 1000 boxes), and Grad-CAM can be computed from these.
  • For PaddleDetection, we are still thinking about how to explain the model directly with our tool. At this moment, a possible way is to modify the source code of PaddleDetection, get the feature map and gradients of a certain layer, and then compute Grad-CAM.
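
Because the gradients have to come from the raw outputs, a final detection first needs to be tied back to one of the raw boxes. One possible way, sketched here in plain Python, is to pick the raw box with the highest IoU against the final box and then backpropagate that raw box's class score. This matching rule and the helper names iou and match_raw_box are assumptions for illustration, not something PaddleDetection or InterpretDL provides; boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_raw_box(final_box, raw_boxes):
    """Return (index, IoU) of the raw pre-NMS box that best overlaps final_box."""
    if not raw_boxes:
        raise ValueError("raw_boxes must not be empty")
    overlaps = [iou(final_box, raw) for raw in raw_boxes]
    best = max(range(len(overlaps)), key=overlaps.__getitem__)
    return best, overlaps[best]


# Usage: three hypothetical raw boxes and one final detection after NMS.
raw_boxes = [(0, 0, 10, 10), (20, 20, 40, 40), (21, 19, 41, 39)]
index, overlap = match_raw_box((20, 20, 40, 40), raw_boxes)
print(index, overlap)
```

The returned index selects which row of the raw cls output to differentiate with respect to the chosen feature map.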

Let us know if there are more questions.

Cheers

holyseven • Jan 12 '22 09:01

For anyone who is still interested in obtaining explanation heatmaps for Mask R-CNN or YOLO-like models, we have published a tutorial showing the visualization results. Hope this helps.

holyseven • Jul 11 '22 08:07