ROS Real-Time Inference
Hi, I have a few questions after looking at your ROS implementation! Similar to you, I'm using MaskRCNN w/ DenseFusion (but with both trained on my own synthetic images).
I was able to implement my own ROS node after a day of configuring my conda env, which required:
- a version of MaskRCNN compatible with Python 2.7, and
- merging my TensorFlow 1.14.0 (MaskRCNN) and PyTorch (DenseFusion) envs.
I'm able to run this live w/ a ZED camera on an RGB-D image, getting an inference rate of ~2 fps. I run MaskRCNN and DenseFusion live on separate GPUs; my setup is 2x GeForce RTX 2080 Tis.
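For context, here is a minimal sketch of how I pin each framework to its own GPU within one process (the commented-out `PoseNet` handle is a placeholder, not code from either repo):

```python
# Minimal sketch: pin each framework to its own GPU within one process.
import tensorflow as tf
import torch

# MaskRCNN (TF 1.14 / Keras): make only GPU 0 visible to the TF session
tf_config = tf.ConfigProto()
tf_config.gpu_options.visible_device_list = "0"
tf_config.gpu_options.allow_growth = True  # don't pre-allocate all of GPU 0
sess = tf.Session(config=tf_config)

# DenseFusion (PyTorch): run on GPU 1
device = torch.device("cuda:1")
# pose_net = PoseNet(...).to(device)  # placeholder, not the actual DenseFusion API
```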
I understand that your pipeline is more complex than grabbing a ZED RGB-D image -> Segmentation -> Pose Estimation, but I'd be curious how I can decrease inference time for MaskRCNN.
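For concreteness, my node is structured roughly like the sketch below. The topic names assume the ZED ROS wrapper defaults, and the two commented-out inference calls are placeholder handles:

```python
# Sketch of the grab -> segment -> estimate-pose loop as a ROS node.
import rospy
import message_filters
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def callback(rgb_msg, depth_msg):
    # Convert the synchronized RGB and depth messages to numpy arrays
    rgb = bridge.imgmsg_to_cv2(rgb_msg, "bgr8")
    depth = bridge.imgmsg_to_cv2(depth_msg, "32FC1")
    # masks = maskrcnn.detect([rgb], verbose=0)   # segmentation (placeholder handle)
    # pose = densefusion(rgb, depth, masks)       # pose estimation (placeholder handle)

rospy.init_node("pose_pipeline")
rgb_sub = message_filters.Subscriber("/zed/zed_node/rgb/image_rect_color", Image)
depth_sub = message_filters.Subscriber("/zed/zed_node/depth/depth_registered", Image)
# Approximately sync RGB and depth frames by timestamp before inference
sync = message_filters.ApproximateTimeSynchronizer([rgb_sub, depth_sub],
                                                   queue_size=5, slop=0.05)
sync.registerCallback(callback)
rospy.spin()
```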

One thing I noticed in your repo is that you run MaskRCNN for, e.g., 100 frames and save the masks for DenseFusion. I believe MaskRCNN is faster when operating on a batch rather than on one image at a time.
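To sketch what I mean with the matterport Mask_RCNN API (the config values are illustrative, and `detect()` expects exactly `GPU_COUNT * IMAGES_PER_GPU` images per call):

```python
# Sketch: buffer frames and run matterport MaskRCNN on a batch.
import numpy as np
import mrcnn.model as modellib
from mrcnn.config import Config

class BatchedInferenceConfig(Config):
    NAME = "batched"        # hypothetical config name
    NUM_CLASSES = 1 + 1     # assumption: background + one object class
    GPU_COUNT = 1
    IMAGES_PER_GPU = 4      # batch of 4 frames per forward pass

config = BatchedInferenceConfig()
model = modellib.MaskRCNN(mode="inference", config=config, model_dir="logs")
model.load_weights("mask_rcnn_weights.h5", by_name=True)  # placeholder weights path

# Buffer exactly IMAGES_PER_GPU frames (dummy arrays here), then detect in one call
frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(4)]
results = model.detect(frames, verbose=0)  # returns one result dict per frame
```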
Any feedback is appreciated!
Hi, it should be faster if you put multiple images into a batch and run inference on the whole batch. In addition, there are a number of things to consider if you want to reduce execution time, such as using a smaller backbone network, using smaller images, reducing the maximum number of instances per image, etc., as sketched below. Good luck!
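For example, in a matterport-style config those knobs could look like this; the values are illustrative trade-offs, not tuned recommendations:

```python
# Sketch: config knobs that trade accuracy for speed in matterport MaskRCNN.
from mrcnn.config import Config

class FastInferenceConfig(Config):
    NAME = "fast"                  # hypothetical config name
    NUM_CLASSES = 1 + 1            # assumption: background + one object class
    BACKBONE = "resnet50"          # smaller backbone (library default is resnet101)
    IMAGE_MIN_DIM = 512            # smaller input images (library defaults: 800/1024)
    IMAGE_MAX_DIM = 512            # final square size; must be divisible by 64
    DETECTION_MAX_INSTANCES = 10   # fewer instances per image (default is 100)
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
```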