Question about how to use it
Dear Luo, I have read the paper on how to select filters to prune, and I have a few questions. First, should I calculate the score of every channel in every layer? If so, I think that is a little inconvenient to use. Second, every layer has its top k filters to remove; how should I set this k? Should it be changed for every dataset? Maybe I could set a single k for the whole network, but I don't think I would get good performance that way. Third, how can I apply KL-divergence to object detection or instance segmentation? I think real-world applications are not only classification but, more often, object detection. How can filter pruning be applied to such models?
Maybe I have misunderstood: does your method calculate the KL-divergence for every channel of every layer?