Deep Learning Detector with Trained Model
Dear all, I have been using PySceneDetect recently, and I am truly impressed with its capabilities. However, I believe there is room for improvement, especially in detection accuracy, and I would like to suggest incorporating deep learning models to enhance it. These models have shown remarkable results in various detection tasks, and I think they could bring substantial benefits to your project. The following links point to some excellent deep learning models that might be suitable for your use case:
https://github.com/wentaozhu/AutoShot
https://github.com/soCzech/TransNetV2
Best regards,
I'm open to having the ability to plug-and-play deep learning models into PySceneDetect for detection purposes, but I'm not sure of the best way to integrate that. Will definitely have a look at the links you provided. One thing I've noticed is that a lot of folks are using PySceneDetect as part of pre-processing for data ingestion pipelines, so I'm wary of introducing some kind of feedback loop there.
I'm open to extending PySceneDetect to include capabilities to use pre-trained models, but this first requires coming up with a standardized interface for them. If anyone could help drive that aspect forward, please reach out!
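To make the interface question a bit more concrete, here is a rough sketch of what a model-agnostic contract could look like. Everything in it is hypothetical and only for discussion: `TransitionModel`, `cuts_from_probabilities`, `detect_cuts`, and the toy `BrightnessJumpModel` are illustration names, not part of PySceneDetect or of any of the linked projects. It assumes a model can be reduced to "one transition probability per frame", and it stands in a single brightness value for each decoded frame so the example runs without any dependencies.

```python
from typing import List, Protocol, Sequence


class TransitionModel(Protocol):
    """Hypothetical contract: one transition probability per input frame."""

    def predict(self, frames: Sequence[float]) -> List[float]:
        ...


def cuts_from_probabilities(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the index of the first frame in each run at or above threshold."""
    cuts = []
    above = False
    for index, prob in enumerate(probs):
        is_above = prob >= threshold
        if is_above and not above:
            cuts.append(index)
        above = is_above
    return cuts


class BrightnessJumpModel:
    """Toy stand-in for a trained model (NOT a real detector).

    Each "frame" is a mean brightness in [0, 255]. The score for a frame is
    the absolute brightness change from the previous frame, scaled to [0, 1].
    """

    def predict(self, frames: Sequence[float]) -> List[float]:
        probs = [0.0]  # the first frame has no predecessor
        for prev, cur in zip(frames, frames[1:]):
            probs.append(min(1.0, abs(cur - prev) / 255.0))
        return probs


def detect_cuts(model: TransitionModel, frames: Sequence[float],
                threshold: float = 0.5) -> List[int]:
    """Run any object satisfying TransitionModel and threshold its output."""
    return cuts_from_probabilities(model.predict(frames), threshold)


if __name__ == "__main__":
    brightness = [10, 12, 11, 200, 198, 199, 20, 22]
    print(detect_cuts(BrightnessJumpModel(), brightness))
```

The point of keeping the contract this narrow is that a wrapper around a real pre-trained network would only need to expose `predict`, while thresholding and turning cut indices into scenes could stay shared. Whether per-frame probabilities are the right common denominator for every model is an open question.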
A recent blog post also shows how important this is for the future. It looks like TransNetV2 might be a good candidate to start with for integrating this. I haven't had time yet to dig into how the code works, but the model itself looks reasonable in size (roughly 30 MiB).
Definitely worth exploring some more! I'll be doing so when I get some time, but as I mentioned already, if folks would like to help out with this, please reach out ☄️