PySceneDetect Deep Learning Detector with Trained Model

Dear all, I have been using PySceneDetect recently, and I am truly impressed with its capabilities. However, I believe there is room for improvement, especially in terms of detection accuracy.

Suggestion: incorporate deep learning models to enhance detection performance. These models have shown remarkable results in various detection tasks, and I think they could bring substantial benefits to the project. The following links point to some excellent deep learning models that might be suitable for this use case:

https://github.com/wentaozhu/AutoShot

https://github.com/soCzech/TransNetV2

Best regards,

May 22 '25 04:05 Cong-Lee

I'm open to having the ability to plug-and-play deep learning models into PySceneDetect for detection purposes, but I'm not sure of the best way to integrate that. Will definitely have a look at the links you provided. One thing I've noticed is that a lot of folks are using PySceneDetect as part of pre-processing for data ingestion pipelines, so I'm wary of introducing some kind of feedback loop there.

I'm open to extending PySceneDetect to include capabilities to use pre-trained models, but this first requires coming up with a standardized interface for them. If anyone could help drive that aspect forward, please reach out!
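To make the discussion concrete, here is a rough sketch of what such a standardized interface might look like. Everything in it is hypothetical and for illustration only: `FrameScorer`, `cuts_from_scores`, `detect_cuts`, and `PrebakedScorer` are made-up names, not part of PySceneDetect or of any of the linked models. It assumes a model can be wrapped so that it returns one transition probability per frame, which may not hold for every model.

```python
from typing import List, Protocol, Sequence


class FrameScorer(Protocol):
    """Hypothetical interface a wrapped pre-trained model would satisfy."""

    def score(self, frames: Sequence[object]) -> List[float]:
        """Return one transition probability in [0, 1] per input frame."""
        ...


def cuts_from_scores(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the frame indices where the score rises to or above threshold.

    Consecutive frames above the threshold are treated as a single cut,
    reported at the first frame of the run.
    """
    cuts: List[int] = []
    above = False
    for index, value in enumerate(scores):
        is_above = value >= threshold
        if is_above and not above:
            cuts.append(index)
        above = is_above
    return cuts


def detect_cuts(model: FrameScorer, frames: Sequence[object],
                threshold: float = 0.5) -> List[int]:
    """Run any FrameScorer over the frames and convert scores to cut indices."""
    return cuts_from_scores(model.score(frames), threshold)


class PrebakedScorer:
    """Toy stand-in for a real model: replays scores supplied up front."""

    def __init__(self, scores: Sequence[float]) -> None:
        self._scores = list(scores)

    def score(self, frames: Sequence[object]) -> List[float]:
        # Return as many scores as there are frames.
        return self._scores[: len(frames)]
```

The point of the split is that model-specific code would live only behind `score`, while thresholding and cut extraction stay shared, so swapping models would not change downstream behavior beyond the scores themselves.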

Sep 04 '25 23:09 Breakthrough

A recent blog post also shows how important this is for the future. It looks like TransNetV2 might be a good candidate to start with for this integration. I haven't had time yet to dig into how the code works, but the model itself looks reasonable in size (~30 MiB or so).

Definitely worth exploring some more! I'll be doing so when I get some time, but as I mentioned already, if folks would like to help out with this, please reach out ☄️

Nov 12 '25 00:11 Breakthrough