PySceneDetect Text recognition

Problem/Use Case

I would love to get the true shot numbers from a video burn-in, via OCR or similar. It's a shame to recognise only the cut points and throw away the text annotations within the image.

Solutions

Visually specify a crop region to scan, then read the text inside that box for each clip and store it in some extra attributes on the clip.
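To make the idea concrete, here is a rough sketch of reading the burn-in from the first frame of each clip. It is only an illustration under assumptions: the crop box values, the function names `make_ocr_reader` and `label_scenes`, and the record keys are all made up for this example, and it relies on OpenCV and pytesseract (with the Tesseract binary installed) rather than anything built into PySceneDetect.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (x, y, width, height) of the burn-in box in pixels. Values are up to the user.
CropBox = Tuple[int, int, int, int]


def make_ocr_reader(video_path: str, crop: CropBox) -> Callable[[int], str]:
    """Return a function that OCRs the crop region of a given frame number."""
    # Third-party imports stay local so label_scenes() works without them.
    import cv2  # opencv-python
    import pytesseract  # assumption: Tesseract is installed on the system

    cap = cv2.VideoCapture(video_path)
    x, y, w, h = crop

    def read_text(frame_number: int) -> str:
        # Seek to the requested frame and decode it.
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)
        ok, frame = cap.read()
        if not ok:
            return ""
        # Crop to the burn-in box and convert to greyscale before OCR.
        region = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        return pytesseract.image_to_string(region).strip()

    return read_text


def label_scenes(
    scene_starts: Sequence[int],
    read_text: Callable[[int], str],
) -> List[Dict[str, object]]:
    """Attach the text found on each scene's first frame to a record."""
    records = []
    for index, start in enumerate(scene_starts, start=1):
        records.append(
            {"scene": index, "start_frame": start, "burnin_text": read_text(start)}
        )
    return records
```

With PySceneDetect, `scene_starts` could plausibly be built from the scene list, for example `[start.get_frames() for start, _ in detect(path, ContentDetector())]`, and then passed in as `label_scenes(scene_starts, make_ocr_reader(path, (50, 980, 400, 60)))`. The crop numbers there are placeholders.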

Proposed Implementation:

Provide a recognise-text flag to enable OCR during processing.

Alternatives:

Run a second, separate process to do the OCR.

May 19 '25 10:05 rossisbudda

If the intention is to get a time base for the cuts, then doesn't the OCR only need to be done on the first frame of the video? Is there anything preventing automating this in another way, say having a script query the timecode from the first frame, and then shifting the output of PySceneDetect accordingly?
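As a rough sketch of what that shifting could look like once the first-frame timecode has been read: this assumes the burn-in is a non-drop-frame `HH:MM:SS:FF` timecode and that the frame rate is a whole number. The helper names (`timecode_to_frames`, `frames_to_timecode`, `shift_cuts`) are made up for illustration and are not part of PySceneDetect.

```python
from typing import List, Sequence


def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int) -> str:
    """Convert a frame count back to HH:MM:SS:FF."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def shift_cuts(cut_frames: Sequence[int], first_frame_timecode: str, fps: int) -> List[str]:
    """Offset cut positions (frames from the start of the file) by the burn-in timecode."""
    offset = timecode_to_frames(first_frame_timecode, fps)
    return [frames_to_timecode(offset + cut, fps) for cut in cut_frames]


# Cuts at frames 0, 48 and 100 of a 24 fps file whose first frame reads 01:00:00:00.
print(shift_cuts([0, 48, 100], "01:00:00:00", 24))
# ['01:00:00:00', '01:00:02:00', '01:00:04:04']
```

Drop-frame timecode and fractional rates such as 23.976 would need extra handling that this sketch leaves out.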

May 19 '25 23:05 Breakthrough

Yes, that's probably a good way to handle it, on the first frame of each cut. It is certainly automatable as a separate process. My suggestion is to consider building this functionality in, as I guess most people using PySceneDetect would love it if the cuts picked up useful names rather than only an indexed identifier (so 1, 2, 3, 4, 5 could instead become sc01_sh010, sc01_sh020, sc02_sh020, etc.).
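For what it's worth, here is a small sketch of the naming step, assuming the OCR text for each cut is already available. The pattern is only a guess based on the example names above, and `scene_name` is a hypothetical helper, not an existing PySceneDetect function. It falls back to the plain index when no shot name can be read.

```python
import re

# Hypothetical pattern modelled on names like sc01_sh010; real burn-ins may differ.
SHOT_PATTERN = re.compile(r"sc(\d+)\s*_\s*sh(\d+)", re.IGNORECASE)


def scene_name(index: int, ocr_text: str) -> str:
    """Return a normalised shot name from OCR text, or the index if none is found."""
    match = SHOT_PATTERN.search(ocr_text)
    if match is None:
        # OCR failed or the frame has no burn-in: keep the indexed identifier.
        return f"{index:03d}"
    scene, shot = match.groups()
    return f"sc{int(scene):02d}_sh{int(shot):03d}"


print(scene_name(1, "  SC01_SH010 \n"))  # sc01_sh010
print(scene_name(2, "garbled"))          # 002
```

Falling back to the index means the output stays usable even on clips where the burn-in is missing or unreadable.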

May 20 '25 10:05 rossisbudda