Remove downscaling for perceptual hash
The HashDetector already resizes the input to a specific size, so we should avoid downscaling the input again for this particular detector. This might involve changing the SceneDetector interface so that each detector can indicate whether it wants a pre-scaled version of the frame or needs to work with the original.
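One way the interface change could look is a per-detector flag on the base class that the frame pipeline consults. This is a hypothetical sketch: the attribute name `wants_downscaled_input` and the simplified class bodies are illustrative, not the actual PySceneDetect API.

```python
class SceneDetector:
    # Hypothetical flag: by default a detector accepts the globally
    # downscaled frame that the pipeline produces.
    wants_downscaled_input = True

    def process_frame(self, frame_num, frame):
        raise NotImplementedError


class HashDetector(SceneDetector):
    # The perceptual hash detector resizes internally (to a square, for
    # the DCT), so it opts out of the global downscale to avoid scaling
    # each frame twice.
    wants_downscaled_input = False

    def process_frame(self, frame_num, frame):
        # Real implementation would resize `frame` to the DCT input size
        # and compute the hash; omitted here.
        return []
```

The pipeline would then check this flag before deciding which frame variant to hand each detector.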
The perceptual hash detector is a bit of a special case, since it requires that the input be a square for the DCT. Right now the image is scaled twice, which is not ideal, especially if the downscaled version is smaller than the DCT size (since we're throwing away information).
This doesn't matter too much for accuracy, since we apply a low-pass filter on the result, but the redundant scaling is still something we should avoid in the pipeline.
It's probably best to just make downscaling a per-detector option, and allow SceneManager to automatically try to apply it if a global one was set.
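On the SceneManager side, applying a global downscale only to detectors that accept it could be a small helper along these lines. Again a sketch under assumptions: the `wants_downscaled_input` flag and the helper name are hypothetical, not existing API.

```python
def frame_for_detector(detector, original, downscaled):
    """Pick which frame variant a detector should receive.

    Assumes a hypothetical `wants_downscaled_input` attribute on the
    detector; detectors without it get the downscaled frame by default,
    matching the current global-downscale behavior.
    """
    if downscaled is not None and getattr(detector, "wants_downscaled_input", True):
        return downscaled
    return original


class SelfScalingDetector:
    # Example of a detector (like HashDetector) that resizes internally
    # and therefore opts out of the global downscale.
    wants_downscaled_input = False
```

With this, SceneManager can keep a single global downscale setting while letting detectors like HashDetector receive the original frame.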
> make downscaling a per-detector option
This makes sense. It should be pretty easy to do on the Python side, but will need more changes on the CLI side to parse all those parameters.
From my tests, this seems to have made things worse overall. Performance and accuracy each improved in some cases and regressed in others, but in general the change didn't seem worth the additional complexity: on most benchmarks the improvement was marginal, and in some cases results were far worse.