Frames decoded with different resolution/shape
Bug/Issue Description: When making use of the Python API, scene detection intermittently fails with the following AssertionError within a cloud environment:
File "/usr/local/lib/python3.7/site-packages/scenedetect/detectors/content_detector.py", line 210, in process_frame
    self._frame_score = self._calculate_frame_score(frame_num, frame_img)
File "/usr/local/lib/python3.7/site-packages/scenedetect/detectors/content_detector.py", line 166, in _calculate_frame_score
    delta_hue=mean_pixel_distance(hue, self._last_frame.hue),
File "/usr/local/lib/python3.7/site-packages/scenedetect/detectors/content_detector.py", line 33, in mean_pixel_distance
    assert left.shape == right.shape
The issue does not occur consistently: sometimes scene detection finishes without issue, and sometimes it fails at a random point for the same video. The issue does not occur locally.
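For anyone unfamiliar with the assertion above, the following is an illustrative re-implementation (not the library's exact code) of the comparison that fails: `mean_pixel_distance` requires both frames to have identical shapes, so any frame decoded at a different resolution trips the assert.

```python
import numpy as np

def mean_pixel_distance(left: np.ndarray, right: np.ndarray) -> float:
    """Mean absolute per-pixel difference between two same-shaped frames.
    Illustrative sketch of the comparison in content_detector.py."""
    assert left.shape == right.shape  # the assertion from the traceback
    return float(np.abs(left.astype(np.int32) - right.astype(np.int32)).mean())

same = mean_pixel_distance(np.zeros((4, 4)), np.ones((4, 4)))  # works: 1.0
try:
    mean_pixel_distance(np.zeros((4, 4)), np.zeros((2, 2)))  # mismatched shapes
except AssertionError:
    print("shapes differ")  # this is what the report above hits
```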
Required Information:
The issue cannot be replicated (and thus, scene detection succeeds) when making use of the CLI through a subprocess with the following options:
scenedetect -i ./video.mp4 -o ./output detect-adaptive list-scenes
I therefore provide the failing Python implementation instead:
from pathlib import Path
from scenedetect import AdaptiveDetector, detect
def do_detect_scenes(movie_path: Path, detector=AdaptiveDetector()):
    return detect(str(movie_path), detector, show_progress=True)
Expected Behavior: I expected scene detection to succeed for a given video, as it does when making use of the CLI or when running scene detection locally.
Computing Environment:
- OS: Debian GNU/Linux 11 (bullseye)
- Python Version: 3.7.16
- OpenCV Version: 4.6.0.66
Follow-up: The issue persists with the following computing environment:
- OS: Debian GNU/Linux 11 (bullseye)
- Python Version: 3.9.16
- OpenCV Version: 4.7.0.72
We've decided to circumvent the issue by making use of the CLI through a subprocess for now.
Do you happen to have a sample video you could share? This error is happening because one of the frames that was decoded isn't the correct size or color space. I haven't ever been able to reproduce this issue reliably unfortunately, so any samples you could share would be a great help.
Any more insight or information you could help provide would be great. I can also work on a patch to add more logging around this area to try and get better insight into what's going on. In particular, I would like to see what the size of the previous and current frames are (can also do this by just printing left.shape/right.shape and reporting that here).
@samlegrand-ordina are you able to share any additional information to help debug the issue you were facing? I would love to understand how to make the bug happen in order to fix it if possible. Otherwise could you close the issue if there is no follow-up required? Thanks!
Hello, I encountered this issue while using the same detector to process multiple videos of different sizes. After processing one video, the _last_frame variable is not reset, so the first frame of the next video and the last frame of the previous video have different sizes, triggering the AssertionError.
I temporarily resolved this issue by initializing a new detector for each video.
@ShenhaoZhu thank you for pointing that out, I had not considered that use case. I will investigate a fix for this in the following release.
Are you using the detect() function or a SceneManager?
@Breakthrough thank you for your reply, I am using a SceneManager with adaptive detector.
I realize now what the problem with the original code sample is, and it's that the same instance of a detector is used for each invocation of the function.
In retrospect I don't think this should be supported at all for the SceneManager and Detector APIs, as they are stateful and should not be shared across videos. Each should be used for contiguous chunks of a single video.
The example in the initial post with the scenedetect.detect() function can be rewritten as follows to achieve the intended behaviour:
from pathlib import Path
from scenedetect import AdaptiveDetector, detect
def do_detect_scenes(movie_path: Path, detector_type=AdaptiveDetector):
    return detect(str(movie_path), detector_type(), show_progress=True)
It's also easy enough to create a detector factory function if you need to tweak parameters, or write a wrapper for the function:
def detect_with(video_path, detector_type, detector_args=None, **kwargs):
    # Use None instead of a mutable default argument for detector_args.
    instance = detector_type(**(detector_args or {}))
    return detect(video_path, instance, **kwargs)

detect_with("video.mp4", AdaptiveDetector, {"threshold": 30}, show_progress=True)
I can look into adding the latter to the official API if there is enough interest for it, or perhaps extending the existing detect() function to take either a detector instance or a detector_type/detector_args pair.
@ShenhaoZhu as an aside, may I ask why you require the SceneManager and did not use the detect() function? This will better help me understand if this should also be supported by SceneManager, but I would rather guide users to the detect() function for cases like this.
An instance of this also cropped up in another project: https://github.com/Breakthrough/DVR-Scan/issues/151 It seems some videos might have frames that are decoded at a lower resolution. I'm unsure why this is, and it doesn't seem to trip up ffmpeg much...
I don't have any samples handy, but if anyone could share a test case for this that would be greatly appreciated. In the meantime, I'll make sure that a warning is emitted and any frames that aren't the expected size are just ignored.
In v0.6.4, frames with the incorrect size will be skipped, and an error will be logged.
https://github.com/Breakthrough/PySceneDetect/commit/ef47b0f99756129b19cbddd33812fd63f4b1b9f0
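The skip-and-log behaviour described above can be sketched as a simple filter (a hypothetical helper mirroring the described v0.6.4 behaviour, not the library's actual implementation):

```python
import logging
import numpy as np

logger = logging.getLogger(__name__)

def frames_to_process(frames, expected_shape):
    """Yield only frames matching the expected shape, logging the rest.
    Hypothetical sketch of the v0.6.4 skip-and-log behaviour."""
    for num, frame in enumerate(frames):
        if frame.shape != expected_shape:
            logger.error("Skipping frame %d: got %s, expected %s",
                         num, frame.shape, expected_shape)
            continue
        yield frame

good = np.zeros((4, 4, 3), dtype=np.uint8)
bad = np.zeros((2, 2, 3), dtype=np.uint8)  # decoded at the wrong resolution
kept = list(frames_to_process([good, bad, good], good.shape))
print(len(kept))  # 2
```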