VideoReader could not decode video frames correctly if the video width is not divisible by 8

Open w238liu opened this issue 3 years ago • 0 comments

🐛 Describe the bug

VideoReader does not decode video frames correctly when the video width is not divisible by 8. Specifically, the pixels in the last columns of each frame are not decoded correctly. To reproduce the bug, use the code below and the data available here.

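import os

import numpy as np
import torch
from numpy.typing import NDArray
from PIL import Image
from torchvision.io import VideoReader as VR


input_file = 'short_snippet_828x480.mp4'
output_dir = 'short_828x480_vr_frames'
os.makedirs(output_dir, exist_ok=True)  # Image.save does not create missing directories

vr = VR(input_file, stream='video', num_threads=os.cpu_count())
for idx, data in enumerate(vr):
    print(idx)
    # Each item is a dict; data['data'] holds the frame as a (C, H, W) tensor.
    rgb_frame: torch.Tensor = data['data']
    rgb_arr: NDArray = rgb_frame.numpy()
    # PIL expects (H, W, C), so move the channel axis to the end.
    rgb_arr = np.transpose(rgb_arr, [1, 2, 0])

    im_vr = Image.fromarray(rgb_arr)
    im_vr.save(os.path.join(output_dir, 'frame{:0>4d}.png'.format(idx)))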

Expected behavior: The Python code above should decode the video frames and save them in short_828x480_vr_frames. However, the saved images show that the pixels in the last columns of each frame are not decoded. For comparison, decoding the same video directly with the ffmpeg CLI, i.e. running ffmpeg -i short_828x480.mp4 -pix_fmt rgb24 short_828x480_ffmpeg_frames/frame%04d.png, produces images in short_828x480_ffmpeg_frames in which the last few columns are decoded correctly.
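Rather than inspecting the images by eye, the comparison could be made numerically. The following is only a sketch: the helper name mismatched_columns and the tol parameter are hypothetical and not part of torchvision or ffmpeg, and the two arrays are synthetic stand-ins for a VideoReader frame and the matching ffmpeg frame. A small nonzero tol may be needed in practice, since the two decoders might round the color conversion slightly differently.

```python
import numpy as np


def mismatched_columns(decoded, reference, tol=0):
    """Return the indices of pixel columns where two (H, W, C) frames differ.

    A column is reported if any pixel in it differs by more than `tol`
    in any channel. Hypothetical helper, for illustration only.
    """
    if decoded.shape != reference.shape:
        raise ValueError("frames must have the same shape")
    # Cast to a signed type first so uint8 subtraction cannot wrap around.
    diff = np.abs(decoded.astype(np.int16) - reference.astype(np.int16))
    # Largest difference per column, taken over rows (axis 0) and channels (axis 2).
    col_max = diff.max(axis=(0, 2))
    return np.flatnonzero(col_max > tol)


# Synthetic stand-ins: a 4x12 "reference" frame and a copy whose last
# four columns are blanked out, mimicking undecoded trailing columns.
reference = np.full((4, 12, 3), 200, dtype=np.uint8)
decoded = reference.copy()
decoded[:, -4:, :] = 0

print(mismatched_columns(decoded, reference))  # [ 8  9 10 11]
```

With real data, each array could be loaded from a pair of saved PNGs, for example with np.asarray(Image.open(path).convert('RGB')), one from short_828x480_vr_frames and one from short_828x480_ffmpeg_frames.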

Versions

Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.910
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.7.0
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchmetrics==0.9.3
[pip3] torchqa==0.2.0
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.23.1 py38h6c91a56_0
[conda] numpy-base 1.23.1 py38ha15fc14_0
[conda] pytorch 1.12.1 py3.8_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-lightning 1.7.0 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.12.1 py38_cu113 pytorch
[conda] torchmetrics 0.9.3 pypi_0 pypi
[conda] torchqa 0.2.0 dev_0
[conda] torchvision 0.13.1 py38_cu113 pytorch

w238liu · Sep 16 '22 16:09