Description
🐛 Describe the bug
VideoReader does not decode video frames correctly when the video width is not divisible by 8. Specifically, the pixels in the last few columns of each frame are not decoded correctly. To reproduce the bug, use the code below and the data available here.
import os
from PIL import Image
from numpy.typing import NDArray
import numpy as np
import torch
from torchvision.io import VideoReader as VR

input_file = 'short_snippet_828x480.mp4'
output_dir = 'short_828x480_vr_frames'
os.makedirs(output_dir, exist_ok=True)  # Image.save does not create missing directories

vr = VR(input_file, stream='video', num_threads=os.cpu_count())
for idx, data in enumerate(vr):
    print(idx)
    rgb_frame: torch.Tensor = data['data']  # frame tensor in (C, H, W) layout
    rgb_arr: NDArray = rgb_frame.numpy()
    rgb_arr = np.transpose(rgb_arr, [1, 2, 0])  # CHW -> HWC, as PIL expects
    im_vr = Image.fromarray(rgb_arr)
    im_vr.save(os.path.join(output_dir, 'frame{:0>4d}.png'.format(idx)))
Expected behavior:
The Python code above should decode the video frames and save them in short_828x480_vr_frames. However, the saved images show that the pixels in the last columns of each frame are not decoded. For comparison, we can decode the video directly with the ffmpeg CLI by running
ffmpeg -i short_828x480.mp4 -pix_fmt rgb24 short_828x480_ffmpeg_frames/frame%04d.png
The images saved in short_828x480_ffmpeg_frames show that the ffmpeg CLI decodes the pixels in the last few columns correctly.
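The visual comparison between the two output folders can also be done programmatically. The snippet below is a minimal sketch, not part of the original report: the helper names `mismatched_columns` and `compare_frame_files` and the `tol` parameter are hypothetical, and it assumes both decoders wrote frames of identical size under matching file names. It reports which pixel columns differ between a VideoReader frame and the corresponding ffmpeg frame.

```python
import numpy as np


def mismatched_columns(frame_a, frame_b, tol=0):
    """Return indices of pixel columns where two HWC uint8 frames differ.

    `tol` (hypothetical parameter) is the largest per-channel absolute
    difference that is still treated as equal; it allows for small
    rounding differences between color conversion paths.
    """
    if frame_a.shape != frame_b.shape:
        raise ValueError(f"shape mismatch: {frame_a.shape} vs {frame_b.shape}")
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    # Largest difference in each column, taken over rows and channels.
    col_max = diff.max(axis=(0, 2))
    return np.flatnonzero(col_max > tol).tolist()


def compare_frame_files(path_a, path_b, tol=0):
    """Load two saved frames with Pillow and compare them column by column."""
    from PIL import Image  # imported lazily; only needed when reading files

    frame_a = np.asarray(Image.open(path_a).convert("RGB"))
    frame_b = np.asarray(Image.open(path_b).convert("RGB"))
    return mismatched_columns(frame_a, frame_b, tol=tol)
```

For example, calling `compare_frame_files` on a frame from short_828x480_vr_frames and the matching frame from short_828x480_ffmpeg_frames should list only the trailing column indices if the description above holds. Note that the two tools may number frames from different starting indices, so the file names may need to be offset before pairing them.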
Versions
Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] mypy==0.910
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.7.0
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchmetrics==0.9.3
[pip3] torchqa==0.2.0
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.23.1 py38h6c91a56_0
[conda] numpy-base 1.23.1 py38ha15fc14_0
[conda] pytorch 1.12.1 py3.8_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-lightning 1.7.0 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.12.1 py38_cu113 pytorch
[conda] torchmetrics 0.9.3 pypi_0 pypi
[conda] torchqa 0.2.0 dev_0
[conda] torchvision 0.13.1 py38_cu113 pytorch