VideoReader could not decode video frames correctly if the video width is not divisible by 8 #6600

Open
w238liu opened this issue Sep 16, 2022 · 1 comment
w238liu commented Sep 16, 2022

🐛 Describe the bug

VideoReader does not decode video frames correctly when the video width is not divisible by 8. Specifically, the pixels in the last few columns of each frame are not decoded correctly. Use the code below and the data available here to reproduce the bug.

import os
from PIL import Image
from numpy.typing import NDArray
import numpy as np
import torch
from torchvision.io import VideoReader as VR


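input_file = 'short_snippet_828x480.mp4'
output_dir = 'short_828x480_vr_frames'
os.makedirs(output_dir, exist_ok=True)  # Image.save() does not create missing directories

vr = VR(input_file, stream='video', num_threads=os.cpu_count())
for idx, data in enumerate(vr):
    print(idx)
    rgb_frame: torch.Tensor = data['data']  # uint8 tensor, shape (C, H, W)
    rgb_arr: NDArray = rgb_frame.numpy()
    rgb_arr = np.transpose(rgb_arr, [1, 2, 0])  # (C, H, W) -> (H, W, C) for PIL

    im_vr = Image.fromarray(rgb_arr)
    im_vr.save(os.path.join(output_dir, 'frame{:0>4d}.png'.format(idx)))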

Expected behavior:
The Python code above should decode the video frames and save them in short_828x480_vr_frames. However, in the saved images the pixels in the last few columns of each frame are not decoded. For comparison, the same video can be decoded directly with the ffmpeg CLI:
ffmpeg -i short_828x480.mp4 -pix_fmt rgb24 short_828x480_ffmpeg_frames/frame%04d.png.
The images saved in short_828x480_ffmpeg_frames show that the ffmpeg CLI decodes the pixels in the last few columns correctly.
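To make the comparison between the two sets of frames less dependent on visual inspection, a small helper can report which pixel columns differ. This is only a sketch: `mismatched_columns` and its `tol` parameter are hypothetical names introduced here, and it assumes both frames have already been loaded as H x W x C uint8 arrays of the same shape.

```python
import numpy as np


def mismatched_columns(frame_a: np.ndarray, frame_b: np.ndarray, tol: int = 0) -> np.ndarray:
    """Return the indices of pixel columns where two H x W x C frames differ.

    A column is reported if any pixel in it differs by more than `tol`
    in any channel. `tol` is a hypothetical knob for ignoring small
    differences between the two decoders.
    """
    if frame_a.shape != frame_b.shape:
        raise ValueError(f"shape mismatch: {frame_a.shape} vs {frame_b.shape}")
    # Widen to a signed type so the uint8 subtraction cannot wrap around.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    # Collapse the height (axis 0) and channel (axis 2) axes, keeping width.
    bad = (diff > tol).any(axis=(0, 2))
    return np.flatnonzero(bad)


if __name__ == "__main__":
    # Synthetic 480 x 828 frames standing in for real decoded images:
    # the second frame has its last four columns corrupted.
    reference = np.zeros((480, 828, 3), dtype=np.uint8)
    corrupted = reference.copy()
    corrupted[:, -4:, :] = 255
    print(mismatched_columns(reference, corrupted))
```

With real data, each pair of frames could be loaded with `np.asarray(Image.open(path))`. Note that the loop above numbers frames from 0, while ffmpeg's `frame%04d.png` pattern starts at 1 by default, so the file indices need to be offset by one when pairing them.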

Versions

Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.910
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.7.0
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchmetrics==0.9.3
[pip3] torchqa==0.2.0
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.23.1 py38h6c91a56_0
[conda] numpy-base 1.23.1 py38ha15fc14_0
[conda] pytorch 1.12.1 py3.8_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-lightning 1.7.0 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.12.1 py38_cu113 pytorch
[conda] torchmetrics 0.9.3 pypi_0 pypi
[conda] torchqa 0.2.0 dev_0
[conda] torchvision 0.13.1 py38_cu113 pytorch

Contributor

bjuncek commented Feb 28, 2023

This is a duplicate of this issue.
I'll take another look at this this week.

edit: found the culprit. The stride computation for the sws_scale calls requires the width to be divisible by 4 for YUV422 and by 8 for YUV420 (which we use). It will take some thought to work out how to fix this properly.
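As a quick way to tell whether a given video is affected, the divisibility requirement described above can be checked with a few lines of arithmetic. This is a sketch, not part of torchvision: `columns_to_pad` and `REQUIRED_ALIGNMENT` are hypothetical names, and the alignment values (4 for YUV422, 8 for YUV420) are taken from the comment above rather than verified against the decoder source.

```python
# Alignment values as stated in the comment above (assumed, not verified).
REQUIRED_ALIGNMENT = {"yuv422": 4, "yuv420": 8}


def columns_to_pad(width: int, alignment: int) -> int:
    """Return how many columns must be added to make `width` a multiple of `alignment`."""
    if width <= 0 or alignment <= 0:
        raise ValueError("width and alignment must be positive")
    # Python's modulo of a negative number is non-negative, so this is the
    # distance from `width` up to the next multiple of `alignment`.
    return (-width) % alignment


if __name__ == "__main__":
    width = 828  # width of the video in this report
    for pix_fmt, alignment in REQUIRED_ALIGNMENT.items():
        pad = columns_to_pad(width, alignment)
        status = "ok" if pad == 0 else f"affected, {pad} column(s) short of {width + pad}"
        print(f"{pix_fmt}: {status}")
```

For the 828-pixel-wide video in this report, the width is a multiple of 4 but not of 8, which is consistent with the bug appearing on the YUV420 path. One possible workaround, untested here, would be to pad the video to the next multiple of 8 before decoding and crop the extra columns afterwards.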
