GPU VideoReader not working #5702

Open
mostafarohani opened this issue Mar 29, 2022 · 9 comments
@mostafarohani

🐛 Describe the bug

from torchvision.io import VideoReader
import urllib.request
import numpy as np  # needed for np.random.choice below
import matplotlib.pyplot as plt
%matplotlib inline

urllib.request.urlretrieve("https://downloads.videezy.com/system/protected/files/000/004/210/4.mp4", "/tmp/cat.mp4")

used_timestamps = sorted(np.random.choice(np.arange(0, 8, 0.1), 10, replace=False).tolist())

video_reader_cpu = VideoReader("/tmp/cat.mp4", device="cpu", num_threads=4)


images_cpu = []
for seek in used_timestamps:
    video_reader_cpu.seek(seek)
    frame = next(video_reader_cpu)
    images_cpu.append(frame["data"].permute(1,2,0))

    
video_reader_gpu = VideoReader("/tmp/cat.mp4", device="cuda")

images_gpu = []
for seek in used_timestamps:
    video_reader_gpu.seek(seek)
    frame = next(video_reader_gpu)
    images_gpu.append(frame["data"].cpu())

for i1, i2 in zip(images_cpu, images_gpu):
    plt.figure()
    plt.subplot(121).imshow(i1)
    plt.subplot(122).imshow(i2)

[Screenshot: Screen Shot 2022-03-28 at 10 59 44 PM]

[Screenshot: Screen Shot 2022-03-28 at 10 59 56 PM]

When seeking to specific timestamps in the video and extracting the closest frames, the CPU implementation of VideoReader works exactly as expected. However, the GPU implementation outputs progressively more corrupted versions of a single frame, with halo artifacts from other frames becoming more prevalent in later frames.

Versions

Collecting environment information...
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.27

Python version: 3.8.12 (default, Oct 12 2021, 13:49:34)  [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.15.0-163-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 11.3.109
GPU models and configuration:
GPU 0: NVIDIA RTX A5000
GPU 1: NVIDIA RTX A5000
GPU 2: NVIDIA RTX A5000
GPU 3: NVIDIA RTX A5000
GPU 4: NVIDIA RTX A5000
GPU 5: NVIDIA RTX A5000
GPU 6: NVIDIA RTX A5000
GPU 7: NVIDIA RTX A5000

Nvidia driver version: 495.29.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.931
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.6.0rc1
[pip3] torch==1.11.0
[pip3] torchelastic==0.2.2
[pip3] torchmetrics==0.7.3
[pip3] torchtext==0.12.0
[pip3] torchvision==0.13.0a0+1db8795
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               11.3.1               ha36c431_9    nvidia
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.4.0           h06a4308_640
[conda] mkl-service               2.4.0            py38h7f8727e_0
[conda] mkl_fft                   1.3.1            py38hd3c417c_0
[conda] mkl_random                1.2.2            py38h51133e4_0
[conda] mypy                      0.931                    pypi_0    pypi
[conda] mypy-extensions           0.4.3                    pypi_0    pypi
[conda] numpy                     1.21.2           py38h20f2e39_0
[conda] numpy-base                1.21.2           py38h79a1101_0
[conda] pytorch                   1.11.0          py3.8_cuda11.3_cudnn8.2.0_0    pytorch
[conda] pytorch-lightning         1.6.0rc1                 pypi_0    pypi
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torchelastic              0.2.2                    pypi_0    pypi
[conda] torchmetrics              0.7.3                    pypi_0    pypi
[conda] torchtext                 0.12.0                     py38    pytorch
[conda] torchvision               0.13.0a0+1db8795          pypi_0    pypi
@albanD albanD transferred this issue from pytorch/pytorch Mar 29, 2022
@DrJimFan

DrJimFan commented Apr 3, 2022

This seems to be a very serious bug, as a lot of video-learning code relies on GPU decoding. Could someone please help?

@mostafarohani I'm also curious: how did you manage to compile torchvision with GPU video decoding support? I installed ffmpeg with GPU support and also tried conda install -c conda-forge ffmpeg, but the build keeps saying ffmpeg is missing bfs.h and skips installing the GPU decoder.

@bjuncek
Contributor

bjuncek commented Apr 4, 2022

@mostafarohani @LinxiFan we're currently assigning all GPU decoding issues as low-pri as there was a shift in priorities and we're a bit short on people. I'll definitely look at this but probably not before the end of the month. Sorry about that :(

@prabhat00155 do you perhaps know what this could be about? If so, maybe I can take a look at it sooner if I have a decent starting point?

I'm also curious, how do you manage to compile torchvision with GPU video decoding support?

@LinxiFan the instructions are here; we're utilising NVDEC rather than FFMPEG for hardware-accelerated decoding.

@bjuncek bjuncek added the bug label Apr 4, 2022
@NicolasHug
Member

Is our CI actually running the tests for the GPU decoder? Looking at recent CI runs it seems like the tests are always skipped.

@bjuncek
Contributor

bjuncek commented Apr 4, 2022

@NicolasHug nope: see #5147

@prabhat00155
Contributor

The current implementation of the GPU decoder lets you seek once in the video and read frames from there. If you seek multiple times, the subsequent outputs may be frames that belong near the first seek point: multiple frames may already have been decoded and queued up for return after the first seek and demux.
The numerical values between the CPU and GPU decoders also differ slightly because different colour spaces are used.
Could you compare the results you get after a single seek operation between the GPU decoder and pyav, and report any differences? Example usage
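Since the two decoders use different colour spaces, a bitwise comparison will fail even on correct output. A tolerance-based check along these lines (a hypothetical helper, not part of torchvision) can separate expected colour-conversion drift from real corruption:

```python
import numpy as np

def frames_roughly_equal(a, b, tol=5.0):
    """Compare two uint8 HxWxC frames, allowing small per-pixel
    differences from colour-space conversion (tol is in 0-255 units)."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.mean(np.abs(a - b))) < tol

# Deterministic sanity check with synthetic frames:
frame = np.full((4, 4, 3), 100, dtype=np.uint8)
print(frames_roughly_equal(frame, frame))                              # True
print(frames_roughly_equal(frame, np.full((4, 4, 3), 120, np.uint8)))  # False
```

The threshold of 5 intensity levels is a guess; in practice it would be tuned against frames that are known to decode correctly.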

@fmassa
Member

fmassa commented Apr 5, 2022

@prabhat00155 in this case, I would propose dropping all enqueued frames after a seek. Seeking multiple times on the same file is a very common operation, and in its current state the decoder is indeed not working as expected.

@prabhat00155
Contributor

@fmassa That makes sense.
@bjuncek Do you have time to make this change and test it? (I don't have access to the devgpu setup anymore.) Basically, we would have to clear the contents of decoded_frames every time there is a seek, so that fetch_frame() doesn't return stale frames.
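The stale-queue behaviour and the proposed fix can be sketched with a toy model (pure Python, illustrative only — names like decoded_frames and fetch_frame mirror the real decoder, but this is not torchvision code):

```python
from collections import deque

class ToyDecoder:
    """Toy stand-in for the GPU decoder: decoding is batched, so several
    frames can still sit in an internal queue when the user seeks again."""

    def __init__(self, num_frames, batch=4, clear_on_seek=False):
        self.frames = list(range(num_frames))  # a "frame" is just its index
        self.pos = 0
        self.batch = batch
        self.clear_on_seek = clear_on_seek
        self.decoded_frames = deque()

    def seek(self, ts):
        self.pos = ts
        if self.clear_on_seek:           # the proposed fix
            self.decoded_frames.clear()

    def fetch_frame(self):
        if not self.decoded_frames:      # demux + decode a whole batch at once
            self.decoded_frames.extend(self.frames[self.pos:self.pos + self.batch])
            self.pos += self.batch
        return self.decoded_frames.popleft()

# Buggy behaviour: frames queued before the second seek come back stale.
buggy = ToyDecoder(100)
buggy.seek(10); a = buggy.fetch_frame()  # 10, as expected
buggy.seek(50); b = buggy.fetch_frame()  # 11 -- stale frame from the first seek!

fixed = ToyDecoder(100, clear_on_seek=True)
fixed.seek(10); c = fixed.fetch_frame()  # 10
fixed.seek(50); d = fixed.fetch_frame()  # 50, queue was cleared on seek
print(a, b, c, d)
```

In the real decoder the equivalent change would be clearing decoded_frames inside the seek path, as described above.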

@mostafarohani
Author

mostafarohani commented Apr 6, 2022

Thanks for the responses everyone, and I agree with @LinxiFan that this is a very serious bug.

In my use case, I can't afford to just fetch all the frames (there are thousands); only a subset of 15-30 are annotated, and the annotations are not evenly spaced, so I need to seek as described above. On the other hand, the CPU version is so slow that it blocks my model training. Our current hack is to export all the frames as JPEGs in a preprocessing step and then train on those. The downside, of course, is that the dataset size ballooned from 100GB to 5TB, which has caused a lot of issues, like not being able to store all the data on the server's local disk.

A prompt fix on this would be much appreciated <3

@mostafarohani
Author

mostafarohani commented Apr 6, 2022

@LinxiFan , to install, here is what I did in my Dockerfile:

FROM pytorch/pytorch:1.11.0-cuda11.3-cudnn8-devel

ENV DEBIAN_FRONTEND=noninteractive
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility,video

COPY Video_Codec_SDK_11.1.5 /tmp/Video_Codec_SDK_11.1.5
RUN git clone https://github.com/pytorch/vision.git ~/torchvision
RUN cd /root/torchvision && \
    TORCHVISION_INCLUDE=/tmp/Video_Codec_SDK_11.1.5/Interface \
    TORCHVISION_LIBRARY=/tmp/Video_Codec_SDK_11.1.5/Lib/linux/stubs/x86_64 \
    TORCH_CUDA_ARCH_LIST=8.6+PTX \
    FORCE_CUDA=1 /opt/conda/bin/python setup.py install
RUN pip uninstall -y torchvision  # the pip-installed torchvision was shadowing the source install; uninstalling made the source version visible

I also add -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video to my docker run command.
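For reference, the flag in context might look like this (the image name and script are placeholders, not from the original setup):

```shell
# Hypothetical invocation; "my-torchvision-image" and "train.py" are placeholders.
docker run --gpus all \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
    my-torchvision-image \
    python train.py
```

Without the "video" capability, the NVDEC/NVENC driver libraries are not exposed inside the container, so GPU decoding fails at runtime even if torchvision was built with support for it.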
