Skip to content

PR: Add PyTorch FFmpeg to wheel and conda distributions #2596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 40 commits into from
Oct 6, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
2fd0162
Add PyTorch FFmpeg to wheel and conda distributions
andfoy Aug 19, 2020
00e3050
Try to install wget from conda
andfoy Aug 19, 2020
d3393c1
Add yq flag on Mac
andfoy Aug 19, 2020
ffaa66d
Correct copy instructions
andfoy Aug 19, 2020
e2b97ec
Use cURL on Windows
andfoy Aug 19, 2020
bdd4e2d
Call bzip2 directly due to msys2/MSYS2-packages#1548
andfoy Aug 19, 2020
994b03e
Copy ffmpeg binaries to system-wide directories
andfoy Aug 19, 2020
0276acd
Try to use std:c++17 on Windows
andfoy Aug 19, 2020
85a4e30
Try to define ssize_t on Windows
andfoy Aug 19, 2020
c43c378
Use C++14
andfoy Aug 19, 2020
0d8e0e0
Declare AVRational structs explicitly
andfoy Aug 20, 2020
8522638
Initialize AVRational explicitly
andfoy Aug 20, 2020
6592f5b
Replace macro to prevent errors on Windows
andfoy Aug 20, 2020
4706643
Replace AV_TIME_BASE_Q
andfoy Aug 20, 2020
9e9f894
Add library paths for video extension
andfoy Aug 20, 2020
603b164
Force ffmpeg from pytorch channels?
andfoy Aug 20, 2020
309daa8
Fix clang style warnings
andfoy Aug 20, 2020
e530ada
Update CONDA_CHANNEL_FLAGS
andfoy Aug 20, 2020
3985e04
Fix clang style issues
andfoy Aug 20, 2020
8859151
Update unittest
andfoy Aug 21, 2020
438ffa1
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 16, 2020
291aa61
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 18, 2020
1a7e797
Merge branch 'master' into distribute_ffmpeg
fmassa Sep 22, 2020
ffbc130
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 22, 2020
60ff1ca
Use FFmpeg 4.2
andfoy Sep 22, 2020
1e9ada1
Install correct version on Mac
andfoy Sep 22, 2020
3a1e9d2
Pin av version to 8.0.0
andfoy Sep 23, 2020
d394363
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 23, 2020
1a17871
Fix string formatting issue
andfoy Sep 23, 2020
dadc9fd
Fix pip pinning
andfoy Sep 23, 2020
05aad31
Merge branch 'distribute_ffmpeg' of github.com:andfoy/vision into dis…
andfoy Sep 23, 2020
a2cfab7
Try with 8.0.1
andfoy Sep 23, 2020
71b1262
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 25, 2020
8063d76
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 28, 2020
9cb1366
Use av 8.0.2
andfoy Sep 28, 2020
3e37f4b
Merge branch 'master' into distribute_ffmpeg
andfoy Sep 29, 2020
15a5e0c
Merge branch 'master' into distribute_ffmpeg
andfoy Oct 6, 2020
b4bc676
Remove trailling whitespaces
andfoy Oct 6, 2020
7ac5a2b
Disable test_io_opt.py
andfoy Oct 6, 2020
cf2935f
Disable test_datasets_video_utils
andfoy Oct 6, 2020
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .circleci/unittest/linux/scripts/environment.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
channels:
- pytorch
- defaults
dependencies:
- numpy
Expand All @@ -8,6 +9,7 @@ dependencies:
- pip
- libpng
- jpeg
- ffmpeg=4.2
- ca-certificates
- pip:
- future
Expand Down
1 change: 1 addition & 0 deletions .circleci/unittest/windows/scripts/environment.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
channels:
- pytorch
- defaults
dependencies:
- numpy
Expand Down
2 changes: 2 additions & 0 deletions packaging/build_wheel.sh
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ else
cp "/usr/lib64/libjpeg.so" torchvision
fi

download_copy_ffmpeg

if [[ "$OSTYPE" == "msys" ]]; then
IS_WHEEL=1 "$script_dir/windows/internal/vc_env_helper.bat" python setup.py bdist_wheel
else
Expand Down
2 changes: 1 addition & 1 deletion packaging/conda/build_vision.sh
Original file line number Diff line number Diff line change
Expand Up @@ -127,7 +127,7 @@ else
fi

if [[ -z "$PYTORCH_VERSION" ]]; then
export CONDA_CHANNEL_FLAGS="-c pytorch-nightly"
export CONDA_CHANNEL_FLAGS="-c pytorch-nightly -c pytorch"
export PYTORCH_VERSION="$(conda search --json 'pytorch[channel=pytorch-nightly]' | \
python -c "import os, sys, json, re; cuver = '$cuver'; \
cuver = cuver.replace('cu', 'cuda') if cuver != 'cpu' else cuver; \
Expand Down
38 changes: 37 additions & 1 deletion packaging/pkg_helpers.bash
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,7 @@ setup_pip_pytorch_version() {
# You MUST have populated PYTORCH_VERSION_SUFFIX before hand.
setup_conda_pytorch_constraint() {
if [[ -z "$PYTORCH_VERSION" ]]; then
export CONDA_CHANNEL_FLAGS="-c pytorch-nightly"
export CONDA_CHANNEL_FLAGS="-c pytorch-nightly -c pytorch"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need pytorch here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I know the reason. We are using ffmpeg from the pytorch channel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure, this means that if there is a package under pytorch-nightly and pytorch, the one from pytorch-nightly is picked as it comes first in the priority, right?

export PYTORCH_VERSION="$(conda search --json 'pytorch[channel=pytorch-nightly]' | \
python -c "import os, sys, json, re; cuver = os.environ.get('CU_VERSION'); \
cuver_1 = cuver.replace('cu', 'cuda') if cuver != 'cpu' else cuver; \
Expand Down Expand Up @@ -350,3 +350,39 @@ setup_junit_results_folder() {
export CONDA_PYTORCH_BUILD_RESULTS_DIRECTORY="${SOURCE_ROOT_DIR}/build_results/results.xml"
fi
}


download_copy_ffmpeg() {
mkdir ffmpeg_tmp
cd ffmpeg_tmp
if [[ "$OSTYPE" == "msys" ]]; then
# conda install -yq ffmpeg -c pytorch
# curl -L -q https://anaconda.org/pytorch/ffmpeg/4.3/download/win-64/ffmpeg-4.3-ha925a31_0.tar.bz2 --output ffmpeg-4.3-ha925a31_0.tar.bz2
# bzip2 --decompress --stdout ffmpeg-4.3-ha925a31_0.tar.bz2 | tar -x --file=-
# cp Library/bin/*.dll ../torchvision
echo "FFmpeg is disabled currently on Windows"
else
if [[ "$(uname)" == Darwin ]]; then
conda install -yq ffmpeg=4.2 -c pytorch
conda install -yq wget
wget -q https://anaconda.org/pytorch/ffmpeg/4.2/download/osx-64/ffmpeg-4.2-h0a44026_0.tar.bz2
tar -xjvf ffmpeg-4.2-h0a44026_0.tar.bz2
for f in lib/*.dylib; do
if [[ $f =~ ([a-z])+\.dylib ]]; then
cp $f ../torchvision
fi
done
else
wget -q https://anaconda.org/pytorch/ffmpeg/4.2/download/linux-64/ffmpeg-4.2-hf484d3e_0.tar.bz2
tar -xjvf ffmpeg-4.2-hf484d3e_0.tar.bz2
cp lib/*.so ../torchvision
cp -r lib/* /usr/lib
cp -r bin/* /usr/bin
cp -r include/* /usr/include
ldconfig
which ffmpeg
fi
fi
cd ..
rm -rf ffmpeg_tmp
}
2 changes: 2 additions & 0 deletions packaging/torchvision/conda_build_config.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
channel_sources:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we could just get rid of the arguments when we call conda build, right?

- pytorch-nightly,pytorch,defaults
blas_impl:
- mkl # [x86_64]
c_compiler:
Expand Down
4 changes: 3 additions & 1 deletion packaging/torchvision/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ requirements:
- {{ compiler('c') }} # [win]
- libpng
- jpeg
- ffmpeg =4.2 # [not win]

host:
- python
Expand All @@ -21,6 +22,7 @@ requirements:
run:
- python
- libpng
- ffmpeg =4.2 # [not win]
- jpeg
- pillow >=4.1.1
- numpy >=1.11
Expand Down Expand Up @@ -48,7 +50,7 @@ test:
requires:
- pytest
- scipy
- av
- av =8.0.1
- ca-certificates
{{ environ.get('CONDA_TYPING_CONSTRAINT') }}

Expand Down
8 changes: 5 additions & 3 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -337,7 +337,9 @@ def get_extensions():
ffmpeg_bin = os.path.dirname(ffmpeg_exe)
ffmpeg_root = os.path.dirname(ffmpeg_bin)
ffmpeg_include_dir = os.path.join(ffmpeg_root, 'include')
ffmpeg_library_dir = os.path.join(ffmpeg_root, 'lib')
print("ffmpeg include path: {}".format(ffmpeg_include_dir))
print("ffmpeg library_dir: {}".format(ffmpeg_library_dir))

# TorchVision base decoder + video reader
video_reader_src_dir = os.path.join(this_dir, 'torchvision', 'csrc', 'cpu', 'video_reader')
Expand All @@ -360,16 +362,16 @@ def get_extensions():
ffmpeg_include_dir,
extensions_dir,
],
library_dirs=library_dirs,
library_dirs=[ffmpeg_library_dir] + library_dirs,
libraries=[
'avcodec',
'avformat',
'avutil',
'swresample',
'swscale',
],
extra_compile_args=["-std=c++14"],
extra_link_args=["-std=c++14"],
extra_compile_args=["-std=c++14"] if os.name != 'nt' else ['/std:c++14', '/MP'],
extra_link_args=["-std=c++14" if os.name != 'nt' else '/std:c++14'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag is useless at linking stage for MSVC.

)
)

Expand Down
4 changes: 2 additions & 2 deletions test/test_datasets_video_utils_opt.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@
from torchvision import set_video_backend
import test_datasets_video_utils


set_video_backend('video_reader')
# Disabling the video backend switching temporarily
# set_video_backend('video_reader')


if __name__ == '__main__':
Expand Down
3 changes: 2 additions & 1 deletion test/test_io_opt.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@
import test_io


set_video_backend('video_reader')
# Disabling the video backend switching temporarily
# set_video_backend('video_reader')


if __name__ == '__main__':
Expand Down
5 changes: 5 additions & 0 deletions torchvision/csrc/cpu/decoder/decoder.h
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,11 @@
#include "seekable_buffer.h"
#include "stream.h"

#if defined(_MSC_VER)
#include <BaseTsd.h>
typedef SSIZE_T ssize_t;
#endif

namespace ffmpeg {

/**
Expand Down
5 changes: 3 additions & 2 deletions torchvision/csrc/cpu/decoder/stream.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
#include "util.h"

namespace ffmpeg {
const AVRational timeBaseQ = AVRational{1, AV_TIME_BASE};

Stream::Stream(
AVFormatContext* inputCtx,
Expand Down Expand Up @@ -85,7 +86,7 @@ int Stream::openCodec(std::vector<DecoderMetadata>* metadata) {
header.num = steam->time_base.num;
header.den = steam->time_base.den;
header.duration =
av_rescale_q(steam->duration, steam->time_base, AV_TIME_BASE_Q);
av_rescale_q(steam->duration, steam->time_base, timeBaseQ);
metadata->push_back(header);
}

Expand Down Expand Up @@ -238,7 +239,7 @@ void Stream::setFramePts(DecoderHeader* header, bool flush) {
header->pts = av_rescale_q(
header->pts,
inputCtx_->streams[format_.stream]->time_base,
AV_TIME_BASE_Q);
timeBaseQ);
}

switch (format_.type) {
Expand Down
3 changes: 2 additions & 1 deletion torchvision/csrc/cpu/decoder/subtitle_stream.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
#include "util.h"

namespace ffmpeg {
const AVRational timeBaseQ = AVRational{1, AV_TIME_BASE};

SubtitleStream::SubtitleStream(
AVFormatContext* inputCtx,
Expand Down Expand Up @@ -65,7 +66,7 @@ int SubtitleStream::analyzePacket(const AVPacket* packet, bool* gotFrame) {
// set proper pts in us
if (gotFramePtr) {
sub_.pts = av_rescale_q(
pkt.pts, inputCtx_->streams[format_.stream]->time_base, AV_TIME_BASE_Q);
pkt.pts, inputCtx_->streams[format_.stream]->time_base, timeBaseQ);
}

return result;
Expand Down
35 changes: 17 additions & 18 deletions torchvision/csrc/cpu/video_reader/VideoReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ namespace video_reader {

const AVPixelFormat defaultVideoPixelFormat = AV_PIX_FMT_RGB24;
const AVSampleFormat defaultAudioSampleFormat = AV_SAMPLE_FMT_FLT;
const AVRational timeBaseQ = AVRational{1, AV_TIME_BASE};
const size_t decoderTimeoutMs = 600000;
// A jitter can be added to the end of the range to avoid conversion/rounding
// error, small value 100us won't be enough to select the next frame, but enough
Expand Down Expand Up @@ -99,8 +100,8 @@ size_t fillTensor(
for (size_t i = 0; i < msgs.size(); ++i) {
const auto& msg = msgs[i];
// convert pts into original time_base
AVRational avr = {(int)num, (int)den};
framePtsData[i] = av_rescale_q(msg.header.pts, AV_TIME_BASE_Q, avr);
AVRational avr = AVRational{(int)num, (int)den};
framePtsData[i] = av_rescale_q(msg.header.pts, timeBaseQ, avr);
VLOG(2) << "PTS type: " << sizeof(T) << ", us: " << msg.header.pts
<< ", original: " << framePtsData[i];

Expand Down Expand Up @@ -156,28 +157,26 @@ void offsetsToUs(
videoEndUs = -1;

if (readVideoStream) {
AVRational vr = {(int)videoTimeBaseNum, (int)videoTimeBaseDen};
AVRational vr = AVRational{(int)videoTimeBaseNum, (int)videoTimeBaseDen};
if (videoStartPts > 0) {
videoStartUs = av_rescale_q(videoStartPts, vr, AV_TIME_BASE_Q);
videoStartUs = av_rescale_q(videoStartPts, vr, timeBaseQ);
}
if (videoEndPts > 0) {
// Add jitter to the end of the range to avoid conversion/rounding error.
// Small value 100us won't be enough to select the next frame, but enough
// to compensate rounding error due to the multiple conversions.
videoEndUs =
timeBaseJitterUs + av_rescale_q(videoEndPts, vr, AV_TIME_BASE_Q);
videoEndUs = timeBaseJitterUs + av_rescale_q(videoEndPts, vr, timeBaseQ);
}
} else if (readAudioStream) {
AVRational ar = {(int)audioTimeBaseNum, (int)audioTimeBaseDen};
AVRational ar = AVRational{(int)audioTimeBaseNum, (int)audioTimeBaseDen};
if (audioStartPts > 0) {
videoStartUs = av_rescale_q(audioStartPts, ar, AV_TIME_BASE_Q);
videoStartUs = av_rescale_q(audioStartPts, ar, timeBaseQ);
}
if (audioEndPts > 0) {
// Add jitter to the end of the range to avoid conversion/rounding error.
// Small value 100us won't be enough to select the next frame, but enough
// to compensate rounding error due to the multiple conversions.
videoEndUs =
timeBaseJitterUs + av_rescale_q(audioEndPts, ar, AV_TIME_BASE_Q);
videoEndUs = timeBaseJitterUs + av_rescale_q(audioEndPts, ar, timeBaseQ);
}
}
}
Expand Down Expand Up @@ -336,8 +335,8 @@ torch::List<torch::Tensor> readVideo(

videoDuration = torch::zeros({1}, torch::kLong);
int64_t* videoDurationData = videoDuration.data_ptr<int64_t>();
AVRational vr = {(int)header.num, (int)header.den};
videoDurationData[0] = av_rescale_q(header.duration, AV_TIME_BASE_Q, vr);
AVRational vr = AVRational{(int)header.num, (int)header.den};
videoDurationData[0] = av_rescale_q(header.duration, timeBaseQ, vr);
VLOG(1) << "Video decoding from " << logType << " [" << logMessage
<< "] filled video tensors";
} else {
Expand Down Expand Up @@ -398,8 +397,8 @@ torch::List<torch::Tensor> readVideo(

audioDuration = torch::zeros({1}, torch::kLong);
int64_t* audioDurationData = audioDuration.data_ptr<int64_t>();
AVRational ar = {(int)header.num, (int)header.den};
audioDurationData[0] = av_rescale_q(header.duration, AV_TIME_BASE_Q, ar);
AVRational ar = AVRational{(int)header.num, (int)header.den};
audioDurationData[0] = av_rescale_q(header.duration, timeBaseQ, ar);
VLOG(1) << "Video decoding from " << logType << " [" << logMessage
<< "] filled audio tensors";
} else {
Expand Down Expand Up @@ -598,8 +597,8 @@ torch::List<torch::Tensor> probeVideo(

videoDuration = torch::zeros({1}, torch::kLong);
int64_t* videoDurationData = videoDuration.data_ptr<int64_t>();
AVRational avr = {(int)header.num, (int)header.den};
videoDurationData[0] = av_rescale_q(header.duration, AV_TIME_BASE_Q, avr);
AVRational avr = AVRational{(int)header.num, (int)header.den};
videoDurationData[0] = av_rescale_q(header.duration, timeBaseQ, avr);

VLOG(2) << "Prob fps: " << header.fps << ", duration: " << header.duration
<< ", num: " << header.num << ", den: " << header.den;
Expand Down Expand Up @@ -631,8 +630,8 @@ torch::List<torch::Tensor> probeVideo(

audioDuration = torch::zeros({1}, torch::kLong);
int64_t* audioDurationData = audioDuration.data_ptr<int64_t>();
AVRational avr = {(int)header.num, (int)header.den};
audioDurationData[0] = av_rescale_q(header.duration, AV_TIME_BASE_Q, avr);
AVRational avr = AVRational{(int)header.num, (int)header.den};
audioDurationData[0] = av_rescale_q(header.duration, timeBaseQ, avr);

VLOG(2) << "Prob sample rate: " << format.samples
<< ", duration: " << header.duration << ", num: " << header.num
Expand Down
2 changes: 1 addition & 1 deletion torchvision/io/_video_opt.py
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ def _validate_pts(pts_range):
assert (
pts_range[0] <= pts_range[1]
), """Start pts should not be smaller than end pts, got
start pts: %d and end pts: %d""" % (
start pts: {0:d} and end pts: {1:d}""".format(
pts_range[0],
pts_range[1],
)
Expand Down