Add core support for decoding from Python file-like objects #564
Merged
Changes from 41 commits
Commits
61 commits
30fe734 Remove unused C++ decoder creation (scotts)
a093003 Add support for decoding from Python file-like objects (scotts)
1ca8443 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
53d0729 Forgot the new file. :/ (scotts)
6bae172 Lint. (scotts)
70a8364 Remove unneded namespace alias. (scotts)
edce04b Remove asserts. (scotts)
7741ae4 Cleanup pybind ops loading. (scotts)
0117a78 Explicitly say _pybind_ops is a module type (scotts)
681b9cc Refactor AVIOContextHolder (scotts)
43d6dde AVIOFileLikeContext refactoring (scotts)
d301f53 Better comment for AVIOContextHolder. (scotts)
a76d6a0 Break out AVIOContext stuff into their own header and source files (scotts)
fa2445e Lint (scotts)
f56b259 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
ffdbbfb Explicit assert on spec object (scotts)
c7d9df3 Manual exception raising (scotts)
5134aff Undo in order to merge (scotts)
330b4d5 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
7993070 Raise ImportError on spec failure (scotts)
f4ece88 Print path (scotts)
2b4f213 Close paren (scotts)
01884b3 Load and importlib (scotts)
45342a7 Lint (scotts)
3608b50 Add FFmpeg version in exception traceback message (scotts)
f36d050 Make exception args tuple; refactor visiblity of context stuff (scotts)
6819070 Try find_spec (scotts)
89c8698 Trying import_module as backup (scotts)
59c129f Using plain _trochcodec_pybind_ops (scotts)
e3d08e3 Better module loading error reporting (scotts)
e9a726f Do both load and dynamic import (scotts)
591995f Support both RawIOBase and BytesIO (scotts)
0ff2e69 Use Union instead of pipe (scotts)
c1555c2 Comments (scotts)
a3f6b9e Update src/torchcodec/decoders/_core/AVIOContextHolder.h (scotts)
040321a Update src/torchcodec/decoders/_core/AVIOFileLikeContext.cpp (scotts)
edbb5e7 Update src/torchcodec/decoders/_core/AVIOContextHolder.h (scotts)
7e6667c Address comments (scotts)
ca28adc Lint (scotts)
ceaa1a6 Test pass on Mac (scotts)
89d1a57 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
9b06e79 Better comments (scotts)
3896a70 More comments (scotts)
e9b6c76 More more comments (scotts)
d94b97c Add pybind11 in some workflows (scotts)
bd598c3 Make sure custom_ops has Python dependencies (scotts)
bd3ecab Add pre-build script to wheel building (scotts)
e7f49c4 Forgot a g (scotts)
0f8556a Add pre-build script to rest of workflows (scotts)
66db272 Lint (scotts)
4ca294b Better comments (scotts)
d280888 Use string_view instead of string for bytes (scotts)
52d5a6f Remove todo (scotts)
9f2469e Avoid negative buffer sizes (scotts)
72f4ffa Better comment (scotts)
9e84c98 Update comments (scotts)
ae9b7b6 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
5964537 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
1ab8669 Merge branch 'main' of github.com:pytorch/torchcodec into file_like (scotts)
b034fff More generic way to import pybind11 (scotts)
0ac90ba Assert origin is there (scotts)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#include "src/torchcodec/decoders/_core/AVIOBytesContext.h" | ||
#include <torch/types.h> | ||
|
||
namespace facebook::torchcodec { | ||
|
||
AVIOBytesContext::AVIOBytesContext(const void* data, int64_t dataSize) | ||
: dataContext_{static_cast<const uint8_t*>(data), dataSize, 0} { | ||
TORCH_CHECK(data != nullptr, "Video data buffer cannot be nullptr!"); | ||
TORCH_CHECK(dataSize > 0, "Video data size must be positive"); | ||
createAVIOContext(&read, &seek, &dataContext_); | ||
} | ||
|
||
// The signature of this function is defined by FFmpeg. | |
int AVIOBytesContext::read(void* opaque, uint8_t* buf, int buf_size) { | ||
auto dataContext = static_cast<DataContext*>(opaque); | ||
TORCH_CHECK( | ||
dataContext->current <= dataContext->size, | ||
"Tried to read outside of the buffer: current=", | ||
dataContext->current, | ||
", size=", | ||
dataContext->size); | ||
|
||
buf_size = FFMIN( | ||
buf_size, static_cast<int>(dataContext->size - dataContext->current)); | ||
TORCH_CHECK( | ||
buf_size >= 0, | ||
"Tried to read negative bytes: buf_size=", | ||
buf_size, | ||
", size=", | ||
dataContext->size, | ||
", current=", | ||
dataContext->current); | ||
|
||
if (!buf_size) { | ||
return AVERROR_EOF; | ||
} | ||
memcpy(buf, dataContext->data + dataContext->current, buf_size); | ||
dataContext->current += buf_size; | ||
return buf_size; | ||
} | ||
|
||
// The signature of this function is defined by FFmpeg. | |
int64_t AVIOBytesContext::seek(void* opaque, int64_t offset, int whence) { | ||
auto dataContext = static_cast<DataContext*>(opaque); | ||
int64_t ret = -1; | ||
|
||
switch (whence) { | ||
case AVSEEK_SIZE: | ||
ret = dataContext->size; | ||
break; | ||
case SEEK_SET: | ||
dataContext->current = offset; | ||
ret = offset; | ||
break; | ||
default: | ||
break; | ||
} | ||
|
||
return ret; | ||
} | ||
|
||
} // namespace facebook::torchcodec | ||
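The read and seek callbacks above can be tried out without FFmpeg. The following is a minimal sketch, not the PR's code: `BufferCursor`, `readBytes`, `seekBytes`, `kEof` and `kSeekSize` are hypothetical names, and the two constants only stand in for `AVERROR_EOF` and `AVSEEK_SIZE`. It also assumes, unlike the diff, that a `SEEK_SET` offset outside the buffer should be rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical stand-ins for AVERROR_EOF and AVSEEK_SIZE so the sketch
// compiles without FFmpeg; the values are illustrative only.
constexpr int kEof = -1;
constexpr int kSeekSize = 0x10000;

// Mirrors the role of DataContext: a byte range plus a read position.
struct BufferCursor {
  const uint8_t* data;
  int64_t size;
  int64_t current;
};

// Same shape as the FFmpeg read callback: copy up to bufSize bytes, advance
// the cursor, and report end-of-stream once nothing is left.
int readBytes(void* opaque, uint8_t* buf, int bufSize) {
  assert(opaque != nullptr && buf != nullptr);
  auto* cursor = static_cast<BufferCursor*>(opaque);
  int64_t remaining = cursor->size - cursor->current;
  int toCopy = static_cast<int>(std::min<int64_t>(bufSize, remaining));
  if (toCopy <= 0) {
    return kEof;
  }
  std::memcpy(buf, cursor->data + cursor->current, toCopy);
  cursor->current += toCopy;
  return toCopy;
}

// Same shape as the FFmpeg seek callback. Assumption of this sketch only:
// SEEK_SET offsets outside [0, size] are rejected instead of stored.
int64_t seekBytes(void* opaque, int64_t offset, int whence) {
  assert(opaque != nullptr);
  auto* cursor = static_cast<BufferCursor*>(opaque);
  if (whence == kSeekSize) {
    return cursor->size;
  }
  if (whence == SEEK_SET && offset >= 0 && offset <= cursor->size) {
    cursor->current = offset;
    return offset;
  }
  return -1;
}
```

Rejecting out-of-range offsets at seek time means a bad position is reported where it originates rather than at the next read; whether that fits FFmpeg's expectations for every container format is not something this sketch establishes.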
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#pragma once | ||
|
||
#include "src/torchcodec/decoders/_core/AVIOContextHolder.h" | ||
|
||
namespace facebook::torchcodec { | ||
|
||
// Enables users to pass in the entire video as bytes. Our read and seek | ||
// functions then traverse the bytes in memory. | ||
class AVIOBytesContext : public AVIOContextHolder { | ||
public: | ||
explicit AVIOBytesContext(const void* data, int64_t dataSize); | ||
|
||
private: | ||
struct DataContext { | ||
const uint8_t* data; | ||
int64_t size; | ||
int64_t current; | ||
}; | ||
|
||
static int read(void* opaque, uint8_t* buf, int buf_size); | ||
static int64_t seek(void* opaque, int64_t offset, int whence); | ||
|
||
DataContext dataContext_; | ||
}; | ||
|
||
} // namespace facebook::torchcodec |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#include "src/torchcodec/decoders/_core/AVIOContextHolder.h" | ||
#include <torch/types.h> | ||
|
||
namespace facebook::torchcodec { | ||
|
||
void AVIOContextHolder::createAVIOContext( | ||
AVIOReadFunction read, | ||
AVIOSeekFunction seek, | ||
void* heldData, | ||
int bufferSize) { | ||
TORCH_CHECK( | ||
bufferSize > 0, | ||
"Buffer size must be greater than 0; is " + std::to_string(bufferSize)); | ||
auto buffer = static_cast<uint8_t*>(av_malloc(bufferSize)); | ||
TORCH_CHECK( | ||
buffer != nullptr, | ||
"Failed to allocate buffer of size " + std::to_string(bufferSize)); | ||
|
||
avioContext_.reset(avio_alloc_context( | ||
buffer, | ||
bufferSize, | ||
0, | ||
heldData, | ||
read, | ||
nullptr, // write function; not supported yet | ||
seek)); | ||
|
||
if (!avioContext_) { | ||
av_freep(&buffer); | ||
TORCH_CHECK(false, "Failed to allocate AVIOContext"); | ||
} | ||
} | ||
|
||
AVIOContextHolder::~AVIOContextHolder() { | ||
if (avioContext_) { | ||
av_freep(&avioContext_->buffer); | ||
} | ||
} | ||
|
||
AVIOContext* AVIOContextHolder::getAVIOContext() { | ||
return avioContext_.get(); | ||
} | ||
|
||
} // namespace facebook::torchcodec |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#pragma once | ||
|
||
#include "src/torchcodec/decoders/_core/FFMPEGCommon.h" | ||
|
||
namespace facebook::torchcodec { | ||
|
||
// The AVIOContextHolder serves several purposes: | ||
// | ||
// 1. It is a smart pointer for the AVIOContext. It has the logic to create | ||
// a new AVIOContext and will appropriately free the AVIOContext when it | ||
// goes out of scope. Note that this requires more than just having a | ||
// UniqueAVIOContext, as the AVIOContext points to a buffer which must be | ||
// freed. | ||
// 2. It is a base class for AVIOContext specializations. When specializing an | |
// AVIOContext, we need to provide four things: | ||
// 1. A read callback function. | ||
// 2. A seek callback function. | ||
// 3. A write callback function. (Not supported yet; it's for encoding.) | ||
// 4. A pointer to some context object that has the same lifetime as the | ||
// AVIOContext itself. This context object holds the custom state that | ||
// tracks the custom behavior of reading, seeking and writing. It is | ||
// provided upon AVIOContext creation and to the read, seek and | ||
// write callback functions. | ||
// While it's not required, it is natural for the derived classes to make | |
// all of the above members. Derived classes need to call | |
// createAVIOContext(), ideally in their constructor. | |
// 3. A generic handle for those that just need to manage having access to an | ||
// AVIOContext, but aren't necessarily concerned with how it was customized: | ||
// typically, the VideoDecoder. | ||
class AVIOContextHolder { | ||
public: | ||
virtual ~AVIOContextHolder(); | ||
AVIOContext* getAVIOContext(); | ||
|
||
protected: | ||
// Make constructor protected to prevent anyone from constructing | |
// an AVIOContextHolder without deriving from it. (Ordinarily this would be | |
// enforced by having pure virtual methods, but we don't have any.) | |
AVIOContextHolder() = default; | ||
|
||
// These signatures are defined by FFmpeg. | ||
using AVIOReadFunction = int (*)(void*, uint8_t*, int); | ||
using AVIOSeekFunction = int64_t (*)(void*, int64_t, int); | ||
|
||
// Deriving classes should call this function in their constructor. | ||
void createAVIOContext( | ||
AVIOReadFunction read, | ||
AVIOSeekFunction seek, | ||
void* heldData, | ||
int bufferSize = defaultBufferSize); | ||
|
||
private: | ||
UniqueAVIOContext avioContext_; | ||
|
||
// Defaults to 64 KB | ||
static const int defaultBufferSize = 64 * 1024; | ||
}; | ||
|
||
} // namespace facebook::torchcodec |
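The ownership pattern described in the header comment can be sketched without FFmpeg. This is a simplified illustration under stated assumptions, not the PR's implementation: `FakeContext`, `FakeContextDeleter`, `ContextHolder` and `BytesHolder` are hypothetical stand-ins, and `std::malloc`/`std::free` take the place of `av_malloc`/`av_freep`. It shows why a plain smart pointer is not enough: the holder's destructor releases the buffer first, and the `unique_ptr` member then releases the context.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for AVIOContext: it points at a buffer that it does
// not free on its own.
struct FakeContext {
  uint8_t* buffer;
};

// Stand-in for the deleter behind UniqueAVIOContext: frees only the context.
struct FakeContextDeleter {
  void operator()(FakeContext* ctx) const { delete ctx; }
};
using UniqueFakeContext = std::unique_ptr<FakeContext, FakeContextDeleter>;

class ContextHolder {
 public:
  virtual ~ContextHolder() {
    // Free the buffer first; the unique_ptr member frees the context after
    // this destructor body has run.
    if (context_) {
      std::free(context_->buffer);
      context_->buffer = nullptr;
    }
  }
  FakeContext* get() { return context_.get(); }

 protected:
  // Protected so that only derived classes can construct a holder.
  ContextHolder() = default;

  // Derived classes call this from their constructor.
  void create(int bufferSize) {
    assert(bufferSize > 0);
    auto* buffer = static_cast<uint8_t*>(std::malloc(bufferSize));
    assert(buffer != nullptr);
    context_.reset(new FakeContext{buffer});
  }

 private:
  UniqueFakeContext context_;
};

// Minimal derived class, analogous in role to AVIOBytesContext.
class BytesHolder : public ContextHolder {
 public:
  explicit BytesHolder(int bufferSize) { create(bufferSize); }
};
```

Code that only needs the context can take a `ContextHolder&` and call `get()`, without knowing which derived class supplied the callbacks.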
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#include "src/torchcodec/decoders/_core/AVIOFileLikeContext.h" | ||
#include <torch/types.h> | ||
|
||
namespace facebook::torchcodec { | ||
|
||
AVIOFileLikeContext::AVIOFileLikeContext(py::object fileLike) | ||
: fileLike_{UniquePyObject(new py::object(fileLike))} { | ||
{ | ||
// TODO: Is it necessary to acquire the GIL here? Is it maybe even | ||
// harmful? At the moment, this is only called from within a pybind | ||
// function, and pybind guarantees we have the GIL. | ||
py::gil_scoped_acquire gil; | ||
TORCH_CHECK( | ||
py::hasattr(fileLike, "read"), | ||
"File like object must implement a read method."); | ||
TORCH_CHECK( | ||
py::hasattr(fileLike, "seek"), | ||
"File like object must implement a seek method."); | ||
} | ||
createAVIOContext(&read, &seek, &fileLike_); | ||
} | ||
|
||
int AVIOFileLikeContext::read(void* opaque, uint8_t* buf, int buf_size) { | ||
auto fileLike = static_cast<UniquePyObject*>(opaque); | ||
|
||
// Note that we acquire the GIL outside of the loop. This is likely more | ||
// efficient than releasing and acquiring it each loop iteration. | ||
py::gil_scoped_acquire gil; | ||
int num_read = 0; | ||
while (num_read < buf_size) { | ||
int request = buf_size - num_read; | ||
auto chunk = static_cast<std::string>( | ||
static_cast<py::bytes>((*fileLike)->attr("read")(request))); | ||
int chunk_len = static_cast<int>(chunk.length()); | ||
if (chunk_len == 0) { | ||
break; | ||
} | ||
TORCH_CHECK( | |
chunk_len <= request, | |
"Requested up to ", | |
request, | |
" bytes, but received ", | |
chunk_len, | |
" bytes. The given object does not conform to the read protocol of file objects."); | |
memcpy(buf, chunk.data(), chunk_len); | ||
buf += chunk_len; | ||
num_read += chunk_len; | ||
} | ||
return num_read == 0 ? AVERROR_EOF : num_read; | ||
} | ||
Above: this can come later, but we should definitely add a test on an object that can only return a small amount of bytes at once, so as to stress test the |
||
|
||
int64_t AVIOFileLikeContext::seek(void* opaque, int64_t offset, int whence) { | ||
// We do not know the file size. | ||
if (whence == AVSEEK_SIZE) { | ||
return AVERROR(EIO); | ||
} | ||
auto fileLike = static_cast<UniquePyObject*>(opaque); | ||
py::gil_scoped_acquire gil; | ||
return py::cast<int64_t>((*fileLike)->attr("seek")(offset, whence)); | ||
} | ||
|
||
} // namespace facebook::torchcodec |
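The read loop above keeps asking the file-like object for more bytes until the buffer is full or the object returns nothing. A minimal way to exercise that logic without Python is sketched below. `TrickleSource`, `readFully` and `kEndOfStream` are hypothetical names: the source hands back at most `maxChunk` bytes per call, in the spirit of the short-read stress test suggested in the review comment, and `kEndOfStream` only stands in for `AVERROR_EOF`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Illustrative stand-in for AVERROR_EOF.
constexpr int kEndOfStream = -1;

// Hypothetical source that returns at most maxChunk bytes per read call,
// like a file-like object that produces short reads.
struct TrickleSource {
  std::string data;
  std::size_t position;
  std::size_t maxChunk;

  std::string read(int request) {
    std::size_t n = std::min(
        {static_cast<std::size_t>(request), maxChunk, data.size() - position});
    std::string chunk = data.substr(position, n);
    position += n;
    return chunk;
  }
};

// Same loop structure as the read callback: accumulate chunks until the
// buffer is full or the source is exhausted.
int readFully(TrickleSource& source, uint8_t* buf, int bufSize) {
  assert(buf != nullptr && bufSize >= 0);
  int numRead = 0;
  while (numRead < bufSize) {
    std::string chunk = source.read(bufSize - numRead);
    if (chunk.empty()) {
      break;
    }
    std::memcpy(buf + numRead, chunk.data(), chunk.size());
    numRead += static_cast<int>(chunk.size());
  }
  return numRead == 0 ? kEndOfStream : numRead;
}
```

With `maxChunk` set to 3 and an 8-byte buffer, the loop needs three calls to fill the buffer, which is the case a single large read would never hit.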
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
// Copyright (c) Meta Platforms, Inc. and affiliates. | ||
// All rights reserved. | ||
// | ||
// This source code is licensed under the BSD-style license found in the | ||
// LICENSE file in the root directory of this source tree. | ||
|
||
#pragma once | ||
|
||
#include <pybind11/pybind11.h> | ||
#include <pybind11/stl.h> | ||
|
||
#include "src/torchcodec/decoders/_core/AVIOContextHolder.h" | ||
|
||
namespace py = pybind11; | ||
|
||
namespace facebook::torchcodec { | ||
|
||
// Enables users to pass in a Python file-like object. We then forward all read | |
// and seek calls back up to the methods on the Python object. | ||
class AVIOFileLikeContext : public AVIOContextHolder { | ||
public: | ||
explicit AVIOFileLikeContext(py::object fileLike); | ||
|
||
private: | ||
static int read(void* opaque, uint8_t* buf, int buf_size); | ||
static int64_t seek(void* opaque, int64_t offset, int whence); | ||
|
||
// Note that we dynamically allocate the Python object because we need to | ||
// strictly control when its destructor is called. We must hold the GIL | ||
// when its destructor gets called, as it needs to update the reference | ||
// count. It's easiest to control that when it's dynamic memory. Otherwise, | ||
// we'd have to ensure whatever enclosing scope holds the object has the GIL, | ||
// and that's, at least, hard. For all of the common pitfalls, see: | ||
// | ||
// https://pybind11.readthedocs.io/en/stable/advanced/misc.html#common-sources-of-global-interpreter-lock-errors | ||
// | ||
// We maintain a reference to the file-like object because the file-like | ||
// object that was created on the Python side must live as long as our | ||
// potential use. That is, even if there are no more references to the object | ||
// on the Python side, we require that the object is still live. | ||
struct PyObjectDeleter { | ||
inline void operator()(py::object* obj) const { | ||
if (obj) { | ||
py::gil_scoped_acquire gil; | ||
delete obj; | ||
} | ||
} | ||
}; | ||
|
||
using UniquePyObject = std::unique_ptr<py::object, PyObjectDeleter>; | ||
UniquePyObject fileLike_; | ||
}; | ||
|
||
} // namespace facebook::torchcodec |
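The custom-deleter idea behind `UniquePyObject` can be shown without pybind11. In the sketch below, which is an analogy rather than the PR's code, a `std::mutex` plays the role of the GIL and `Handle` plays the role of `py::object`; `interpreterLock`, `lockHeld`, `Handle`, `LockedDeleter` and `UniqueHandle` are all hypothetical names. The point is only that the deleter, not the enclosing scope, is responsible for taking the lock before the destructor runs.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the GIL: a process-wide lock plus a flag that
// records whether the lock is currently held.
std::mutex interpreterLock;
bool lockHeld = false;

// Stand-in for py::object: its destructor records whether the lock was held
// at the moment of destruction.
struct Handle {
  bool* destroyedUnderLock;
  ~Handle() {
    assert(destroyedUnderLock != nullptr);
    *destroyedUnderLock = lockHeld;
  }
};

// Deleter that takes the lock before destroying the object, mirroring the
// role of PyObjectDeleter.
struct LockedDeleter {
  void operator()(Handle* handle) const {
    if (handle) {
      std::lock_guard<std::mutex> guard(interpreterLock);
      lockHeld = true;
      delete handle;
      lockHeld = false;
    }
  }
};

using UniqueHandle = std::unique_ptr<Handle, LockedDeleter>;
```

Because the lock is taken inside the deleter, a `UniqueHandle` can be a member of any class and be destroyed from any scope without that scope having to hold the lock itself.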