Add core support for decoding from Python file-like objects #564

Merged: 61 commits, Mar 27, 2025
Changes from 41 commits
Commits
30fe734
Remove unused C++ decoder creation
scotts Mar 7, 2025
a093003
Add support for decoding from Python file-like objects
scotts Mar 14, 2025
1ca8443
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 14, 2025
53d0729
Forgot the new file. :/
scotts Mar 14, 2025
6bae172
Lint.
scotts Mar 14, 2025
70a8364
Remove unneded namespace alias.
scotts Mar 14, 2025
edce04b
Remove asserts.
scotts Mar 14, 2025
7741ae4
Cleanup pybind ops loading.
scotts Mar 14, 2025
0117a78
Explicitly say _pybind_ops is a module type
scotts Mar 14, 2025
681b9cc
Refactor AVIOContextHolder
scotts Mar 15, 2025
43d6dde
AVIOFileLikeContext refactoring
scotts Mar 15, 2025
d301f53
Better comment for AVIOContextHolder.
scotts Mar 17, 2025
a76d6a0
Break out AVIOContext stuff into their own header and source files
scotts Mar 17, 2025
fa2445e
Lint
scotts Mar 17, 2025
f56b259
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 17, 2025
ffdbbfb
Explicit assert on spec object
scotts Mar 17, 2025
c7d9df3
Manual exception raising
scotts Mar 17, 2025
5134aff
Undo in order to merge
scotts Mar 17, 2025
330b4d5
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 17, 2025
7993070
Raise ImportError on spec failure
scotts Mar 17, 2025
f4ece88
Print path
scotts Mar 17, 2025
2b4f213
Close paren
scotts Mar 17, 2025
01884b3
Load and importlib
scotts Mar 17, 2025
45342a7
Lint
scotts Mar 17, 2025
3608b50
Add FFmpeg version in exception traceback message
scotts Mar 18, 2025
f36d050
Make exception args tuple; refactor visiblity of context stuff
scotts Mar 18, 2025
6819070
Try find_spec
scotts Mar 18, 2025
89c8698
Trying import_module as backup
scotts Mar 18, 2025
59c129f
Using plain _trochcodec_pybind_ops
scotts Mar 18, 2025
e3d08e3
Better module loading error reporting
scotts Mar 18, 2025
e9a726f
Do both load and dynamic import
scotts Mar 18, 2025
591995f
Support both RawIOBase and BytesIO
scotts Mar 18, 2025
0ff2e69
Use Union instead of pipe
scotts Mar 18, 2025
c1555c2
Comments
scotts Mar 18, 2025
a3f6b9e
Update src/torchcodec/decoders/_core/AVIOContextHolder.h
scotts Mar 18, 2025
040321a
Update src/torchcodec/decoders/_core/AVIOFileLikeContext.cpp
scotts Mar 18, 2025
edbb5e7
Update src/torchcodec/decoders/_core/AVIOContextHolder.h
scotts Mar 19, 2025
7e6667c
Address comments
scotts Mar 19, 2025
ca28adc
Lint
scotts Mar 19, 2025
ceaa1a6
Test pass on Mac
scotts Mar 20, 2025
89d1a57
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 20, 2025
9b06e79
Better comments
scotts Mar 20, 2025
3896a70
More comments
scotts Mar 20, 2025
e9b6c76
More more comments
scotts Mar 20, 2025
d94b97c
Add pybind11 in some workflows
scotts Mar 20, 2025
bd598c3
Make sure custom_ops has Python dependencies
scotts Mar 21, 2025
bd3ecab
Add pre-build script to wheel building
scotts Mar 21, 2025
e7f49c4
Forgot a g
scotts Mar 21, 2025
0f8556a
Add pre-build script to rest of workflows
scotts Mar 21, 2025
66db272
Lint
scotts Mar 21, 2025
4ca294b
Better comments
scotts Mar 21, 2025
d280888
Use string_view instead of string for bytes
scotts Mar 21, 2025
52d5a6f
Remove todo
scotts Mar 21, 2025
9f2469e
Avoid negative buffer sizes
scotts Mar 21, 2025
72f4ffa
Better comment
scotts Mar 21, 2025
9e84c98
Update comments
scotts Mar 24, 2025
ae9b7b6
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 24, 2025
5964537
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 24, 2025
1ab8669
Merge branch 'main' of github.com:pytorch/torchcodec into file_like
scotts Mar 26, 2025
b034fff
More generic way to import pybind11
scotts Mar 26, 2025
0ac90ba
Assert origin is there
scotts Mar 26, 2025
19 changes: 10 additions & 9 deletions setup.py
@@ -68,7 +68,7 @@ def run(self):
         super().run()
 
     def build_extension(self, ext):
-        """Call our CMake build system to build libtorchcodec?.so"""
+        """Call our CMake build system to build libtorchcodec*.so"""
         # Setuptools was designed to build one extension (.so file) at a time,
         # calling this method for each Extension object. We're using a
         # CMake-based build where all our extensions are built together at once.
@@ -136,21 +136,22 @@ def copy_extensions_to_source(self):
         This is called by setuptools at the end of .run() during editable installs.
         """
         self.get_finalized_command("build_py")
-        extension = ""
+        extensions = []
         if sys.platform == "linux":
-            extension = "so"
+            extensions = ["so"]
         elif sys.platform == "darwin":
-            extension = "dylib"
+            extensions = ["dylib", "so"]
         else:
             raise NotImplementedError(
                 "Platforms other than linux/darwin are not supported yet"
             )
 
-        for so_file in self._install_prefix.glob(f"*.{extension}"):
-            assert "libtorchcodec" in so_file.name
-            destination = Path("src/torchcodec/") / so_file.name
-            print(f"Copying {so_file} to {destination}")
-            self.copy_file(so_file, destination, level=self.verbose)
+        for ext in extensions:
+            for lib_file in self._install_prefix.glob(f"*.{ext}"):
+                assert "libtorchcodec" in lib_file.name
+                destination = Path("src/torchcodec/") / lib_file.name
+                print(f"Copying {lib_file} to {destination}")
+                self.copy_file(lib_file, destination, level=self.verbose)
 
 
 NOT_A_LICENSE_VIOLATION_VAR = "I_CONFIRM_THIS_IS_NOT_A_LICENSE_VIOLATION"
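The setup.py change above replaces a single extension with a per-platform list, so that on macOS both `.dylib` and `.so` files are copied (presumably because the new pybind module is built with a `.so` suffix there). A minimal standalone sketch of that loop follows. The helper name `copy_built_libraries`, its arguments, and the use of `shutil.copy` in place of setuptools' `self.copy_file` are assumptions made for illustration; none of them are part of setup.py.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_built_libraries(install_prefix: Path, destination_dir: Path, platform: str) -> List[str]:
    """Hypothetical stand-in for the loop in copy_extensions_to_source()."""
    if platform == "linux":
        extensions = ["so"]
    elif platform == "darwin":
        # Both suffixes are searched on macOS, as in the diff above.
        extensions = ["dylib", "so"]
    else:
        raise NotImplementedError(
            "Platforms other than linux/darwin are not supported yet"
        )

    copied = []
    for ext in extensions:
        # sorted() only makes the order deterministic for this sketch.
        for lib_file in sorted(install_prefix.glob(f"*.{ext}")):
            assert "libtorchcodec" in lib_file.name
            shutil.copy(lib_file, destination_dir / lib_file.name)
            copied.append(lib_file.name)
    return copied
```

Calling it with `platform="darwin"` on a directory holding one `.dylib` and one `.so` copies both; with `platform="linux"` only the `.so` is picked up.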
68 changes: 68 additions & 0 deletions src/torchcodec/decoders/_core/AVIOBytesContext.cpp
@@ -0,0 +1,68 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#include "src/torchcodec/decoders/_core/AVIOBytesContext.h"
#include <torch/types.h>

namespace facebook::torchcodec {

AVIOBytesContext::AVIOBytesContext(const void* data, int64_t dataSize)
: dataContext_{static_cast<const uint8_t*>(data), dataSize, 0} {
TORCH_CHECK(data != nullptr, "Video data buffer cannot be nullptr!");
TORCH_CHECK(dataSize > 0, "Video data size must be positive");
createAVIOContext(&read, &seek, &dataContext_);
}

// The signature of this function is defined by FFmpeg.
int AVIOBytesContext::read(void* opaque, uint8_t* buf, int buf_size) {
auto dataContext = static_cast<DataContext*>(opaque);
TORCH_CHECK(
dataContext->current <= dataContext->size,
"Tried to read outside of the buffer: current=",
dataContext->current,
", size=",
dataContext->size);

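// Clamp in 64 bits before narrowing, so that more than INT_MAX remaining
// bytes cannot overflow the cast to int.
buf_size = static_cast<int>(FFMIN(
static_cast<int64_t>(buf_size), dataContext->size - dataContext->current));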
TORCH_CHECK(
buf_size >= 0,
"Tried to read negative bytes: buf_size=",
buf_size,
", size=",
dataContext->size,
", current=",
dataContext->current);

if (!buf_size) {
return AVERROR_EOF;
}
memcpy(buf, dataContext->data + dataContext->current, buf_size);
dataContext->current += buf_size;
return buf_size;
}

// The signature of this function is defined by FFmpeg.
int64_t AVIOBytesContext::seek(void* opaque, int64_t offset, int whence) {
auto dataContext = static_cast<DataContext*>(opaque);
int64_t ret = -1;

switch (whence) {
case AVSEEK_SIZE:
ret = dataContext->size;
break;
case SEEK_SET:
dataContext->current = offset;
ret = offset;
break;
default:
break;
}

return ret;
}

} // namespace facebook::torchcodec
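To make the read/seek contract of AVIOBytesContext easier to follow, here is a small Python model of the two callbacks. It is a sketch only: the class name `BytesContextModel` is hypothetical, and the buffer is a `bytearray` rather than FFmpeg's raw pointer. `AVSEEK_SIZE` and `AVERROR_EOF` use FFmpeg's numeric values, but here they act purely as sentinels.

```python
AVERROR_EOF = -541478725  # FFmpeg's end-of-file error code, used as a sentinel here
AVSEEK_SIZE = 0x10000     # FFmpeg's "report the stream size" whence value
SEEK_SET = 0


class BytesContextModel:
    """Hypothetical Python model of AVIOBytesContext's DataContext and callbacks."""

    def __init__(self, data: bytes):
        if len(data) == 0:
            raise ValueError("Video data size must be positive")
        self.data = data
        self.size = len(data)
        self.current = 0

    def read(self, buf: bytearray, buf_size: int) -> int:
        if self.current > self.size:
            raise ValueError("Tried to read outside of the buffer")
        # Never hand back more than what is left in the buffer.
        n = min(buf_size, self.size - self.current)
        if n == 0:
            return AVERROR_EOF
        buf[:n] = self.data[self.current:self.current + n]
        self.current += n
        return n

    def seek(self, offset: int, whence: int) -> int:
        if whence == AVSEEK_SIZE:
            return self.size  # size query; the position does not move
        if whence == SEEK_SET:
            self.current = offset
            return offset
        return -1  # other whence values are not handled
```

As in the C++ above, only `AVSEEK_SIZE` and `SEEK_SET` are handled, and a read at the end of the data returns the EOF sentinel rather than 0.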
32 changes: 32 additions & 0 deletions src/torchcodec/decoders/_core/AVIOBytesContext.h
@@ -0,0 +1,32 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#pragma once

#include "src/torchcodec/decoders/_core/AVIOContextHolder.h"

namespace facebook::torchcodec {

// Enables users to pass in the entire video as bytes. Our read and seek
// functions then traverse the bytes in memory.
class AVIOBytesContext : public AVIOContextHolder {
public:
explicit AVIOBytesContext(const void* data, int64_t dataSize);

private:
struct DataContext {
const uint8_t* data;
int64_t size;
int64_t current;
};

static int read(void* opaque, uint8_t* buf, int buf_size);
static int64_t seek(void* opaque, int64_t offset, int whence);

DataContext dataContext_;
};

} // namespace facebook::torchcodec
50 changes: 50 additions & 0 deletions src/torchcodec/decoders/_core/AVIOContextHolder.cpp
@@ -0,0 +1,50 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#include "src/torchcodec/decoders/_core/AVIOContextHolder.h"
#include <torch/types.h>

namespace facebook::torchcodec {

void AVIOContextHolder::createAVIOContext(
AVIOReadFunction read,
AVIOSeekFunction seek,
void* heldData,
int bufferSize) {
TORCH_CHECK(
bufferSize > 0,
"Buffer size must be greater than 0; is " + std::to_string(bufferSize));
auto buffer = static_cast<uint8_t*>(av_malloc(bufferSize));
TORCH_CHECK(
buffer != nullptr,
"Failed to allocate buffer of size " + std::to_string(bufferSize));

avioContext_.reset(avio_alloc_context(
buffer,
bufferSize,
0,
heldData,
read,
nullptr, // write function; not supported yet
seek));

if (!avioContext_) {
av_freep(&buffer);
TORCH_CHECK(false, "Failed to allocate AVIOContext");
}
}

AVIOContextHolder::~AVIOContextHolder() {
if (avioContext_) {
av_freep(&avioContext_->buffer);
}
}

AVIOContext* AVIOContextHolder::getAVIOContext() {
return avioContext_.get();
}

} // namespace facebook::torchcodec
65 changes: 65 additions & 0 deletions src/torchcodec/decoders/_core/AVIOContextHolder.h
@@ -0,0 +1,65 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#pragma once

#include "src/torchcodec/decoders/_core/FFMPEGCommon.h"

namespace facebook::torchcodec {

// The AVIOContextHolder serves several purposes:
//
// 1. It is a smart pointer for the AVIOContext. It has the logic to create
// a new AVIOContext and will appropriately free the AVIOContext when it
// goes out of scope. Note that this requires more than just having a
// UniqueAVIOContext, as the AVIOContext points to a buffer which must be
// freed.
// 2. It is a base class for AVIOContext specializations. When specializing an
// AVIOContext, we need to provide four things:
// 1. A read callback function.
// 2. A seek callback function.
// 3. A write callback function. (Not supported yet; it's for encoding.)
// 4. A pointer to some context object that has the same lifetime as the
// AVIOContext itself. This context object holds the custom state that
// tracks the custom behavior of reading, seeking and writing. It is
// provided upon AVIOContext creation and to the read, seek and
// write callback functions.
// While it's not required, it is natural for the derived classes to make
// all of the above members. Derived classes need to call
// createAVIOContext(), ideally in their constructor.
// 3. A generic handle for those that just need to manage having access to an
// AVIOContext, but aren't necessarily concerned with how it was customized:
// typically, the VideoDecoder.
class AVIOContextHolder {
public:
virtual ~AVIOContextHolder();
AVIOContext* getAVIOContext();

protected:
// Make constructor protected to prevent anyone from constructing
// an AVIOContextHolder without deriving from it. (Ordinarily this would be
// enforced by having pure virtual methods, but we don't have any.)
AVIOContextHolder() = default;

// These signatures are defined by FFmpeg.
using AVIOReadFunction = int (*)(void*, uint8_t*, int);
using AVIOSeekFunction = int64_t (*)(void*, int64_t, int);

// Deriving classes should call this function in their constructor.
void createAVIOContext(
AVIOReadFunction read,
AVIOSeekFunction seek,
void* heldData,
int bufferSize = defaultBufferSize);

private:
UniqueAVIOContext avioContext_;

// Defaults to 64 KB
static const int defaultBufferSize = 64 * 1024;
};

} // namespace facebook::torchcodec
68 changes: 68 additions & 0 deletions src/torchcodec/decoders/_core/AVIOFileLikeContext.cpp
@@ -0,0 +1,68 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#include "src/torchcodec/decoders/_core/AVIOFileLikeContext.h"
#include <torch/types.h>

namespace facebook::torchcodec {

AVIOFileLikeContext::AVIOFileLikeContext(py::object fileLike)
: fileLike_{UniquePyObject(new py::object(fileLike))} {
{
// TODO: Is it necessary to acquire the GIL here? Is it maybe even
// harmful? At the moment, this is only called from within a pybind
// function, and pybind guarantees we have the GIL.
py::gil_scoped_acquire gil;
TORCH_CHECK(
py::hasattr(fileLike, "read"),
"File like object must implement a read method.");
TORCH_CHECK(
py::hasattr(fileLike, "seek"),
"File like object must implement a seek method.");
}
createAVIOContext(&read, &seek, &fileLike_);
}

int AVIOFileLikeContext::read(void* opaque, uint8_t* buf, int buf_size) {
auto fileLike = static_cast<UniquePyObject*>(opaque);

// Note that we acquire the GIL outside of the loop. This is likely more
// efficient than releasing and acquiring it each loop iteration.
py::gil_scoped_acquire gil;
int num_read = 0;
while (num_read < buf_size) {
int request = buf_size - num_read;
auto chunk = static_cast<std::string>(
static_cast<py::bytes>((*fileLike)->attr("read")(request)));
int chunk_len = static_cast<int>(chunk.length());
if (chunk_len == 0) {
break;
}
TORCH_CHECK(
chunk_len <= request,
"Requested up to ",
request,
" bytes, but received ",
chunk_len,
" bytes. The given object does not conform to the read protocol of file objects.");
memcpy(buf, chunk.data(), chunk_len);
buf += chunk_len;
num_read += chunk_len;
}
return num_read == 0 ? AVERROR_EOF : num_read;
}
Review comment (Member) on the loop above: this can come later, but we should definitely add a test on an object that can only return a small number of bytes at once, so as to stress-test the while logic.


int64_t AVIOFileLikeContext::seek(void* opaque, int64_t offset, int whence) {
// We do not know the file size.
if (whence == AVSEEK_SIZE) {
return AVERROR(EIO);
}
auto fileLike = static_cast<UniquePyObject*>(opaque);
py::gil_scoped_acquire gil;
return py::cast<int64_t>((*fileLike)->attr("seek")(offset, whence));
}

} // namespace facebook::torchcodec
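One way to exercise the read loop with an object that returns only a few bytes per call, as the review comment asks, is sketched below in pure Python. It is an assumption-laden model, not the test that went into the PR. `TrickleReader`, `read_into` and `seek_file_like` are hypothetical names; `read_into` mirrors the C++ `while` loop, and `seek_file_like` mirrors the `AVSEEK_SIZE` early return. `AVERROR(EIO)` is taken to be `-errno.EIO`, which holds on POSIX platforms.

```python
import errno
import io

AVERROR_EOF = -541478725  # FFmpeg's end-of-file error code, used as a sentinel here
AVSEEK_SIZE = 0x10000     # FFmpeg's "report the stream size" whence value


class TrickleReader:
    """Hypothetical file-like object that returns at most max_chunk bytes per read()."""

    def __init__(self, data: bytes, max_chunk: int):
        self._inner = io.BytesIO(data)
        self._max_chunk = max_chunk
        self.read_calls = 0

    def read(self, size: int) -> bytes:
        self.read_calls += 1
        return self._inner.read(min(size, self._max_chunk))

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        return self._inner.seek(offset, whence)


def read_into(file_like, buf: bytearray, buf_size: int) -> int:
    """Model of AVIOFileLikeContext::read: keep asking until buf_size bytes or EOF."""
    num_read = 0
    while num_read < buf_size:
        request = buf_size - num_read
        chunk = file_like.read(request)
        if len(chunk) == 0:
            break  # the object has no more data
        if len(chunk) > request:
            raise ValueError("object does not conform to the read protocol")
        buf[num_read:num_read + len(chunk)] = chunk
        num_read += len(chunk)
    return AVERROR_EOF if num_read == 0 else num_read


def seek_file_like(file_like, offset: int, whence: int) -> int:
    """Model of AVIOFileLikeContext::seek: the size query is refused."""
    if whence == AVSEEK_SIZE:
        return -errno.EIO  # the file size is unknown
    return file_like.seek(offset, whence)
```

With `TrickleReader(b"0123456789", max_chunk=3)`, a single `read_into(reader, buf, 8)` needs three underlying `read()` calls (3 + 3 + 2 bytes) to fill the buffer, which is the path the reviewer wants covered.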
54 changes: 54 additions & 0 deletions src/torchcodec/decoders/_core/AVIOFileLikeContext.h
@@ -0,0 +1,54 @@
// Copyright (c) Meta Platforms, Inc. and affiliates.
// All rights reserved.
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#pragma once

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include "src/torchcodec/decoders/_core/AVIOContextHolder.h"

namespace py = pybind11;

namespace facebook::torchcodec {

// Enables users to pass in a Python file-like object. We then forward all read
// and seek calls back up to the methods on the Python object.
class AVIOFileLikeContext : public AVIOContextHolder {
public:
explicit AVIOFileLikeContext(py::object fileLike);

private:
static int read(void* opaque, uint8_t* buf, int buf_size);
static int64_t seek(void* opaque, int64_t offset, int whence);

// Note that we dynamically allocate the Python object because we need to
// strictly control when its destructor is called. We must hold the GIL
// when its destructor gets called, as it needs to update the reference
// count. It's easiest to control that when it's dynamic memory. Otherwise,
// we'd have to ensure whatever enclosing scope holds the object has the GIL,
// and that's, at least, hard. For all of the common pitfalls, see:
//
// https://pybind11.readthedocs.io/en/stable/advanced/misc.html#common-sources-of-global-interpreter-lock-errors
//
// We maintain a reference to the file-like object because the file-like
// object that was created on the Python side must live as long as our
// potential use. That is, even if there are no more references to the object
// on the Python side, we require that the object is still live.
struct PyObjectDeleter {
inline void operator()(py::object* obj) const {
if (obj) {
py::gil_scoped_acquire gil;
delete obj;
}
}
};

using UniquePyObject = std::unique_ptr<py::object, PyObjectDeleter>;
UniquePyObject fileLike_;
};

} // namespace facebook::torchcodec
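The lifetime requirement in the comment above (the file-like object must stay alive as long as the context may use it, even if Python code drops its own references) can be illustrated from the Python side. The following is a rough sketch under CPython's reference-counting behavior; `ContextModel`, `DummyFileLike` and `make_context` are hypothetical names, and the weak reference is only a probe for whether the object is still alive.

```python
import gc
import weakref


class DummyFileLike:
    """Hypothetical minimal object with the two methods the constructor checks for."""

    def read(self, size: int = -1) -> bytes:
        return b""

    def seek(self, offset: int, whence: int = 0) -> int:
        return 0


class ContextModel:
    """Hypothetical model of AVIOFileLikeContext's ownership of the file-like object."""

    def __init__(self, file_like):
        for method in ("read", "seek"):
            if not hasattr(file_like, method):
                raise TypeError(f"File like object must implement a {method} method.")
        # Holding a strong reference keeps the object alive for our use.
        self._file_like = file_like


def make_context():
    file_like = DummyFileLike()
    probe = weakref.ref(file_like)  # does not keep the object alive
    # After returning, the context holds the only strong reference.
    return ContextModel(file_like), probe
```

After `ctx, probe = make_context()`, `probe()` still returns the object because `ctx` holds it; once `ctx` is deleted, `probe()` returns `None`. In the C++ class, that final release is the step the custom deleter performs while holding the GIL.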