Support CUDA frame in FilterGraph #3183

Closed
wants to merge 1 commit into from

@mthrok (Collaborator) commented Mar 17, 2023

This commit adds CUDA frame support to FilterGraph.

It initializes a CUDA frames context and attaches it to FilterGraph,
so that CUDA frames can be processed in FilterGraph.

As a result, it enables

  1. CUDA filters such as scale_cuda.
  2. Proper retrieval of the pixel format coming out of FilterGraph when
    CUDA HW acceleration is enabled. (Currently it is reported as "cuda".)

Resolves #3159
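
To illustrate the second point: with FFmpeg hardware decoding, a frame that lives in GPU memory has its format set to the opaque hardware format (`AV_PIX_FMT_CUDA`, printed as "cuda"), while the actual data layout (e.g. NV12) is recorded as `sw_format` in the attached hardware frames context. The sketch below models that lookup with simplified stand-in types. `Frame`, `HWFramesContext` and `resolve_pixel_format` are hypothetical names used for illustration only; they are not FFmpeg or torchaudio APIs.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for FFmpeg's AVHWFramesContext.
struct HWFramesContext {
  std::string sw_format; // actual data layout on the device, e.g. "nv12"
};

// Hypothetical stand-in for AVFrame.
struct Frame {
  std::string format; // "cuda" for frames that live in GPU memory
  std::shared_ptr<HWFramesContext> hw_frames_ctx; // null for CPU frames
};

// Report the pixel format a consumer actually has to handle.
std::string resolve_pixel_format(const Frame& frame) {
  if (frame.format == "cuda" && frame.hw_frames_ctx) {
    return frame.hw_frames_ctx->sw_format;
  }
  return frame.format;
}

void example() {
  Frame cpu{"yuv420p", nullptr};
  Frame gpu{"cuda", std::make_shared<HWFramesContext>(HWFramesContext{"nv12"})};
  assert(resolve_pixel_format(cpu) == "yuv420p");
  assert(resolve_pixel_format(gpu) == "nv12");
}
```

Without an attached frames context, the model can only report "cuda", which mirrors the behavior described above as the current state.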

@facebook-github-bot
Contributor

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mthrok added a commit to mthrok/audio that referenced this pull request Mar 17, 2023
Summary:
Pull Request resolved: pytorch#3183

Differential Revision: D44183722

Pulled By: mthrok

fbshipit-source-id: 9ae9a925df5a5e1770e32917e097a7d03853b6b9
@mthrok mthrok force-pushed the cuda-filter-graph branch from 7766daf to 45c0a25 Compare March 17, 2023 23:28
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D44183722

@mthrok mthrok force-pushed the cuda-filter-graph branch from 45c0a25 to 25a191b Compare March 17, 2023 23:35
@mthrok mthrok force-pushed the cuda-filter-graph branch from b0342f5 to 0415a77 Compare March 19, 2023 17:39
@mthrok mthrok force-pushed the cuda-filter-graph branch from 0415a77 to ea29cbc Compare March 19, 2023 17:47
@mthrok mthrok force-pushed the cuda-filter-graph branch from 1d2b21c to 2e2fe7a Compare March 19, 2023 22:08
@mthrok mthrok force-pushed the cuda-filter-graph branch from 4efb0a8 to 187a688 Compare March 20, 2023 04:25
@mthrok mthrok force-pushed the cuda-filter-graph branch from 187a688 to a88ca35 Compare March 20, 2023 04:30
@mthrok mthrok force-pushed the cuda-filter-graph branch from a88ca35 to 90fc533 Compare March 20, 2023 04:35
@facebook-github-bot
Contributor

@mthrok merged this pull request in c5b9655.

@github-actions

Hey @mthrok.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py)

@mthrok mthrok deleted the cuda-filter-graph branch March 20, 2023 15:13
mthrok added a commit to mthrok/audio that referenced this pull request Mar 20, 2023
Summary:
Fix the GPU memory leak introduced in pytorch#3183.

The HW frames context is owned by AVCodecContext.
The removed `av_buffer_ref` call incremented the reference count unnecessarily
and prevented AVCodecContext from freeing the resource.
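
The leak can be pictured with a small reference-counting model. FFmpeg's `AVBufferRef` frees its buffer only when the last reference is released, so a reference that is taken with `av_buffer_ref` and never released keeps the buffer alive. The sketch below uses `std::shared_ptr` as a stand-in for that mechanism. `HWFramesContext`, `CodecContext`, `live_pools` and `pools_left_after_close` are hypothetical names for illustration, not the actual torchaudio code.

```cpp
#include <cassert>
#include <memory>

// Counts live GPU frame pools, so that a leak is observable.
static int live_pools = 0;

struct HWFramesContext {
  HWFramesContext() { ++live_pools; }
  ~HWFramesContext() { --live_pools; }
};

// Hypothetical stand-in for AVCodecContext, which owns the frames context.
struct CodecContext {
  std::shared_ptr<HWFramesContext> hw_frames_ctx =
      std::make_shared<HWFramesContext>();
};

// Returns how many pools are still alive after the codec context is destroyed.
int pools_left_after_close(bool take_extra_reference) {
  // Models a reference that is taken but never released (the leak).
  std::shared_ptr<HWFramesContext>* leaked = nullptr;
  {
    CodecContext codec;
    if (take_extra_reference) {
      leaked = new std::shared_ptr<HWFramesContext>(codec.hw_frames_ctx);
    }
  } // codec is destroyed here and drops its own reference
  const int remaining = live_pools;
  delete leaked; // clean up so that the model itself does not leak
  return remaining;
}

void example() {
  assert(pools_left_after_close(true) == 1);  // extra reference keeps the pool alive
  assert(pools_left_after_close(false) == 0); // the owner alone frees it
}
```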

Reviewed By: nateanl

Differential Revision: D44231876

fbshipit-source-id: 12967a7ed35b4a0a9adf2daf3f1e26db394be779
facebook-github-bot pushed a commit that referenced this pull request Mar 20, 2023
Summary:
Pull Request resolved: #3186

(Note: this ignores all push blocking failures!)

Reviewed By: nateanl

Differential Revision: D44231876

fbshipit-source-id: 9be2c33049dd02a3fa82a85271de7fb62e5b09ea
mthrok added a commit to mthrok/audio that referenced this pull request Mar 21, 2023
Summary:
Refactor the post-decode process in StreamReader.

The post-decode process consists of three parts:
1. preprocessing with FilterGraph
2. conversion to Tensor
3. storage in Buffer

The FilterGraph class is a thin wrapper around the AVFilterGraph
structure from FFmpeg and is agnostic to media type. Tensor conversion
and buffering, however, consist of a number of different code paths.

Currently, the conversion process is abstracted away with a
template, i.e. `Buffer<typename Conversion>`, and the whole
process is implemented in the `Sink` class, which consists of `FilterGraph`
and `Buffer`. `Buffer` internally contains the conversion logic, even
though conversion and buffering have nothing in common and are better
kept logically separate.

The new implementation replaces the `Sink` class with the `IPostDecodeProcess`
interface, which contains the three components.
Each variant of the post-decode process is expressed through the template
arguments of the actual implementation, i.e.

```c++
template <typename Converter, typename Buffer>
struct ProcessImpl : public IPostDecodeProcess { /* ... */ };
```

and stored as `unique_ptr<IPostDecodeProcess>` on `StreamProcessor`.
(This is the [functionoid pattern](https://isocpp.org/wiki/faq/pointers-to-members#functionoids),
which eliminates all the branching based on the media format.)
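
A minimal, self-contained sketch of this design follows. It is not the torchaudio implementation: the method names `process` and `pop_all`, the `Frame` type, the converter and buffer classes, and `make_process` are all hypothetical, and the FilterGraph step is omitted. It only illustrates how a templated implementation behind a common interface removes per-media-type branching from the caller.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical decoded frame; real code would carry an AVFrame.
struct Frame {
  int value;
};

// Common interface held by the stream processor.
struct IPostDecodeProcess {
  virtual ~IPostDecodeProcess() = default;
  virtual void process(const Frame& frame) = 0;
  virtual std::vector<std::string> pop_all() = 0;
};

// Converter and Buffer are fixed at construction time, so the hot path
// contains no branching on the media type.
template <typename Converter, typename Buffer>
struct ProcessImpl : public IPostDecodeProcess {
  Converter converter;
  Buffer buffer;

  void process(const Frame& frame) override {
    buffer.push(converter.convert(frame));
  }
  std::vector<std::string> pop_all() override {
    return buffer.pop_all();
  }
};

// Two hypothetical converters standing in for audio and video conversion.
struct AudioConverter {
  std::string convert(const Frame& f) const {
    return "audio:" + std::to_string(f.value);
  }
};
struct VideoConverter {
  std::string convert(const Frame& f) const {
    return "video:" + std::to_string(f.value);
  }
};

// A hypothetical buffer that keeps every converted item until popped.
struct UnchunkedBuffer {
  std::vector<std::string> items;
  void push(std::string item) { items.push_back(std::move(item)); }
  std::vector<std::string> pop_all() { return std::exchange(items, {}); }
};

// The only place where the media type is inspected.
std::unique_ptr<IPostDecodeProcess> make_process(bool is_video) {
  if (is_video) {
    return std::make_unique<ProcessImpl<VideoConverter, UnchunkedBuffer>>();
  }
  return std::make_unique<ProcessImpl<AudioConverter, UnchunkedBuffer>>();
}

void example() {
  std::unique_ptr<IPostDecodeProcess> p = make_process(true);
  p->process(Frame{1});
  p->process(Frame{2});
  const std::vector<std::string> out = p->pop_all();
  assert(out.size() == 2 && out[0] == "video:1" && out[1] == "video:2");
}
```

The caller holds only the `unique_ptr<IPostDecodeProcess>`; the choice of converter and buffer is made once, in the factory.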

Note:
This implementation was not possible in the initial version of
StreamReader, as there was no way of knowing the media attributes coming out
of `AVFilterGraph`. pytorch#3155 and pytorch#3183 added features to parse them properly,
so we can finally make the post-processing strongly typed.

Differential Revision: D44242647

fbshipit-source-id: 3789ba515bf9de917c94e0a301b67968a1209053
facebook-github-bot pushed a commit that referenced this pull request Mar 21, 2023
Summary:
Pull Request resolved: #3188

Reviewed By: hwangjeff

Differential Revision: D44242647

fbshipit-source-id: 96b8c6c72a2b8af4fa86a9b02292c65078ee265b
mthrok added a commit to mthrok/audio that referenced this pull request Mar 23, 2023
Summary:
With CUDA filter support added in pytorch#3183, it is now possible to change the pixel format of CUDA frames.

This commit adds conversion for YUV444P format.
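
For reference, YUV444P is a planar format in which the Y, U and V planes all have the full frame resolution, so a converter can copy the three planes into one contiguous channel-first buffer. The sketch below shows such a copy on CPU memory, honouring the per-plane row stride (`linesize` in FFmpeg terms). It is an illustration under those assumptions, not the CUDA code added by this commit; `Yuv444pFrame` and `to_chw` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical view of a planar YUV444P frame (mirrors AVFrame data/linesize).
struct Yuv444pFrame {
  const std::uint8_t* data[3]; // Y, U, V planes
  int linesize[3];             // bytes per row, may exceed width (padding)
  int width;
  int height;
};

// Copy the three full-resolution planes into a contiguous 3 x H x W buffer.
std::vector<std::uint8_t> to_chw(const Yuv444pFrame& f) {
  const std::size_t plane = static_cast<std::size_t>(f.width) * f.height;
  std::vector<std::uint8_t> out(3 * plane);
  for (int c = 0; c < 3; ++c) {
    for (int y = 0; y < f.height; ++y) {
      const std::uint8_t* src =
          f.data[c] + static_cast<std::size_t>(y) * f.linesize[c];
      std::uint8_t* dst =
          out.data() + c * plane + static_cast<std::size_t>(y) * f.width;
      std::memcpy(dst, src, f.width); // copy one row, skipping the padding
    }
  }
  return out;
}

void example() {
  // 2x2 frame whose rows are padded to 4 bytes.
  const std::uint8_t y[] = {1, 2, 0, 0, 3, 4, 0, 0};
  const std::uint8_t u[] = {5, 6, 0, 0, 7, 8, 0, 0};
  const std::uint8_t v[] = {9, 10, 0, 0, 11, 12, 0, 0};
  Yuv444pFrame f{{y, u, v}, {4, 4, 4}, 2, 2};
  const std::vector<std::uint8_t> expected = {1, 2, 3, 4, 5, 6,
                                              7, 8, 9, 10, 11, 12};
  assert(to_chw(f) == expected);
}
```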

Pull Request resolved: pytorch#3199

Differential Revision: D44323928

Pulled By: mthrok

fbshipit-source-id: e04566af867b4440f7f15c56869368feddf74ba3
facebook-github-bot pushed a commit that referenced this pull request Mar 23, 2023
Summary:
Pull Request resolved: #3199

Reviewed By: hwangjeff

Differential Revision: D44323928

Pulled By: mthrok

fbshipit-source-id: 6d9b205e7235df5f21e7d3e06166b3a169f1ae9f
Development

Successfully merging this pull request may close these issues.

Enable CUDA filter graph