Skip to content

Return RGB frames as output of GPU decoder #5191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jan 19, 2022

Conversation

prabhat00155
Copy link
Contributor

@prabhat00155 prabhat00155 commented Jan 11, 2022

GPU decoder currently outputs NV12 flattened frames. This PR changes changes the output type to RGB, also fixing the output shape.
Most of this work was done by @fmassa in his draft PR, I made some adjustments when calculating pyav output.
Resolves #5141 and #5145.

@facebook-github-bot
Copy link

facebook-github-bot commented Jan 11, 2022

💊 CI failures summary and remediations

As of commit 7c85da9 (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet.


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR Prabhat!

I made some comments, let me know what you think

uint8_t* frame_ptr = decoded_frame.data_ptr<uint8_t>();
const uint8_t* const source_arr[] = {
(const uint8_t* const)source_frame,
(const uint8_t* const)(source_frame + source_pitch * ((surface_height + 1) & ~1))};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like us to double-check this (surface_height + 1) & ~1) condition. Can you create some videos with odd dimensions to validate what is actually needed?

Copy link
Contributor Author

@prabhat00155 prabhat00155 Jan 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surface_height is different from luma_height which is directly related to the video dimensions.
I can revisit this when doing code refactoring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is the +1) & ~1 condition. I'm not sure if it's actually necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

luma height is aligned by 2, so the chroma offset should not be odd(chroma base address can't be odd memory location), hence the alignment.

@fmassa
Copy link
Member

fmassa commented Jan 17, 2022

Also, I think a bunch of things related to output_format like in

return (video_output_format == cudaVideoSurfaceFormat_NV12 ||
video_output_format == cudaVideoSurfaceFormat_P016)
can be cleaned now, right? As we will enforce / assume that the decoder returns NV12, and then we convert it to RGB.

@prabhat00155
Copy link
Contributor Author

Also, I think a bunch of things related to output_format like in

return (video_output_format == cudaVideoSurfaceFormat_NV12 ||
video_output_format == cudaVideoSurfaceFormat_P016)

can be cleaned now, right? As we will enforce / assume that the decoder returns NV12, and then we convert it to RGB.

Yeah, I'll clean these up in a separate PR. There are a few functions in there that may not be needed anymore.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving to unblock, but I think we need to add tests for odd-sized videos to validate that our implementation handles things correctly

@prabhat00155
Copy link
Contributor Author

I'm approving to unblock, but I think we need to add tests for odd-sized videos to validate that our implementation handles things correctly

I tested it locally using odd-sized videos. We'll could perhaps add some odd-sized videos in torchvision for testing.

@prabhat00155 prabhat00155 merged commit f4fd193 into pytorch:main Jan 19, 2022
@prabhat00155 prabhat00155 deleted the prabhat00155/rgb_kernel branch January 19, 2022 12:28
facebook-github-bot pushed a commit that referenced this pull request Jan 26, 2022
Summary:
* Return RGB frames as output of GPU decoder

* Move clamp to the conversion function

* Cleaned up a bit

* Remove utility functions from test

* Use data member width directly

* Fix linter error

Reviewed By: jdsgomes, prabhat00155

Differential Revision: D33739378

fbshipit-source-id: cea9f49fdefd777ec27a902947531c561686c80c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output decoded frames in RGB format
4 participants