
Add optional MediaStreamTrack integration points to encoders and decoders. #266


Closed
chcunningham opened this issue Jun 3, 2021 · 15 comments
Labels: extension, obsolete

Comments

@chcunningham
Collaborator

In issue #211, koush@ raised the following idea for an option to send outputs directly to canvas (no callback). Splitting that off into its own issue for further consideration.

koush@ wrote:

Incidentally, some of this discussion would be moot if WebCodecs had an option to just let implementers change this:
[image: code rendering decoder output via an output callback and WebGL]

to this:
[image: code rendering decoder output straight to the canvas, no callback]

No callback, no message posts. Just an API to render straight to the canvas (which is what we want 99% of the time with a decoder), and the browser can figure out the best way to do that.
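
Roughly, the change being asked for, where renderTarget is a hypothetical option name and not a real API:

let decoder = new VideoDecoder({
    output: frame => { /* render via canvas/WebGL, then frame.close() */ },
    error: e => console.error(e),
});

// Hypothetical alternative: no output callback, the browser renders directly.
let directDecoder = new VideoDecoder({
    renderTarget: canvas, // invented option, not part of the spec
    error: e => console.error(e),
});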

@chcunningham added the extension label Jun 3, 2021
@sandersdan
Contributor

This proposal makes sense for realtime streams where the only goal is minimizing rendering latency. That's the case for many desktop-streaming scenarios, but in higher-latency cases (including regular video conferencing) a jitter buffer may be preferable. Once you get to regular media playback, it does not make sense to work this way.

This is most similar to the bitmaprenderer canvas context, although that has an extra async createImageBitmap() step and therefore could in theory render out-of-order in extreme cases:

let ctx = canvas.getContext("bitmaprenderer");

function output(frame) {
    createImageBitmap(frame).then(bitmap => ctx.transferFromImageBitmap(bitmap));
    frame.close();
}

A 2d context can also draw a VideoFrame without much code:

let ctx = canvas.getContext("2d");

function output(frame) {
    ctx.drawImage(frame, 0, 0);
    frame.close();
}

A canvas may not be the ideal output path for minimizing latency; it might make sense to use a video element instead. That quickly starts to look like the MediaStreamTrackGenerator API, which also supports audio. The direct approach proposed originally does have the advantage of guaranteeing that the frame won't be exposed to JS, though, which could allow for extra optimizations.
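
For reference, a minimal sketch of that path, assuming Chrome's MediaStreamTrackGenerator and an existing video element named video:

let generator = new MediaStreamTrackGenerator({ kind: "video" });
let writer = generator.writable.getWriter();

video.srcObject = new MediaStream([generator]);

function output(frame) {
    // Writing hands the frame off to the generator's sink; the generator
    // takes ownership, so no explicit frame.close() here.
    writer.write(frame);
}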

@koush

koush commented Jun 4, 2021

@youennf mentioned using media stream tracks as well.

#211 (comment)

This makes me a bit wary because media streams and video don’t have low latency guarantees, and may buffer the data, etc.

@sandersdan
Contributor

media streams and video don’t have low latency guarantees, and may buffer the data

This is accurate; the buffering properties of a MediaStreamTrack are not clearly specified. It was designed to support WebRTC, so it favors realtime use, though it does include the ability to jitter buffer. The Chrome implementation is realtime, except when used with MediaRecorder.

There is difficulty inherent in integrating MediaStreamTrack with WebCodecs, since WebCodecs supports both realtime and non-realtime uses. MediaStreamTrackGenerator adds Streams to the mix, which is an extra layer where frames can be buffered.

All that said, I would typically expect the buffers to be empty and the latency to be low. The goal is, after all, for MediaStreamTrackGenerator to be widely used by WebRTC apps.

@youennf
Contributor

youennf commented Jun 4, 2021

This makes me a bit wary because media streams and video don’t have low latency guarantees, and may buffer the data

I do not think MediaStreamTracks do any buffering in general (it might be worth checking what browsers do by creating a canvas capture track, generating a single frame on that track, and then assigning the track to a video element without generating new frames).
MediaStreamTrack basically allows JS to set up a pipe between a source and a sink. If buffering happens, it will be at the source (say, a WebCodecs decoder) or at the sink (say, an HTMLVideoElement).
Given the proposals to expose frame-level access to MediaStreamTrack, buffering and low-latency behaviours will become much clearer.
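
The check described above might look something like this (a sketch; canvas and video are assumed elements):

let stream = canvas.captureStream(0); // frameRate 0: frames only on requestFrame()
let [track] = stream.getVideoTracks();

// Generate exactly one frame on the track.
canvas.getContext("2d").fillRect(0, 0, canvas.width, canvas.height);
track.requestFrame();

// Assign the track to a video element without generating further frames
// and observe whether that single frame is rendered.
video.srcObject = stream;
video.play();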

As for jitter buffering, it might be done on raw video frames, which could be memory intensive.
It can also be done on compressed data, in which case a MediaStreamTrack is just fine.
There is no need for video frame access except in more advanced cases like tight synchronization with other data sources.

There is difficulty inherent in integrating MediaStreamTrack with WebCodecs, since WebCodecs supports both realtime and non-realtime uses

I am not really sure of that. Can you describe why that would be difficult?
In any case, it could be an output option as proposed in #199.

@koush

koush commented Jun 4, 2021

I do not think MediaStreamTracks do any buffering in general (it might be worth checking what browsers do by creating a canvas capture track, generating a frame on this track and then assigning the track to a video element without generating new frames).

Is this just a convenient byproduct of the implementation or a guarantee by the spec?

@youennf
Contributor

youennf commented Jun 4, 2021

Is this just a convenient byproduct of the implementation or a guarantee by the spec?

I think this is the spirit of the spec, especially given its first use is exposing realtime capture data like camera and microphone.
And I would expect consistency across the various implementations (@guidou, @jan-ivar, any thoughts?)

There is not a lot of normative wording given this is not super observable by the web application. One example:

  • The User Agent MUST always play the current data from the MediaStream and MUST NOT buffer.

@sandersdan
Contributor

There is difficulty inherent in integrating MediaStreamTrack with WebCodecs, since WebCodecs supports both realtime and non-realtime uses

I am not really sure of that. Can you describe why that would be difficult?

MediaStreamTrack is designed for realtime use: it will drop stale frames when new frames arrive. Apps are never guaranteed to receive every frame from a source, which is unsuitable for non-realtime uses.

Adding Streams, as MediaStreamTrackProcessor does, adds a buffer that operates after the stale-frame dropping occurs. The result can be that a stale frame is kept alive inside the Stream for a long time while newer frames are dropped. In this case the drop behavior isn't even keeping the "next frame" fresh.
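
For reference, the Streams layer in question comes from MediaStreamTrackProcessor, roughly like this (a sketch; track is an assumed video MediaStreamTrack and process() is a stand-in consumer):

let processor = new MediaStreamTrackProcessor({ track });
let reader = processor.readable.getReader();

async function consume() {
    while (true) {
        let { value: frame, done } = await reader.read();
        if (done) return;
        // If this loop stalls, a frame already queued in the stream stays
        // alive while the track drops newer frames upstream, so even the
        // "next frame" can be stale.
        process(frame);
        frame.close();
    }
}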

@jan-ivar
Member

jan-ivar commented Jun 5, 2021

Yeah MediaStreamTrack is always realtime AFAIK. Maybe you're confusing it with MSE?

Adding Streams, ... adds a buffer

Can you elaborate? My understanding of WHATWG streams is that they impose no buffering on their own. That is: streams can operate by passing (platform) objects around, and the default high-water mark is 1, which translates to no buffering.
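
For illustration, a sketch of how the default high-water mark of 1 exposes backpressure to a producer (render() is a stand-in sink):

let sink = new WritableStream({
    async write(frame) {
        await render(frame); // stand-in for the actual consumer
        frame.close();
    }
}); // default strategy: CountQueuingStrategy with highWaterMark = 1

let writer = sink.getWriter();

async function produce(frame) {
    await writer.ready; // resolves only once the queue has room
    writer.write(frame);
}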

@youennf
Contributor

youennf commented Jun 7, 2021

it will drop stale frames when new frames arrive

That is currently not observable to the web, so it really depends on the source and sink internal behaviors.

In another WebCodecs thread, we are discussing the possibility of adding a 'realtime' codec setting.
If we had such a setting, and we had a WebCodecs video decoder as a MediaStreamTrack source, we could define the source behaviour as follows:

  • If the WebCodecs decoder is 'realtime', generate frames as fast as possible. If sinks cannot keep up, drop outdated frames.
  • If the WebCodecs decoder is not 'realtime', generate a frame once all the track sinks have finished processing the last frame. New frame decoding would be paused/resumed based on the backpressure mechanism @guidou and I discussed elsewhere (which would be the first actual use of backpressure for tracks); a rough sketch follows below.
    This behavior can be defined and used whether or not WHATWG streams are used to expose individual frames from a MediaStreamTrack.
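
A rough sketch of that non-realtime mode from the app's side, using MediaStreamTrackGenerator's writable stream for backpressure (the 'realtime' setting itself is hypothetical, and today's VideoDecoder does not pause on its own, so this only approximates the behaviour described above):

let generator = new MediaStreamTrackGenerator({ kind: "video" });
let writer = generator.writable.getWriter();
let lastWrite = Promise.resolve();

let decoder = new VideoDecoder({
    output(frame) {
        // write() resolves once the generator's sink has accepted the frame.
        lastWrite = writer.write(frame);
    },
    error: e => console.error(e),
});
// decoder.configure(...) omitted.

async function feed(chunks) {
    for (let chunk of chunks) {
        await lastWrite; // crude backpressure: wait for the sink before decoding more
        decoder.decode(chunk);
    }
}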

@dalecurtis
Contributor

I don't think we could ever replace all the WebGL code in the first post. At best we could replace a canvas.drawImage() call, but the WebGL stuff has meaning. For these straight-to-canvas use cases, it seems like we instead want to use one of the proposed MediaStream creation mechanisms (MediaStreamTrackGenerator or MediaStreamTrackVideoController) and pipe them into a video element.

We could accept either of those in the VideoDecoderInit in place of the output callback as an extension.

@youennf
Contributor

youennf commented Jul 29, 2021

I do not see how MediaStreamTrackGenerator or MediaStreamTrackVideoController give benefits here over a plain MediaStreamTrack; can you clarify?
The intent in general is that WebCodecs decoder output would flow directly from the WebCodecs decoding thread to the rendering thread without ever touching the WebCodecs controlling thread.

As a side bonus, if we design native MediaStreamTrack transforms, we could have WebCodecs -> MediaStreamTrack -> transform -> rendering without any JS interruption.

@dalecurtis changed the title from "Send VideoDecoder outputs directly to canvas" to "Add optional MediaStream integration points to encoders and decoders." Jul 29, 2021
@dalecurtis
Contributor

dalecurtis commented Jul 29, 2021

MediaStreamTrack could also work; I just suggested the JS versions offhand. E.g., one possibility for optional interfaces:

MediaStreamTrack (Audio|Video)Decoder.getMediaStreamTrack()
void (Audio|Video)Encoder.configure(MediaStreamTrack)
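
For illustration, usage of that hypothetical shape might look like this (not a real API; today's constructor requires an output callback):

let decoder = new VideoDecoder({ error: e => console.error(e) }); // hypothetical: no output callback
decoder.configure(config);

let track = decoder.getMediaStreamTrack(); // hypothetical accessor
video.srcObject = new MediaStream([track]);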

Keep in mind MediaStreams run afoul of the same issues that streams do though, so we'll want to be careful here:
https://docs.google.com/document/d/10S-p3Ob5snRMjBqpBf5oWn6eYij1vos7cujHoOCCCAw/edit

In general though, we're supportive of optional MediaStreamTrack integration points. I agree they could be very useful for simplifying developers' lives and improving performance by decoupling entirely from JS-visible threads.

@aboba
Collaborator

aboba commented Jul 29, 2021

For MediaStreamTrack integration points to be usable in workers, it will also be necessary to support transfer of MediaStreamTracks. Did you mean getMediaStream() or getMediaStreamTrack() in the interface above?

@dalecurtis
Contributor

Thanks, I did mean MediaStreamTrack, since you can always construct a MediaStream from an array of tracks. I've updated the method signature in my comment to match the return value. Ultimately I defer to the MediaStream experts here though.

@dalecurtis changed the title from "Add optional MediaStream integration points to encoders and decoders." to "Add optional MediaStreamTrack integration points to encoders and decoders." Jul 29, 2021
@Djuffin added the obsolete label May 11, 2023
@aboba
Collaborator

aboba commented May 8, 2024

@Djuffin @padenot Can we close this?

@Djuffin closed this as not planned May 8, 2024