Prehistoric codecs/H.263+ entry #416

Closed
Yahweasel opened this issue Dec 1, 2021 · 12 comments

Labels
registry (pertains to new or updated registry entry)

Comments

@Yahweasel
Contributor

Short version: None of the video codecs in the registry are very approachable with a WebAssembly-compiled software-only encoder. Can we get something a bit more prehistoric, such as H.263+?

Long version:

Let me start with a not-hypothetical: If you're strange (bold?) enough to want to build a telephony app where the WebCodecs implementation might be a WebAssembly-compiled polyfill, what codecs should you use? For audio, it's simple: Opus. Opus is so blazingly fast there's essentially no reason to use anything but a polyfill. For video, the answer is a bit murky.

libvpx, bless its heart, can encode in real time, even when compiled to WebAssembly. But, its real-time mode is essentially "do as much as you can in the time budget, then give up". So, if the time budget is very limiting, the encoding looks awful. Real RealMedia vibes. H.264-era and later codecs are just too sophisticated to expect good quality with real-time encoding in WebAssembly, at least for the time being.

Would it be reasonable to add an earlier video codec to the registry? My proposal/suggestion is H.263v2/H.263+. Admittedly, the selection is purely self-serving—I can easily encode and decode H.263+ in real time in WebAssembly—but I think it's also a reasonable choice.

Some considerations:

  • There is no standard codec tag in MIME types for H.263+. Something would have to be invented, presumably "h263v2", "h263p", or "h263+".
  • Why not MPEG-4 Part 2? Patents. Besides, if real-time is the concern, then telephony is probably the application, and MPEG-4 Part 2's enhancements over H.263+ are mostly geared towards mastered video, not telephony applications. MPEG-4 Part 2 is backwards-compatible, so existing decoders can be used off-the-shelf for H.263+. Probably several of them are tweakable to emit H.263+ as well, but I'm insufficiently familiar to say this with any confidence.
  • Why not plain H.263? Its very limited selection of video frame sizes.
  • Why not even earlier? H.261, MPEG-1 video, H.262 (MPEG-2), and H.263 are all so similar to each other, and the practical software implementations I'm aware of perform so similarly, that I just can't find a compelling reason to prefer any of them.
  • Chrome presumably won't support H.263+. Well, that's what polyfills are for :)
@dalecurtis
Contributor

There's no limitation on what codecs can be put in the registry since all codecs are optional to implementers. Please feel free to contribute a registry entry for H.263.

@Yahweasel
Contributor Author

Basically, I'm a bit confused/concerned by CONTRIBUTING.md. As I'm not a member of the working group, I must "make a non-member patent licensing commitment", which seems a bit onerous given that I have no patents and don't intend to contribute anything that's under patent...

@dalecurtis
Contributor

Hmm, I'm not sure about all that. @tidoust

@Yahweasel
Contributor Author

> There's no limitation on what codecs can be put in the registry since all codecs are optional to implementers. Please feel free to contribute a registry entry for H.263.

Also, to be incredibly pedantic, the spec does place a (tiny) implementation burden even for unimplemented codecs: the isConfigSupported method is supposed to throw an exception if the codec string is invalid (whether malformed or simply not in the registry), but simply return an object with supported: false if the codec is merely unsupported. To be honest, I find this a bizarre design decision, but regardless, it does have the consequence that adding a codec to the codec registry isn't irrelevant to other implementations.

@sandersdan
Contributor

sandersdan commented Dec 1, 2021

Hmm, there is room to improve the spec text in this regard. The intent is that an implementation is free to throw for codecs they do not support at all (I'd expect a TypeError for an unknown enum value), and this follows from the registry being non-normative.

Implementations are expected to return supported: false if they can parse the enum value and could otherwise support the codec except that some particular configuration setting prevents that. FWIW, Chrome's implementation doesn't yet perfectly distinguish these cases and in a few situations can throw for valid codec strings.
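
A minimal sketch of how calling code might tell these outcomes apart, assuming only the behaviour described in this thread: `isConfigSupported` rejects with a `TypeError` when the codec string is not recognized at all, and resolves with `supported: false` when the codec is known but the configuration cannot be handled. The `probeCodec` helper, the `ConfigProbe` interface, and the `SupportResult` labels are hypothetical names introduced for illustration; they are not part of WebCodecs.

```typescript
// Hypothetical labels for the three outcomes discussed above.
type SupportResult = "supported" | "unsupported" | "unrecognized";

// Anything exposing an isConfigSupported() shaped like the WebCodecs static
// methods (e.g. VideoEncoder or AudioEncoder in a browser, or a polyfill).
interface ConfigProbe<C> {
  isConfigSupported(config: C): Promise<{ supported?: boolean }>;
}

async function probeCodec<C>(
  probe: ConfigProbe<C>,
  config: C
): Promise<SupportResult> {
  try {
    // Resolves normally when the implementation can parse the config.
    const { supported } = await probe.isConfigSupported(config);
    return supported ? "supported" : "unsupported";
  } catch (err) {
    // A TypeError is taken to mean the codec string (or config) is not
    // recognized by this implementation at all.
    if (err instanceof TypeError) {
      return "unrecognized";
    }
    // Anything else is unexpected; let the caller see it.
    throw err;
  }
}
```

In a browser, the probe would be `VideoEncoder` itself, e.g. `probeCodec(VideoEncoder, { codec: "h263p", width: 352, height: 288 })`, where `"h263p"` is one of the not-yet-registered strings floated earlier in this issue. Since implementations differ (Chrome can currently throw for some valid strings), treating `"unrecognized"` and `"unsupported"` the same way and falling back to a polyfill is probably the safest policy.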

@Yahweasel
Contributor Author

> FWIW, Chrome's implementation doesn't yet perfectly distinguish these cases and in a few situations can throw for valid codec strings.

I know, I submitted a bug about throwing on ulaw and alaw from AudioEncoder ;)

@Yahweasel
Contributor Author

> Hmm, there is room to improve the spec text in this regard. The intent is that an implementation is free to throw for codecs they do not support at all (I'd expect a TypeError for an unknown enum value), and this follows from the registry being non-normative.

Ahhh, OK, that does follow logically, but it fooled me. Since “A compliant implementation MAY support any combination of codec registrations or none at all”, a given codec string may not be valid for me, even though it's valid for you, or vice versa. Subtle.

@tidoust
Member

tidoust commented Dec 2, 2021

> Basically, I'm a bit confused/concerned by CONTRIBUTING.md. As I'm not a member of the working group, I must "make a non-member patent licensing commitment", which seems a bit onerous given that I have no patents and don't intend to contribute anything that's under patent...

In practice, it all depends on which document you target with your contributions and what your contributions are going to be.

The CONTRIBUTING.md file is a bit too strong in that it was written for a repo that only has one normative specification (in W3C parlance, a spec on the Recommendation track), which is how this repo started. Typically, if you make a normative contribution to the WebCodecs spec, we will ask you to make a non-member patent licensing commitment. That is a protective measure to get confidence that the final spec won't include IP encumbered features that are not covered by licensing commitments.

This repo now also contains the registry and the registrations, which are non-normative documents (in W3C parlance, they are not on the Recommendation track). Contributions to these documents do not require you to sign anything. Similarly, contributions to the WebCodecs spec that are purely editorial (non-normative) are also possible without signing anything.

@Yahweasel
Contributor Author

Thanks for the clarification! I'll make a PR then.

@chrisn
Member

chrisn commented Jan 27, 2022

Sorry for the delay, @Yahweasel. We're currently seeking input to help review the PR.

One question came up in our discussion: RFC 4629 describes the features added in H.263+ and H.263++ over H.263. The WebCodecs API does not support some of these enhanced capabilities, such as reference picture selection and SNR scalability. Would your implementation support configuration of features such as slice structured mode or independent segment decoding (ISD)? We'd like to understand which H.263 version you're targeting (in #416 (comment) you mention only H.263+), and which parts of the feature set you would be looking to implement.

@Yahweasel
Contributor Author

My polyfill is libavjs-webcodecs-polyfill, and it (sensibly for the name) uses libav.js, which in turn is a port of the libav* libraries from FFmpeg to WebAssembly. So, what I intend to implement is exactly what libav.js implements :). FFmpeg does not support H.263++/H.263v3 at all, only closely related codecs such as MPEG-4 Part 2 and Sorenson Spark. As far as I can tell, there's no way to munge it into perfect H.263++ compatibility.

My purpose for any of this is live chat, and FFmpeg's MPEG-4 Part 2 implementation really isn't geared towards that, while its H.263* codecs are, which is why I'm looking at H.263 at all. Its h263p encoder is fast enough for real time even in software, in WebAssembly (though ironically I'm currently having the problem that capturing a frame in, e.g., Firefox takes some 20 ms, blowing the budget on capture instead of encoding 🤪).

FFmpeg has two H.263 codecs: h263 and h263p. Their documentation isn't great, because (a) it's hardly the most popular codec in the suite, and (b) the implementations of H.263, H.263v2, MPEG-4 Part 2, Sorenson Spark, and the Microsoft MPEG-4 variants are all in the same file, separated by conditionals. h263 implements H.263v1, and h263p implements H.263v2. I believe that the following statement is true: h263 implements the entirety of H.263v1, and h263p does not use every annex for encoding, but supports decoding every annex that does not require external metadata. Therefore, standardizing the condition that a compliant decoder must support anything you can throw at it, while a compliant encoder may use any subset of annexes, fits FFmpeg's behavior.

@dalecurtis added the registry label (pertains to new or updated registry entry) on Mar 16, 2023
@Yahweasel
Contributor Author

(Closing as this doesn't seem necessary and is clearly going nowhere)

5 participants