-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[Breaking Change Request] Fix HeaderValue parsing, toString(), and support null values #40709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @Hixie @matanlurey @dgrove @vsmenon for review and approval. |
It would also break Accept header value parsing. |
I don't have an objection to this change. I think it's fine to say that HeaderValue is only useful for the subset of headers that have the syntax it expects. In practice each header has its own syntax and so it's perfectly reasonable for different headers to have different semantics here, and no API is going to be able to support all of them regardless. That said, we should probably just deprecate this class entirely. As far as I can tell, it doesn't provide an interface that works for any header value. As noted above, Accept and Sec-WebSocket-Extensions don't work with this because it can't represent missing values in parameter extensions (not to mention that an Accept API should probably expose q weights as an int, should expose the default weight as 1000, etc). But there are lots of other differences. For example, TE headers allow whitespace around the = character of all their parameters except the q= value. The API only allows one parameter value per name, but there's nothing in the specs that says that parameter values in general can't be listed multiple times. The API doesn't reflect the order of the parameters, but in principle the order can be meaningful (Accept does this, for instance, with the "q" parameter separating the media range parameters from the accept extension parameters). There's probably other theoretical cases where this API doesn't quite work. |
@Hixie Note that the Accept header and others are supposed to be parsed by first splitting on the commas, and then parsing each of those using HeaderValue. The HeaderValue class was never able to parse a full Accept header on its own, despite that the documentation seems to imply so. With that in mind, I don't see how this change breaks parsing of the Accept header compared to the current situation? Can you give an example? I do understand the current situation is suboptimal, the question is whether this change breaks anything additional. |
Accept allows the |
No input |
lgtm |
I'm reading the status of this as approved for the breaking change, yes? |
Approved |
Thank you for the discussion. I've discussed the Accept header problem with @lrhn and we're currently intending to amend the proposed behavior. We'll instead embrace valueless parameters as a feature, so it can be used for |
I've amended the proposed behavior per the discussions and implemented it. Please see below: Intended changeThe The empty parameter value without double quotes (which is not allowed by the standards) is now parsed as the empty string rather than Invalid inputs with unbalanced double quotes are now rejected. E.g. Additionally The Likewise the empty value and values using characters outside of RFC 7230 tokens are now correctly implemented by double quoting such values with escape sequences. E.g: HeaderValue("v",
{"a": "A", "b": "(B)", "c": "", "d": "ø", "e": "\\\""}).toString() now gives RationaleThe Expected impactHTTP headers such as
Steps for mitigationClients that sent unbalanced quotes in parameters will need to be fixed. It's not expected that any clients would be doing this as such headers would be severely malformed. Clients that send parameters with empty values without double quotes, and expecting the Dart server to consider that different from no value, will need to be updated to wrap the value in double quotes. Such clients are unlikely as this is not allowed by the standards, although the ImplementationThis change was implemented in https://dart-review.googlesource.com/c/sdk/+/136620. |
The new implementation has passed a G3 global presubmit and has passed review. The remaining task is to decide on the exact behavior if the HeaderValue constructor is called with a null parameters map. Once that's resolved, I will proceed with landing this change. |
This is a breaking change. #40709 This change makes the HeaderValue parsing more strict in two invalid edge cases, supports parameters with null values as a feature, and fixes toString() so it always produces tokens or quoted-strings valid per RFC 7230 3.2.6. The empty parameter value without double quotes (which is not allowed by the standards) is now parsed as the empty string rather than null. E.g. HeaderValue.parse("v;a=").parameters now gives {"a": ""} rather than {"a": null}. Invalid inputs with unbalanced double quotes are now rejected. E.g. HeaderValue.parse('v;a="b').parameters will now throw a HttpException instead of giving {"a": "b"}. The HeaderValue.toString() method now supports parameters with null values by omitting the value. E.g.: HeaderValue("v", {"a": null, "b": "c"}).toString() now gives v; a; b=c This behavior can be used to implement some features in the Accept and Sec-WebSocket-Extensions headers. Likewise the empty value and values using characters outside of RFC 7230 3.2.6 tokens are now correctly implemented by double quoting such values with escape sequences. E.g.: HeaderValue("v", {"a": "A", "b": "(B)", "c": "", "d": "ø", "e": "\\\""}).toString() now gives v;a=A;b="(B)";c="";d="ø";e="\\\"" The NNBD migration required making subtle changes to some dart:io semantics in order to provide a better API. This change backports one of these semantic changes to the unmigrated SDK so any issues can be discovered now instead of blocking the future SDK unfork. Change-Id: Iafc790e03b6290232cac71fe14f995ce0f0b036b Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/136620 Commit-Queue: Jonas Termansen <[email protected]> Reviewed-by: Lasse R.H. Nielsen <[email protected]>
I've landed the breaking change. |
Note: This proposed behavior has been amended, see below for the current proposal
Intended changeTheHeaderValue
class will now parse parameters without a value as having the empty string as value instead of anull
value. This is the class used to parse semicolon delimited parameters used in theAccept
,Authorization
,Content-Type
, and other such HTTP headers. RFC 7231 section 3.1.1.1 and other locations require such parameters to have a value, butHeaderValue
was parsing empty parameters as null value which was wrong according to the specification. TheSec-WebSocket-Extensions
websocket header is a special case that allows absent values. TheHeaderValues
class now parses absent values and empty values as the empty string. For instanceused to give{b: "B", c: null, d: null, e: "E"}
but now gives{b: "B", c: "", d: "", e: "E"}
.AdditionallyHeaderValue.toString()
was incorrectly not quoting the empty parameter values, which wasn't allowed by the specification. I.e.used to givetext/plain; q=
but now givestext/plain; q=""
.RationaleThe non-nullable-by-default migration required making subtle changes to some dart:io semantics in order to provide a better API. TheHeaderValue.parameters
map never contains null values except if parameters not allowed by the standard have been parsed, or in the special case ofSec-WebSocket-Extensions
. To avoid code using these parameters from needing to null check the parsed parameter values, these invalid parameters are instead parsed as the empty string.These absent values also did not survice a roundtrip throughHeaderValue
asHeaderValue.parse('a; b')
used to givea; b=null
where null has no special meaning, but will now givea; b=""
which is closer to the original input. That meansHeaderValue
could not and still cannot be used and to construct theSec-WebSocket-Extensions
parameters that don't have values.Expected impactHTTP headers such asAccept
,Authorization
,Content-Type
that abide by the standard will be unaffected.Code handling theSec-WebSocket-Extensions
usingHeaderValue
may be affected if it doesn't simply check if the parameter is set in the map, but instead handles an empty string differently fromnull
. The relevant code and tests in the SDK has been updated.It's hard to rule out if someone has built other non-standard extensions, where the class has been used for valueless parameters, such asfoo; bar
where the header's value isfoo
with thebar
parameter, which is handled distinctly from if thebar
parameter had a value (foo; bar=qux
). Though theHeaderValue
can only receive such invalid parameters, as it was not able to construct them, instead producingfoo; bar=null
.Steps for mitigationIf code relies onHeaderValue
returns has a null value for valueless parameters, it'll need to be updated to allow the empty string as well. However, if code relies on being able to tell a parameter without a value apart from a parameter with the empty string, thenhead
then theHeaderValue
class can't solve that problem anymore and one will instead need to build a custom parser.Alternatively if this sort of non-standard valueless parameters turns out to exist in practice or there's non-SDK code parsingSec-WebSocket-Extensions
, we can decide not to make this breaking change, and embrace this non-standard behavior as a feature.ImplementationThis change was implemented in https://dart-review.googlesource.com/c/sdk/+/136620.Note: This proposed behavior has been amended, see below for the current proposal
cc @franklinyow @mit-mit @lrhn @athomas
The text was updated successfully, but these errors were encountered: