Clarify behaviour when reading past content-length #1


Open: wants to merge 1 commit into base: master
Conversation

Lukasa
Contributor

@Lukasa Lukasa commented Feb 11, 2016

One of the proposals from the web-sig: clarify the behaviour of servers when reading past Content-Length, or when Content-Length is not present.

@Lukasa Lukasa force-pushed the read_past_content_length branch from daae608 to a276576 Compare February 11, 2016 14:37
@cdent

cdent commented Feb 17, 2016

I'm not super keen on this one as it contains a lot of "may". Such ambiguity frequently leads to trouble.

This guideline sounds a bit like "obey the content length header if you feel like it".

👎 as it stands, but I think there's a way to clarify it so that it is a bit tighter.

@Lukasa
Contributor Author

Lukasa commented Feb 17, 2016

@GrahamDumpleton may well have some suggestions for alternative wording.

@GrahamDumpleton

All the issues around this have been described in detail in item (2) of:

All that the PEP 3333 revision ended up doing was to require that a WSGI server always ensure an empty string is returned as the sentinel for end of input. It did not make CONTENT_LENGTH advisory, nor did it make relying on the empty string alone as the end sentinel an acceptable approach. This means that compressed request content and chunked requests still cannot be readily implemented except by stepping outside the bounds of the WSGI specification.

I know this is not suggesting alternate wording, but people possibly need to understand the requirements for it in the first place before coming up with wording.
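
To make the sentinel requirement concrete, here is a minimal sketch of what a server might do on its side. The class name `LengthLimitedInput` and its constructor are hypothetical, not taken from PEP 3333 or from any particular server; the sketch assumes the server already knows the declared length and only shows that reads stop there and then return an empty bytes object.

```python
import io


class LengthLimitedInput:
    """Hypothetical wrapper a server could place around the raw body stream."""

    def __init__(self, raw, content_length):
        self._raw = raw
        self._remaining = content_length

    def read(self, size=-1):
        # Once the declared length is consumed, simulate EOF with b"".
        if self._remaining <= 0:
            return b""
        # Never read beyond the declared length, whatever the caller asks for.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._raw.read(size)
        self._remaining -= len(data)
        return data


# The raw stream also holds bytes of a following request that must not leak.
raw = io.BytesIO(b"hello" + b"NEXT REQUEST")
body = LengthLimitedInput(raw, content_length=5)
first = body.read()
second = body.read()
```

With a wrapper of this kind the application never sees bytes beyond the declared length, and any further `read()` simply returns the empty sentinel instead of blocking.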

@Lukasa
Contributor Author

Lukasa commented Feb 18, 2016

@GrahamDumpleton Yeah, I'd read that. Do you believe the current wording addresses that concern?

@benoitc

benoitc commented Feb 24, 2016

Shouldn't we make it explicit like HTTP does? Maybe say that the content-length should be "-1" when it is not known, telling the application it should wait for EOF?

@Lukasa
Contributor Author

Lukasa commented Feb 24, 2016

@benoitc If we sent Content-Length: -1, what would the application do to determine the content length? Presumably it would just read until the EOF condition is simulated. In that case, the nicer thing to do is to allow applications to generally do that, regardless of what content-length says.

In that instance, Content-Length becomes essentially advisory: while it's important to the server for the purposes of framing data, the application can choose to ignore it entirely and allow the server to enforce framing. This is in fact exactly how Content-Length works in HTTP/2: the EOF condition for a stream is set by a specific bit in the framing layer, so Content-Length becomes a validation check rather than an important part of the framing.

This is arguably substantially neater: applications that don't care about how large a request is can choose to just treat all requests like they're coming in via chunked encoding, and just ignore Content-Length altogether.
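
A rough sketch of the application side of that idea, assuming the server simulates EOF as proposed here. The helper name `read_body` and the `chunk_size` default are illustrative choices, not part of any specification; on a server that does not simulate EOF, reading this way could block.

```python
import io


def read_body(environ, chunk_size=8192):
    """Read the whole request body, relying only on the b"" end sentinel."""
    stream = environ["wsgi.input"]
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # the server signals end of input with empty bytes
            break
        chunks.append(chunk)
    return b"".join(chunks)


# No CONTENT_LENGTH key at all, as could happen with a chunked request.
environ = {"wsgi.input": io.BytesIO(b"a" * 20000)}
body = read_body(environ)
```

The function never consults CONTENT_LENGTH, so it behaves the same whether the request was framed by a length header or by chunked encoding.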

@benoitc

benoitc commented Feb 24, 2016

@Lukasa well, if you know the content-length you should at least be advised to respect it when reading, to avoid extra calls to the server or blocking. Not using it should rather be discouraged.

To clarify, the EOF is a must-have, but IMO it's a matter of making the absence of a known content length explicit. Not sure what the correct wording for it should be.

On the technical side, a negative length may be a way to tell the application that the server is receiving a stream. Maybe it can be made explicit some other way, though.

@Lukasa
Contributor Author

Lukasa commented Feb 24, 2016

So this specific PR concerns itself only with the simulated EOF becoming mandatory, which we all agree on. I have some concerns about how best to provide Content-Length in the absence of having it from the network, but given that PEP 3333 allows CONTENT_LENGTH to be empty or absent, it seems to me that doing that would be the correct way to signal not having it.
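
As an illustration of treating an empty or absent CONTENT_LENGTH as "length not known", an application could normalise the value along these lines. The helper name `declared_length` is hypothetical, and mapping malformed or negative values to `None` is an assumption of this sketch rather than something PEP 3333 prescribes.

```python
def declared_length(environ):
    """Return the declared body length, or None when it is not known."""
    raw = environ.get("CONTENT_LENGTH", "")
    if not raw:
        return None  # empty or absent: the server has no length to report
    try:
        length = int(raw)
    except ValueError:
        return None  # assumption: treat a malformed value as unknown
    return length if length >= 0 else None


known = declared_length({"CONTENT_LENGTH": "42"})
unknown = declared_length({})
```

An application that gets `None` back would then fall back to reading until the empty sentinel.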

@benoitc

benoitc commented May 14, 2016

@Lukasa a recent issue in gunicorn (benoitc/gunicorn#1265) made me think that any new spec should enforce the behaviour of the application when a body without content-length is received.

i.e. the application should fall back to a limited stream when there is either no content length or a chunked request. In other cases it should stream until EOF, IMO. I wonder what others think about it.

@rbtcollins

I don't think a spec can do that - we can certainly say that to be compliant a server needs to do X or Y or Z, but implementors will vary things :/

@Lukasa
Contributor Author

Lukasa commented May 15, 2016

@benoitc I believe that this change is likely to be sufficient in this case: WSGI 1.1 considers CONTENT_LENGTH to be a purely advisory value. Is that not the case?


5 participants