`chunk_size` for `iter_lines()` versus `iter_content()` #597

mponton · 2012-05-08T15:05:09Z

Seriously, I think you should merge this one. The latest release and the develop branch still have the default chunk_size "reversed" for iter_content() and iter_lines() contrarilly to what you mentioned here: https://github.com/kennethreitz/requests/pull/589#issuecomment-5567553

This pull request may be canned by Travis CI again, but as I mentioned in the above pull request, the issue is located either in the test logic or the way lines ending in \r\n in iter_lines() depending how you interpret the issue... (I think the later is the real problem as even without changing chunk_size, iter_lines() could return different results for the same input depending on where the line-ending ends inside the buffer read by iter_content()). Not knowing how you perceive the issue, I did not attempt to fix it.

----- Original idea:

Following discussion about issue #539 Kenneth agreed that a default
chunk_size of 1 for iter_lines() would make more sense, but
apparently changed chunk_size for iter_content() by mistake
instead, which did solve the iter_lines() issue but also changed the
default iter_content() chunk to 1 bytes which is not desirable.

Following discussion about issue #539 Kenneth agreed that a default `chunk_size` of 1 for `iter_lines()` would make more sense, but apparently changed `chunk_size` for `iter_content()` by mistake instead, which did solve the `iter_lines()` issue but also changed the default `iter_content()` chunk to 1 bytes which is not desirable.

travisbot · 2012-05-08T15:11:57Z

This pull request fails (merged 87f0a92 into 0308a28).

mponton · 2012-05-08T15:25:44Z

As expected. See comments above and for my previous pull request here: https://github.com/kennethreitz/requests/pull/589#issuecomment-5538019

berg · 2012-07-02T22:49:48Z

I agree with @mponton — the current default chunk_size value of 1 for iter_content() seems broken. We were bitten by this performance regression when upgrading to 0.13.2.

rboulton · 2012-08-14T09:34:56Z

Did this get merged, or ignored? Would have been useful to have had a comment made when the pull request was closed.

Looks like it was ignored, which ... means I had a massive performance regression just now.

kennethreitz · 2012-08-14T18:10:15Z

Duly noted but 'ignored' for now. Definitely needs to change, but we need to find a good answer to the issue noted above first.

slingamn · 2012-08-15T09:19:37Z

I don't entirely know what I'm talking about, but here are some guesses about how iter_content should maybe work:

The core use case for this feature seems to be streaming APIs like Twitter's "firehose", which use HTTP 1.1 and chunked encoding
As per Twitter's guidelines [https://dev.twitter.com/docs/streaming-apis/processing], the right way to consume an API like this is to use chunked encoding to tell how much data to read at a time. They suggest deferring the details of this to the underlying HTTP library.
Unfortunately, it looks like we can't reuse the chunked encoding implementation from httplib, because it's hard-wired to consume the entire document at once and return it. We might need to make our own --- possibly by copying and pasting code from httplib?
For HTTP 1.0 and non-chunked pages, we should probably just try to read the entire document and yield it as a single string?

As mentioned at #775, I am also interested in changing the iter_content behavior so that if the body has already been consumed, it yields a one-item iterator containing the whole body, e.g., iter([self.body]).

kennethreitz closed this Jul 27, 2012

slingamn mentioned this pull request Aug 16, 2012

permissive implementation of iter_content #781

Closed

slingamn mentioned this pull request Sep 7, 2012

umbrella ticket to resolve iteration / read size / chunked encoding questions #844

Closed

github-actions bot locked as resolved and limited conversation to collaborators Sep 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

`chunk_size` for `iter_lines()` versus `iter_content()` #597

`chunk_size` for `iter_lines()` versus `iter_content()` #597

Uh oh!

mponton commented May 8, 2012

Uh oh!

travisbot commented May 8, 2012

Uh oh!

mponton commented May 8, 2012

Uh oh!

berg commented Jul 2, 2012

Uh oh!

rboulton commented Aug 14, 2012

Uh oh!

kennethreitz commented Aug 14, 2012

Uh oh!

slingamn commented Aug 15, 2012

Uh oh!

Uh oh!

Uh oh!

chunk_size for iter_lines() versus iter_content() #597

chunk_size for iter_lines() versus iter_content() #597

Uh oh!

Conversation

mponton commented May 8, 2012

Uh oh!

travisbot commented May 8, 2012

Uh oh!

mponton commented May 8, 2012

Uh oh!

berg commented Jul 2, 2012

Uh oh!

rboulton commented Aug 14, 2012

Uh oh!

kennethreitz commented Aug 14, 2012

Uh oh!

slingamn commented Aug 15, 2012

Uh oh!

Uh oh!

`chunk_size` for `iter_lines()` versus `iter_content()` #597

`chunk_size` for `iter_lines()` versus `iter_content()` #597