Skip to content
This repository was archived by the owner on Nov 23, 2017. It is now read-only.

Serious performance loss (10 times) when NOT using .drain() #338

Closed
socketpair opened this issue May 2, 2016 · 8 comments
Closed

Serious performance loss (10 times) when NOT using .drain() #338

socketpair opened this issue May 2, 2016 · 8 comments

Comments

@socketpair
Copy link

socketpair commented May 2, 2016

Linux Ubuntu 16.04, Python 3.5, standard asyncio library.

#!/usr/bin/env python3.5

import asyncio
import os

async def test_read_arch(rs):
    while True:
        d = await rs.read(65536)
        if not d:
            break

async def test_write_arch(ws):
    b = b'x' * (1024 * 1024)
    for i in range(30):
        ws.write(b)
        # await ws.drain()  # <----------------- uncomment to speed program for 10 times (!)
    await ws.drain() # ensure everything is really written before closing.
    ws.close()

async def connect_write_pipe(file):
    loop = asyncio.get_event_loop()
    transport, protocol = await  loop.connect_write_pipe(asyncio.streams.FlowControlMixin, file)
    stream_writer = asyncio.StreamWriter(transport, protocol, None, loop)
    return stream_writer, transport


async def connect_read_pipe(file):
    loop = asyncio.get_event_loop()
    stream_reader = asyncio.StreamReader()
    transport, protocol = await loop.connect_read_pipe(lambda: asyncio.StreamReaderProtocol(stream_reader), file)
    return stream_reader, transport


async def amain():
    (r, w) = os.pipe()
    stream_reader, rtransport = await connect_read_pipe(open(r, 'rb'))
    stream_writer, wtransport = await connect_write_pipe(open(w, 'wb'))
    await asyncio.gather(
        test_write_arch(stream_writer),
        test_read_arch(stream_reader),
    )
    wtransport.close()
    rtransport.close()

def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(amain())


if __name__ == '__main__':
    main()
@socketpair
Copy link
Author

Very strange, .drain() should slow down program since adds pause. But in reality, it greatly speedup program. Maybe because pipe size is less than 1 MB and equal 65536 bytes?

@gvanrossum
Copy link
Member

I can repro on OSX too. Interestingly, if I increase the read() buffer to 1MB I still get the data fed in 64k blocks.

I think there's something quadratic going on in _UnixWritePipeTransport._write_ready(); if I double the data written the time it takes without the drain() call quadruples; with the drain that hardly makes a dent.

I have to stop analyzing this further, but it looks like _write_ready() repeatedly tries to write the entire buffer (which, without the drain() call, starts off at 30 MB). But only 64 KB gets written, and the rest gets pushed back into the buffer. This means a lot of copying of the data happens. Someone should probably try to avoid this behavior.

But in the mean time, this is exactly why drain() was invented, so use it. :-)

@socketpair
Copy link
Author

socketpair commented May 2, 2016

UNIX pipes are limited with 64K of kernel buffers by default, so no one can write at once more than 64K to a pipe.

We should use os.writev() instead of gathering buffers in one big chunk. This is why this sycall was invented. Do you approve that?

drain() was not invented to speedup operations, it was invented to prevent buffer bloat...But in my task buffer bloating is not a problem, so I should acheive good speed even without drain()

@gvanrossum
Copy link
Member

gvanrossum commented May 2, 2016 via email

@socketpair
Copy link
Author

I have created PR. But cannot easily patch tests....

Example:

    @mock.patch('os.write')
    def test_write(self, m_write):
        tr = self.write_pipe_transport()
        m_write.return_value = 4
        tr.write(b'data')
        m_write.assert_called_with(5, b'data')
        self.assertFalse(self.loop.writers)
        self.assertEqual([], tr._buffer)

@socketpair
Copy link
Author

socketpair commented May 2, 2016

Please help me, point how I can write test m_writev.assert_called_with(5, ????)

@gvanrossum
Copy link
Member

gvanrossum commented May 2, 2016 via email

@1st1
Copy link
Member

1st1 commented Sep 15, 2016

I think this issue can be closed since we've merged #385. Please open a new one specifically to discuss vectorized writes in 3.7.

@1st1 1st1 closed this as completed Sep 15, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants