urllib missing voidresp breaks CacheFTPHandler #81403

Closed
hemberger mannequin opened this issue Jun 10, 2019 · 1 comment
Labels
3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@hemberger
Mannequin

hemberger mannequin commented Jun 10, 2019

BPO 37222
Nosy @orsenthil, @giampaolo, @hemberger
PRs
  • gh-81403: Fix for CacheFTPHandler in urllib #13951
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-06-10.23:38:21.930>
    labels = ['type-bug', 'library', '3.9']
    title = 'urllib missing voidresp breaks CacheFTPHandler'
    updated_at = <Date 2019-06-11.08:19:30.970>
    user = 'https://github.com/hemberger'

    bugs.python.org fields:

    activity = <Date 2019-06-11.08:19:30.970>
    actor = 'xtreak'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-06-10.23:38:21.930>
    creator = 'danh'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37222
    keywords = ['patch']
    message_count = 1.0
    messages = ['345155']
    nosy_count = 3.0
    nosy_names = ['orsenthil', 'giampaolo.rodola', 'danh']
    pr_nums = ['13951']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue37222'
    versions = ['Python 3.9']

    Linked PRs

    @hemberger
    Mannequin Author

    hemberger mannequin commented Jun 10, 2019

    When CacheFTPHandler is used in even the most basic way, a URLError is raised as soon as you try to reuse any of the FTP instances stored in the handler. This makes CacheFTPHandler unusable for its intended purpose. Note that the default FTPHandler avoids this issue by creating a new ftplib.FTP instance for each connection (and thus never reuses any of them).

    Here is a simple example illustrating the problem:

    """
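    import urllib.request as req
    import ftplib

    # Route ftp:// URLs through CacheFTPHandler, which keeps FTP connections open for reuse.
    opener = req.build_opener(req.CacheFTPHandler)
    req.install_opener(opener)

    # Class-level setting: every ftplib.FTP instance prints its control-channel traffic.
    ftplib.FTP.debugging = 2

    # The second iteration reuses the cached connection and fails.
    for _ in range(2):
        req.urlopen("ftp://www.pythontest.net/README", timeout=10)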
    """

    From the ftplib debugging output, we see the following communication between the client and server:

    """
    *cmd* 'TYPE I'
    *put* 'TYPE I\r\n'
    *get* '200 Switching to Binary mode.\n'
    *resp* '200 Switching to Binary mode.'
    *cmd* 'PASV'
    *put* 'PASV\r\n'
    *get* '227 Entering Passive Mode (159,89,235,38,39,111).\n'
    *resp* '227 Entering Passive Mode (159,89,235,38,39,111).'
    *cmd* 'RETR README'
    *put* 'RETR README\r\n'
    *get* '150 Opening BINARY mode data connection for README (123 bytes).\n'
    *resp* '150 Opening BINARY mode data connection for README (123 bytes).'
    *cmd* 'TYPE I'
    *put* 'TYPE I\r\n'
    *get* '226 Transfer complete.\n'
    *resp* '226 Transfer complete.'
    *cmd* 'PASV'
    *put* 'PASV\r\n'
    *get* '200 Switching to Binary mode.\n'
    *resp* '200 Switching to Binary mode.'
    ftplib.error_reply: 200 Switching to Binary mode.
    """

    The client and the server have gotten out of sync because of the missing voidresp() call: the client sends 'TYPE I' but receives the final '226 Transfer complete.' reply left over from the previous 'RETR README' command. When ftp.voidresp() is called at any point after ftp.ntransfercmd() and before the next command is sent (i.e. by reverting 2d51f68), we see the correct send/receive pattern:
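The desynchronization described above can be reproduced without a network using a toy model of the control channel. This is only a sketch: `FakeControlChannel` and `fetch` are hypothetical names invented for illustration, the replies are abbreviated, and none of this is ftplib or urllib code. It assumes only what the traces show, namely that RETR produces two replies (150, then 226) and that every other command produces one.

```python
from collections import deque


class FakeControlChannel:
    """Toy stand-in for an FTP control connection (hypothetical, not ftplib)."""

    def __init__(self):
        # Replies the server has sent but the client has not yet read.
        self.pending = deque()

    def send(self, cmd):
        # RETR yields a preliminary 150 and, after the data is sent, a final 226.
        if cmd.startswith("RETR"):
            self.pending.extend(
                ["150 Opening BINARY mode data connection", "226 Transfer complete."]
            )
        elif cmd == "TYPE I":
            self.pending.append("200 Switching to Binary mode.")
        elif cmd == "PASV":
            self.pending.append("227 Entering Passive Mode")

    def read_reply(self):
        return self.pending.popleft()


def fetch(chan, drain_final_reply):
    """Return the (command, reply) pairs the client observes for one download."""
    seen = []
    for cmd in ("TYPE I", "PASV", "RETR README"):
        chan.send(cmd)
        seen.append((cmd, chan.read_reply()))
    if drain_final_reply:
        chan.read_reply()  # plays the role of ftp.voidresp(): consume the 226
    return seen


# Without draining, the second download on the same channel reads stale replies.
broken = FakeControlChannel()
fetch(broken, drain_final_reply=False)
second_broken = fetch(broken, drain_final_reply=False)

# With draining, every command is paired with its own reply.
fixed = FakeControlChannel()
fetch(fixed, drain_final_reply=True)
second_fixed = fetch(fixed, drain_final_reply=True)

print(second_broken[0])
print(second_fixed[0])
```

In the undrained case the second 'TYPE I' is paired with the leftover 226 and the following 'PASV' with the 200, which is the same off-by-one pairing visible in the failing trace.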

    """
    *cmd* 'TYPE I'
    *put* 'TYPE I\r\n'
    *get* '200 Switching to Binary mode.\n'
    *resp* '200 Switching to Binary mode.'
    *cmd* 'PASV'
    *put* 'PASV\r\n'
    *get* '227 Entering Passive Mode (159,89,235,38,39,107).\n'
    *resp* '227 Entering Passive Mode (159,89,235,38,39,107).'
    *cmd* 'RETR README'
    *put* 'RETR README\r\n'
    *get* '150 Opening BINARY mode data connection for README (123 bytes).\n'
    *resp* '150 Opening BINARY mode data connection for README (123 bytes).'
    *get* '226 Transfer complete.\n'
    *resp* '226 Transfer complete.'
    *cmd* 'TYPE I'
    *put* 'TYPE I\r\n'
    *get* '200 Switching to Binary mode.\n'
    *resp* '200 Switching to Binary mode.'
    *cmd* 'PASV'
    *put* 'PASV\r\n'
    *get* '227 Entering Passive Mode (159,89,235,38,39,107).\n'
    *resp* '227 Entering Passive Mode (159,89,235,38,39,107).'
    *cmd* 'RETR README'
    *put* 'RETR README\r\n'
    *get* '150 Opening BINARY mode data connection for README (123 bytes).\n'
    *resp* '150 Opening BINARY mode data connection for README (123 bytes).'
    *get* '226 Transfer complete.\n'
    *resp* '226 Transfer complete.'
    """

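    By inspecting the methods of ftplib.FTP, we can see that every data transfer started through ntransfercmd() (via transfercmd()) is followed by a voidresp(); see e.g. retrbinary, retrlines, storbinary, and storlines. voidcmd likewise reads the reply to its own command with voidresp().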
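One rough way to repeat that inspection is to search the source of each transfer method for a voidresp() call. This is a minimal sketch, assuming the ftplib source is available to `inspect.getsource` (true for a normal CPython install); the helper name `methods_calling_voidresp` is made up for this example, and a textual match only shows that the call appears somewhere in the method, not where.

```python
import ftplib
import inspect

# The transfer methods named in the report.
TRANSFER_METHODS = ("retrbinary", "retrlines", "storbinary", "storlines")


def methods_calling_voidresp(names=TRANSFER_METHODS):
    """Map each ftplib.FTP method name to whether its source mentions voidresp()."""
    result = {}
    for name in names:
        source = inspect.getsource(getattr(ftplib.FTP, name))
        result[name] = "voidresp(" in source
    return result


report = methods_calling_voidresp()
for name, found in report.items():
    print(f"{name}: calls voidresp -> {found}")
```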

    I hope that some experts in urllib and ftplib can weigh in on any of the subtleties of this issue, but I think it's clear that the missing ftp.voidresp() call is a significant bug.

    --------------------------------------
    Some historical notes about this issue
    --------------------------------------

    This issue has been documented in a number of other bug reports, but I don't think any have addressed the complete breakage of the CacheFTPHandler that it causes.

    The breaking change was originally introduced as a resolution to bpo-16270. However, it's not clear from the comments why removing ftp.voidresp() from endtransfer() was believed to be the correct solution. In any case, with this commit reverted, both the test outlined in this report and the one in bpo-16270 work correctly.
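To make the role of that call concrete, here is a sketch of a wrapper whose endtransfer() drains the final reply. It is an illustration only: `TransferWrapper` and `RecordingFTP` are hypothetical classes written for this example and are not urllib's actual implementation. The stub stands in for ftplib.FTP so the sequence of calls can be seen without a server.

```python
import ftplib


class TransferWrapper:
    """Hypothetical wrapper around an FTP-like object; not urllib's real class."""

    def __init__(self, ftp):
        self.ftp = ftp
        self.busy = False

    def start(self, filename):
        self.ftp.voidcmd("TYPE I")
        conn, _size = self.ftp.ntransfercmd("RETR " + filename)
        self.busy = True
        return conn

    def endtransfer(self):
        if not self.busy:
            return
        self.busy = False
        try:
            # Consume the final "226 Transfer complete." reply so the
            # connection is back in sync before it is reused.
            self.ftp.voidresp()
        except ftplib.all_errors:
            pass


class RecordingFTP:
    """Stub that records calls so the sketch runs without a network."""

    def __init__(self):
        self.calls = []

    def voidcmd(self, cmd):
        self.calls.append(("voidcmd", cmd))

    def ntransfercmd(self, cmd):
        self.calls.append(("ntransfercmd", cmd))
        return object(), None  # (data connection, expected size)

    def voidresp(self):
        self.calls.append(("voidresp",))


stub = RecordingFTP()
wrapper = TransferWrapper(stub)
wrapper.start("README")
wrapper.endtransfer()
wrapper.endtransfer()  # no-op: busy is already False
print(stub.calls)
```

The busy flag keeps endtransfer() idempotent, so a second call does not try to read a reply that the server never sent.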

    @orsenthil has suggested this fix (to revert the change to endtransfer) in msg286020, and has explained his reasoning in detail in msg286016.

    @Ivan.Pozdeev has also explained this issue in msg282797 and provided a similar patch in bpo-28931, though it does more than just revert the breaking commit and I'm not sure what the additional changes are intended to fix.

    @hemberger hemberger mannequin added the 3.9 (only security fixes), stdlib (Python modules in the Lib dir), and type-bug (An unexpected behavior, bug, or error) labels on Jun 10, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    orsenthil added a commit that referenced this issue Apr 23, 2023
    bpo-37222: Fix for CacheFTPHandler in urllib
    
    A call to FTP.ntransfercmd must be followed by FTP.voidresp to clear
    the "end transfer" message. Without this, the client and server get
    out of sync, which will result in an error if the FTP instance is
    reused to open a second URL. This scenario occurs for even the most
    basic usage of CacheFTPHandler.
    
    Reverts the patch merged as a resolution to bpo-16270 and adds a test
    case for the CacheFTPHandler in test_urllib2net.py.
    
    Co-authored-by: Senthil Kumaran <[email protected]>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 23, 2023
    bpo-37222: Fix for CacheFTPHandler in urllib
    
    A call to FTP.ntransfercmd must be followed by FTP.voidresp to clear
    the "end transfer" message. Without this, the client and server get
    out of sync, which will result in an error if the FTP instance is
    reused to open a second URL. This scenario occurs for even the most
    basic usage of CacheFTPHandler.
    
    Reverts the patch merged as a resolution to bpo-16270 and adds a test
    case for the CacheFTPHandler in test_urllib2net.py.
    
    (cherry picked from commit e38bebb)
    
    Co-authored-by: Dan Hemberger <[email protected]>
    Co-authored-by: Senthil Kumaran <[email protected]>
    orsenthil added a commit that referenced this issue Apr 23, 2023
    * gh-81403: Fix for CacheFTPHandler in urllib (GH-13951)
    
    bpo-37222: Fix for CacheFTPHandler in urllib
    
    A call to FTP.ntransfercmd must be followed by FTP.voidresp to clear
    the "end transfer" message. Without this, the client and server get
    out of sync, which will result in an error if the FTP instance is
    reused to open a second URL. This scenario occurs for even the most
    basic usage of CacheFTPHandler.
    
    Reverts the patch merged as a resolution to bpo-16270 and adds a test
    case for the CacheFTPHandler in test_urllib2net.py.
    
    (cherry picked from commit e38bebb)
    
    Co-authored-by: Dan Hemberger <[email protected]>
    Co-authored-by: Senthil Kumaran <[email protected]>
    
    * Added NEWS entry.
    
    ---------
    
    Co-authored-by: Dan Hemberger <[email protected]>
    Co-authored-by: Senthil Kumaran <[email protected]>
    orsenthil added a commit that referenced this issue Apr 23, 2023