Skip to content

http.client.HTTPMessage.getallmatchingheaders() always returns [] #49303

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mwatkins mannequin opened this issue Jan 25, 2009 · 17 comments
Open

http.client.HTTPMessage.getallmatchingheaders() always returns [] #49303

mwatkins mannequin opened this issue Jan 25, 2009 · 17 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mwatkins
Copy link
Mannequin

mwatkins mannequin commented Jan 25, 2009

BPO 5053
Nosy @orsenthil, @ezio-melotti, @bitdancer, @akheron, @berkerpeksag, @vadmium, @demianbrecht
Dependencies
  • bpo-5054: CGIHTTPRequestHandler.run_cgi() HTTP_ACCEPT improperly parsed
  • Files
  • http.client.py.patch: http.client.py.patch
  • remove-HTTPMessage.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2009-01-25.15:24:32.479>
    labels = ['type-bug', 'library']
    title = 'http.client.HTTPMessage.getallmatchingheaders() always returns []'
    updated_at = <Date 2017-03-19.05:40:00.847>
    user = 'https://bugs.python.org/mwatkins'

    bugs.python.org fields:

    activity = <Date 2017-03-19.05:40:00.847>
    actor = 'v+python'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2009-01-25.15:24:32.479>
    creator = 'mwatkins'
    dependencies = ['5054']
    files = ['12858', '38143']
    hgrepos = []
    issue_num = 5053
    keywords = ['patch']
    message_count = 12.0
    messages = ['80509', '80529', '80603', '80606', '80634', '148202', '164790', '164815', '164826', '235798', '236021', '289839']
    nosy_count = 11.0
    nosy_names = ['ggenellina', 'orsenthil', 'ezio.melotti', 'mwatkins', 'v+python', 'r.david.murray', 'catalin.iacob', 'petri.lehtinen', 'berker.peksag', 'martin.panter', 'demian.brecht']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue5053'
    versions = ['Python 3.4', 'Python 3.5']

    @mwatkins
    Copy link
    Mannequin Author

    mwatkins mannequin commented Jan 25, 2009

    HTTPMessage.getallmatchingheaders() stopped working sometime after
    Python 3.0 release. In a recent (1 day ago) svn update the
    implementation says the method was copied from rfc822.message; the
    Python 3.x implementation is broken (iterates through self.keys instead
    of header values). I've not looked back to see where the change was
    introduced but I do know that it broke post 3.0 release.

    But more importantly than the flaw in this method, the functionality is
    duplicated elsewhere in Python 3.x.

    I propose either deprecating getallmatchingheaders() or if it is to be
    kept for compatibility, the fix can be simply replacing the method body
    with:

    self.get_all(name)

    The docstring for get_all (defined in email.message) affirms that its
    output is indeed compatible with that which was intended for
    getallmatchingheaders().

    get_all(self, name, failobj=None) method of client.HTTPMessage instance
    Return a list of all the values for the named field.

    These will be sorted in the order they appeared in the original
    message, and may contain duplicates.  Any fields deleted and
    re-inserted are always appended to the header list.
    
    If no such fields exist, failobj is returned (defaults to None).
    

    I've tested the use of get_all against one web server (QP) which runs on
    both Python 2.x and 3.x.

    As a result of the broken getallmatchingheaders() the QP web server now
    uses for the same purpose as getallmatchingheaders() instead:
    get_all(name) (defined in email.message.Message in Py3k and
    getheaders(name) (defined in rfc822.Message).

    See also issues on documentation and changed API in bpo-4773, bpo-3428

    @mwatkins mwatkins mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 25, 2009
    @mwatkins
    Copy link
    Mannequin Author

    mwatkins mannequin commented Jan 25, 2009

    Trivial patch for http.client attached.

    @mwatkins mwatkins mannequin changed the title http.client.HTTPMessage.getallmatchingheaders() http.client.HTTPMessage.getallmatchingheaders() always returns [] Jan 26, 2009
    @ggenellina
    Copy link
    Mannequin

    ggenellina mannequin commented Jan 27, 2009

    I think unified diffs are preferred.
    Isn't there an existing test for this method?

    @mwatkins
    Copy link
    Mannequin Author

    mwatkins mannequin commented Jan 27, 2009

    Re diffs, noted for the future.
    Re tests:

    # py3k-devel/Lib/test % grep -r getallmatchingheaders *

    ... Returns nothing, so not only does the email package need a test for
    this but so does http.client.

    Incidentally test_mailbox.py has a test for the proposed alternative -
    get_all(), which I noted above. That's another good reason for ridding
    the world of getallmatchingheaders() or at least simply calling
    get_all() from within getallmatchingheaders() if compatibility is a
    legitimate concern.

    @mwatkins
    Copy link
    Mannequin Author

    mwatkins mannequin commented Jan 27, 2009

    Further investigation ( grep -r getallmatchingheaders Lib/* ) reveals
    that in addition to having no tests, and being implemented incorrectly
    in http.client, getallmatchingheaders() is called only once, in
    http.server; that code is also broken (I reported this yesterday in
    bpo-5053).

    Maybe Python 3 is where getallmatchingheaders can make a graceful
    goodbye (and a 2to3 conversion added?).

    @akheron
    Copy link
    Member

    akheron commented Nov 23, 2011

    bpo-13425 was marked as duplicate of this issue.

    @cataliniacob
    Copy link
    Mannequin

    cataliniacob mannequin commented Jul 7, 2012

    So, how to move this further?

    In bpo-13425 Petri proposes 4 alternatives, copying them here:

    1. Document the function to make it officially part of the public API
    2. Rename and move the function to http.server
    3. Leave it undocumented and just fix it
    4. Deprecate the function to be removed in 3.4 or 3.5 and "fix" it to always return [].

    I assume 4) meant:
    4) Deprecate the function to be removed in 3.4 or 3.5 and fix to do what its docstring specifies.

    My proposal is a more explicitly spelled out version 2):
    5) Remove the function, replace its usage in http.server.CGIHTTPRequestHandler and add a test for http.server.CGIHTTPRequestHandler that exercises the part that currently uses getallmatchingheaders since that's obviously broken now.

    The rationale for removal without deprecation is:

    • the function duplicates get_all so there's no reason to have it
    • it's probably not used by any (working) code because it just doesn't work

    Mike can you tell us how you found out about this breakage? Were you using the function? Did you use something else to workaround it since it's broken now?

    Senthil, Petri do you agree with option 5)? If so I can provide a patch.

    @akheron
    Copy link
    Member

    akheron commented Jul 7, 2012

    My 4) actually meant that it should always return []. This is what it currently does, so it could be spelled out clearly in the code.

    IIRC, getallmatchingheaders() cannot be emulated one-to-one using get_all(), because it handles continuation lines differently. That's why I thought removing or deprecating without fixing it would be the best.

    rfc822.Message.getallmatchingheaders() is documented in Python 2, so removing it could make it harder to port code from Python 2 to Python 3. On the other hand, it's broken, so having it removed could actually make things better by not introducing hard-to-find bugs.

    All in all, I'm not sure what's the best thing to do.

    @akheron
    Copy link
    Member

    akheron commented Jul 7, 2012

    The CGIHTTPRequestHandler fix and test would be the best thing to start with, though, as it's not related to the eventual fate of getallmatchingheaders().

    @vadmium
    Copy link
    Member

    vadmium commented Feb 12, 2015

    I agree with Catalin’s proposal to remove it straight away. It just causes confusion, and to me the intended behaviour is not equivalent to Message.get_all(). I would remove the entire HTTPMessage class and replace it as an alias of email.message.Message.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 15, 2015

    Posting a patch to remove the entire HTTPMessage class. This assumes my patch for bpo-5054 is accepted, which will remove the only existing reference to getallmatchingheaders().

    @vpython
    Copy link
    Mannequin

    vpython mannequin commented Mar 19, 2017

    It is certainly true that getallmatchingheaders is broken... because the data it is looking at has changed format.

    Here is a replacement that is as compatible as can be, based on the changed format.

            name = name.lower()
            n = len(name)
            lst = []
            for line, data in self.items():
                if line.lower() == name:
                    lst.append(line + ': ' + data)
            return lst

    The changed format has merged continuation lines, and separated keys and values into a list of duplet tuples. Iterators keys, values, and items exist, keys are not necessarily unique.

    @serhiy-storchaka
    Copy link
    Member

    @vadmium, do you mind to create a PR based on your patch? It needs a NEWS and What's New entries, and perhaps some updates in the stdlib documentation.

    @vadmium
    Copy link
    Member

    vadmium commented May 11, 2025

    Sorry Serhiy I’m not likely to create a pull request. I still think removing the class is the way forward, and I don’t mind if someone else makes a pull request though.

    @picnixz
    Copy link
    Member

    picnixz commented May 11, 2025

    Note that getallmatchingheaders could be removed because it was only used by the CGI handler which I removed in #133811. I kept it in my PR for legacy reasons but maybe we could deprecate it eagerly and remove it?

    @picnixz
    Copy link
    Member

    picnixz commented May 16, 2025

    cc @hugovk Maybe we can use this issue to schedule the deprecation and removal of this function?

    @hugovk
    Copy link
    Member

    hugovk commented May 16, 2025

    I think it'll be clearer to open a new one and close this.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants