EmailMessage.get_filename() not unquoting url encodings #117710

@vignesh-arivazhagan

Bug report

Bug description:

import requests
from email.message import EmailMessage

def get_filename_from_url(url, url_response=None):
    # Fetch the URL only if the caller did not supply a response.
    if url_response is None:
        url_response = requests.get(url)
        url_response.raise_for_status()

    content_disposition = url_response.headers.get("Content-Disposition")
    if content_disposition:
        # Let the email package parse the header's parameters.
        email_message = EmailMessage()
        email_message["Content-Disposition"] = content_disposition
        return email_message.get_filename()
    return None  # no Content-Disposition header


url = "https://www.gsi.gov.in/webcenter/ShowProperty;jsessionid=yv2xehEKwHR0ZHf64V2sMbrFzRSeSGCvcxDVr9F4_rbXBVtcgKbl!1598077039!1556223610?nodeId=%2FUCM%2FDCPORT1GSIGOVI063041%2F%2FidcPrimaryFile&revision=latestreleased"
print(get_filename_from_url(url))

Output:

annoncement_of%20computer%20application%20_rti_er%20_16122014.pdf

If I use

from urllib.parse import unquote
unquote(email_message.get_filename())

I get the unquoted output:

annoncement_of computer application _rti_er _16122014.pdf

Why does EmailMessage.get_filename() use a different unquote function?
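A possible explanation, offered as a sketch and not a confirmed diagnosis: as far as I understand the email package, percent-decoding is applied only to RFC 2231 extended parameters (written `filename*=`), while a plain `filename=` value is returned with only the surrounding quotes removed. The helper name `filename_from_disposition` and the two sample header values below are made up for illustration; the real header sent by the server above is not shown in this report.

```python
from email.message import EmailMessage
from urllib.parse import unquote


def filename_from_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value.

    Hypothetical helper for this illustration only.
    """
    msg = EmailMessage()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


# Plain parameter: "%20" is treated as ordinary text, not as an escape.
plain = filename_from_disposition('attachment; filename="a%20b.pdf"')

# RFC 2231 extended parameter (note the "*" and the charset prefix):
# here the email package percent-decodes the value.
extended = filename_from_disposition("attachment; filename*=UTF-8''a%20b.pdf")

print(repr(plain))
print(repr(extended))

# Workaround from this report: decode the plain value explicitly.
print(repr(unquote(plain)))
```

If this reading is right, calling unquote() on the result is a reasonable workaround when the server is known to percent-encode plain filenames, though it would also alter a filename that legitimately contains a literal sequence such as "%20".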

CPython versions tested on:

CPython main branch

Operating systems tested on:

Windows

Labels: type-bug (An unexpected behavior, bug, or error)
