From 1d845e25ae19d5b317c2d29cc8c056454ca0af3e Mon Sep 17 00:00:00 2001
From: Christian Clauss
Date: Sat, 27 Nov 2021 21:26:45 +0100
Subject: [PATCH 1/5] docs: Improve example for urlparse

Signed-off-by: Christian Clauss
---
 Doc/library/urllib.parse.rst | 69 ++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index a060cc9ba7fdd9..473e444aadbf7b 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -49,16 +49,26 @@ or on combining URL components into a URL string.
    present.  For example:
 
       >>> from urllib.parse import urlparse
-      >>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
+      >>> urlparse("scheme://netloc/path;parameters?query#fragment")
+      ParseResult(scheme='scheme', netloc='netloc', path='/path;parameters', params='',
+                  query='query', fragment='fragment')
+      >>> o = urlparse("http://docs.python.org:80/3/library/urllib.parse.html?"
+      ...              "highlight=params#url-parsing")
       >>> o   # doctest: +NORMALIZE_WHITESPACE
-      ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
-                  params='', query='', fragment='')
+      ParseResult(scheme='http', netloc='docs.python.org:80',
+                  path='/3/library/urllib.parse.html', params='',
+                  query='highlight=params', fragment='url-parsing')
       >>> o.scheme
       'http'
+      >>> o.netloc
+      'docs.python.org:80'
+      >>> o.hostname
+      'docs.python.org'
       >>> o.port
       80
-      >>> o.geturl()
-      'http://www.cwi.nl:80/%7Eguido/Python.html'
+      >>> o.geturl()  # doctest: +NORMALIZE_WHITESPACE
+      'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#
+      url-parsing'
 
    Following the syntax specifications in :rfc:`1808`, urlparse recognizes
    a netloc only if it is properly introduced by '//'.  Otherwise the
@@ -92,31 +102,30 @@ or on combining URL components into a URL string.
    The return value is a :term:`named tuple`, which means that its items can
    be accessed by index or as named attributes, which are:
 
-   +------------------+-------+--------------------------+----------------------+
-   | Attribute        | Index | Value                    | Value if not present |
-   +==================+=======+==========================+======================+
-   | :attr:`scheme`   | 0     | URL scheme specifier     | *scheme* parameter   |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`netloc`   | 1     | Network location part    | empty string         |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`path`     | 2     | Hierarchical path        | empty string         |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`params`   | 3     | Parameters for last path | empty string         |
-   |                  |       | element                  |                      |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`query`    | 4     | Query component          | empty string         |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`fragment` | 5     | Fragment identifier      | empty string         |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`username` |       | User name                | :const:`None`        |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`password` |       | Password                 | :const:`None`        |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`hostname` |       | Host name (lower case)   | :const:`None`        |
-   +------------------+-------+--------------------------+----------------------+
-   | :attr:`port`     |       | Port number as integer,  | :const:`None`        |
-   |                  |       | if present               |                      |
-   +------------------+-------+--------------------------+----------------------+
+   +------------------+-------+-------------------------+------------------------+
+   | Attribute        | Index | Value                   | Value if not present   |
+   +==================+=======+=========================+========================+
+   | :attr:`scheme`   | 0     | URL scheme specifier    | *scheme* parameter     |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`netloc`   | 1     | Network location part   | empty string           |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`path`     | 2     | Hierarchical path       | empty string           |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`params`   | 3     | No longer used          | always an empty string |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`query`    | 4     | Query component         | empty string           |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`fragment` | 5     | Fragment identifier     | empty string           |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`username` |       | User name               | :const:`None`          |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`password` |       | Password                | :const:`None`          |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`hostname` |       | Host name (lower case)  | :const:`None`          |
+   +------------------+-------+-------------------------+------------------------+
+   | :attr:`port`     |       | Port number as integer, | :const:`None`          |
+   |                  |       | if present              |                        |
+   +------------------+-------+-------------------------+------------------------+
 
    Reading the :attr:`port` attribute will raise a :exc:`ValueError` if
    an invalid port is specified in the URL.  See section

From 4cb40c6b006c0c29799aa92c75ccca684b2ac4d2 Mon Sep 17 00:00:00 2001
From: Christian Clauss
Date: Sun, 28 Nov 2021 16:41:49 +0100
Subject: [PATCH 2/5] fixup! +NORMALIZE_WHITESPACE

---
 Doc/library/urllib.parse.rst | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index 473e444aadbf7b..58bd62eda145a1 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -48,13 +48,16 @@ or on combining URL components into a URL string.
    result, except for a leading slash in the *path* component, which is retained if
    present.  For example:
 
+   .. doctest::
+      :options: +NORMALIZE_WHITESPACE
+
       >>> from urllib.parse import urlparse
       >>> urlparse("scheme://netloc/path;parameters?query#fragment")
       ParseResult(scheme='scheme', netloc='netloc', path='/path;parameters', params='',
                   query='query', fragment='fragment')
       >>> o = urlparse("http://docs.python.org:80/3/library/urllib.parse.html?"
       ...              "highlight=params#url-parsing")
-      >>> o   # doctest: +NORMALIZE_WHITESPACE
+      >>> o
       ParseResult(scheme='http', netloc='docs.python.org:80',
                   path='/3/library/urllib.parse.html', params='',
                   query='highlight=params', fragment='url-parsing')
@@ -66,7 +69,7 @@ or on combining URL components into a URL string.
       'docs.python.org'
       >>> o.port
       80
-      >>> o.geturl()  # doctest: +NORMALIZE_WHITESPACE
+      >>> o.geturl()
       'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#
       url-parsing'
 

From 44c80b2fe89f422760dac3eda5044ccf7d694611 Mon Sep 17 00:00:00 2001
From: Christian Clauss
Date: Sun, 28 Nov 2021 16:59:57 +0100
Subject: [PATCH 3/5] fixup! +NORMALIZE_WHITESPACE

---
 Doc/library/urllib.parse.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index 58bd62eda145a1..2a85942a2d3972 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -71,7 +71,7 @@ or on combining URL components into a URL string.
       80
       >>> o.geturl()
       'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#
-      url-parsing'
+       url-parsing'
 
    Following the syntax specifications in :rfc:`1808`, urlparse recognizes
    a netloc only if it is properly introduced by '//'.  Otherwise the

From b57c554faf4a2168fed78c602540e3a03d84c0e1 Mon Sep 17 00:00:00 2001
From: Christian Clauss
Date: Sun, 28 Nov 2021 20:09:35 +0100
Subject: [PATCH 4/5] fixup! Normalize whitespace

---
 Doc/library/urllib.parse.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index 2a85942a2d3972..dc2c0ccf0e2a82 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -70,8 +70,8 @@ or on combining URL components into a URL string.
       >>> o.port
       80
       >>> o.geturl()
-      'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#
-       url-parsing'
+      ('http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#'
+       'url-parsing')
 
    Following the syntax specifications in :rfc:`1808`, urlparse recognizes
    a netloc only if it is properly introduced by '//'.  Otherwise the

From 3f168a8a336bb31be34dfa257bb126a73a406c26 Mon Sep 17 00:00:00 2001
From: Christian Clauss
Date: Sun, 28 Nov 2021 20:44:38 +0100
Subject: [PATCH 5/5] fixup! Shorten the URL to work in doctest

---
 Doc/library/urllib.parse.rst | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst
index dc2c0ccf0e2a82..1478b34bc95514 100644
--- a/Doc/library/urllib.parse.rst
+++ b/Doc/library/urllib.parse.rst
@@ -69,9 +69,8 @@ or on combining URL components into a URL string.
       'docs.python.org'
       >>> o.port
       80
-      >>> o.geturl()
-      ('http://docs.python.org:80/3/library/urllib.parse.html?highlight=params#'
-       'url-parsing')
+      >>> o._replace(fragment="").geturl()
+      'http://docs.python.org:80/3/library/urllib.parse.html?highlight=params'
 
    Following the syntax specifications in :rfc:`1808`, urlparse recognizes
    a netloc only if it is properly introduced by '//'.  Otherwise the
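
A note on the *params* row changed in PATCH 1/5: the first doctest leaves ``params`` empty because ``scheme`` is not one of the schemes for which ``urlparse`` splits parameters out of the path. For ``http`` URLs the split still happens, so "always an empty string" may be too strong. The sketch below is a minimal illustration using only the standard library; the names ``generic``, ``http`` and ``shortened`` are illustrative and not part of the patch. Its last lines repeat the ``_replace(fragment="")`` approach from PATCH 5/5.

```python
from urllib.parse import urlparse

# 'scheme' is not a scheme for which urlparse splits parameters,
# so the ';parameters' text stays inside the path.
generic = urlparse("scheme://netloc/path;parameters?query#fragment")
print(generic.path, repr(generic.params))  # /path;parameters ''

# For 'http', parameters of the last path segment are split out.
http = urlparse("http://netloc/path;parameters?query#fragment")
print(http.path, repr(http.params))  # /path 'parameters'

# ParseResult is a named tuple, so _replace() returns a modified copy.
# Dropping the fragment keeps the reassembled URL short enough for one line.
o = urlparse(
    "http://docs.python.org:80/3/library/urllib.parse.html?"
    "highlight=params#url-parsing"
)
shortened = o._replace(fragment="").geturl()
print(shortened)
```

Using ``_replace`` avoids relying on whitespace normalization to wrap a long expected string, which is the problem the fixup commits work around.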