pathlib paths .normalize() #83105

Closed
iciocirlan mannequin opened this issue Nov 26, 2019 · 11 comments
Labels
3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@iciocirlan
Mannequin

iciocirlan mannequin commented Nov 26, 2019

BPO 38924
Nosy @brettcannon, @pitrou, @serhiy-storchaka, @vedgar, @tirkarthi

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2019-12-02.17:47:39.857>
created_at = <Date 2019-11-26.22:23:45.876>
labels = ['type-feature', 'library', '3.9']
title = 'pathlib paths .normalize()'
updated_at = <Date 2019-12-03.18:37:12.801>
user = 'https://bugs.python.org/iciocirlan'

bugs.python.org fields:

activity = <Date 2019-12-03.18:37:12.801>
actor = 'brett.cannon'
assignee = 'none'
closed = True
closed_date = <Date 2019-12-02.17:47:39.857>
closer = 'brett.cannon'
components = ['Library (Lib)']
creation = <Date 2019-11-26.22:23:45.876>
creator = 'iciocirlan'
dependencies = []
files = []
hgrepos = []
issue_num = 38924
keywords = []
message_count = 11.0
messages = ['357534', '357557', '357575', '357633', '357634', '357635', '357650', '357653', '357721', '357724', '357758']
nosy_count = 6.0
nosy_names = ['brett.cannon', 'pitrou', 'serhiy.storchaka', 'veky', 'xtreak', 'iciocirlan']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue38924'
versions = ['Python 3.9']

@iciocirlan
Mannequin Author

iciocirlan mannequin commented Nov 26, 2019

pathlib paths should expose a .normalize() method. This is highly useful, especially in web-related scenarios.

On PurePath its usefulness is obvious, but it's debatable for Path, as it would yield different results from .resolve() in case of symlinks (which resolves them before normalizing).

@iciocirlan iciocirlan mannequin added the labels 3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement) on Nov 26, 2019
@tirkarthi
Member

Can you please add an example of how normalize() should behave? I assume you want the same behaviour as os.path.normpath, which already accepts a path-like object, to be added to pathlib.

@brettcannon
Member

Do note that Path inherits from PurePath, so providing a normalize() method on the latter means it will end up on the former.
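The inheritance relationship mentioned here can be checked directly. This is only a small illustration using existing pathlib classes; nothing in it is part of the proposal.

```python
from pathlib import Path, PurePath

# Path is a subclass of PurePath, so any method defined on PurePath
# (including a hypothetical normalize()) would also be visible on Path.
path_is_pure = issubclass(Path, PurePath)

# joinpath() is one existing method that Path exposes through PurePath's API.
path_has_joinpath = hasattr(Path, "joinpath")

print(path_is_pure, path_has_joinpath)
```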

@iciocirlan
Mannequin Author

iciocirlan mannequin commented Nov 29, 2019

> Can you please add an example of how normalize() should behave?

>>> from pathlib import PurePosixPath
>>> mypath = PurePosixPath("foo/bar/bzz")
>>> mypath /= "../../"
>>> mypath
PurePosixPath('foo/bar/bzz/../..')
>>> mypath = mypath.normalize()
>>> mypath
PurePosixPath('foo')
>>> mypath /= "../../../there"
>>> mypath
PurePosixPath('foo/../../../there')
>>> mypath = mypath.normalize()
>>> mypath
PurePosixPath('../../there')
>>> mypath /= "../../and/back/again"
>>> mypath
PurePosixPath('../../there/../../and/back/again')
>>> mypath = mypath.normalize()
>>> mypath
PurePosixPath('../../../and/back/again')

> I assume you want the same behaviour as os.path.normpath, which already accepts a path-like object, to be added to pathlib.

Yes, exactly the same behaviour, but arguing that normpath() can take a pathlib object is just saying that it saves you from doing an intermediate str(), which is, well, nice, but still not pretty. Consider mypath = mypath.normalize() vs. mypath = PurePosixPath(normpath(mypath)).
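For comparison, the session above can be reproduced today with the wrapper spelling. This is a minimal sketch: `lexical_normalize` is a hypothetical helper name, and `posixpath.normpath` is used instead of `os.path.normpath` so that the result does not depend on the host platform.

```python
import posixpath
from pathlib import PurePosixPath


def lexical_normalize(path: PurePosixPath) -> PurePosixPath:
    """Hypothetical helper: collapse '.' and '..' purely lexically."""
    # posixpath.normpath accepts path-like objects and returns a str,
    # so the result has to be wrapped again to get a path object back.
    return PurePosixPath(posixpath.normpath(path))


step1 = lexical_normalize(PurePosixPath("foo/bar/bzz") / "../../")
step2 = lexical_normalize(step1 / "../../../there")
step3 = lexical_normalize(step2 / "../../and/back/again")

print(step1)  # foo
print(step2)  # ../../there
print(step3)  # ../../../and/back/again
```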

> Do note that Path inherits from PurePath, so providing a normalize() method on the latter means it will end up on the former.

That could be "circumvented" with a bit of code shuffling, e.g. moving everything from PurePath to a PathBase or _Path or somesuch, and forking the inheritance from there. On the other hand, it might be useful. I personally can't think of a scenario, but the GNU folk certainly think so, see realpath --logical: https://www.gnu.org/software/coreutils/manual/html_node/realpath-invocation.html

@tirkarthi
Member

> Yes, exactly the same behaviour, but arguing that normpath() can take a pathlib object is just saying that it saves you from doing an intermediate str(), which is, well, nice, but still not pretty. Consider mypath = mypath.normalize() vs. mypath = PurePosixPath(normpath(mypath)).

From my past experience, the intention has been to keep the API minimal; some recent additions are listed below. Many discussions have concluded that it is preferable to use a function that already accepts a path-like object (or, if it does not, to add that support) rather than add the API to pathlib itself. I will leave this to the experts.

readlink : bpo-30618
link_to : bpo-26978

@vedgar
Mannequin

vedgar mannequin commented Nov 29, 2019

I think the real issue here

mypath = PurePosixPath(normpath(mypath))

is the PurePosixPath wrapper. It is nice that normpath _accepts_ pathlike objects, but it should then not return a generic str. It should try to return an object of the same type.

Of course it's harder to do, especially in presence of pathlike objects of unknown classes, but with some reasonable assumptions on the constructors, it can be done---and it's much more useful. The similar debate, with similar conclusions, has already happened with datetime-like objects.
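A rough sketch of what "return an object of the same type" could look like, under simplifying assumptions: only `PurePath` subclasses are re-wrapped, their constructors are assumed to accept a single string, and the path flavour is assumed to match the host OS. `normpath_same_type` is a hypothetical name, not an existing function.

```python
import os
from pathlib import PurePath


def normpath_same_type(path):
    """Hypothetical wrapper: normalize lexically, keep pathlib types."""
    normalized = os.path.normpath(path)  # returns str (or bytes for bytes input)
    if isinstance(path, PurePath):
        # Assumption: the class can be rebuilt from a single string argument.
        return type(path)(normalized)
    # Plain str/bytes input is returned as os.path.normpath produced it.
    return normalized


result = normpath_same_type(PurePath("foo/bar/../baz"))
print(type(result).__name__, result)
```

Path-like objects of unknown classes fall through to the plain string result here, which is exactly the hard case the comment points out.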

@brettcannon
Member

> From my experience in the past the intention has been to keep the API minimal

Correct, not all of os.path and shutil needs to end up in pathlib. :) That is why requests to add things are always tough: we have to try to strike a balance between useful and overwhelming/overdone (and what is "useful" varies from person to person).

> It is nice that normpath _accepts_ pathlike objects, but it should then not return a generic str. It should try to return an object of the same type.

It's an interesting idea, but it's also difficult to get right, even with assumptions, because things that represent a path are nowhere near as unified as dates. There would also be a ton of converting back and forth in os.path as functions call other functions to get the path, manipulate it, and then wrap it back up.

But if someone can come up with a concrete proposal with some example implementation and brings it to python-ideas it could be discussed.

@serhiy-storchaka
Member

There were reasons why something like PurePath.normalize() was not added in the first place. os.path.normpath() is broken by design. It does not work as you expect when the .. component is preceded by a symlink. Its behavior can lead to bugs and maybe even security issues. We did not want to add something so dubious to the pathlib module. Path.resolve() is the correct way.
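The symlink problem can be reproduced in a temporary directory. This is a sketch that assumes a POSIX-like system where unprivileged symlink creation works; `demo_symlink_divergence` and the directory names are made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def demo_symlink_divergence():
    """Compare lexical normalization with Path.resolve() behind a symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()  # resolve first in case tmp itself is a symlink
        (root / "a" / "b").mkdir(parents=True)
        (root / "link").symlink_to(root / "a" / "b")

        tricky = root / "link" / ".."
        # Lexical: 'link/..' simply cancels out, giving root.
        lexical = Path(os.path.normpath(tricky))
        # Filesystem-aware: 'link' is followed to a/b first, so '..' is a.
        actual = tricky.resolve()
        return lexical == root, actual == root / "a"


print(demo_symlink_divergence())
```

The two results name different directories, which is the divergence described above.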

So I suggest closing this issue.

@brettcannon
Member

I'm going with Serhiy's recommendation and closing this. Sorry, Ionuț.

@iciocirlan
Mannequin Author

iciocirlan mannequin commented Dec 2, 2019

Brett and Serhiy, you do realise there are no symlinks to resolve on PurePaths, right?

> os.path.normpath() is broken by design.

Why don't you deprecate it then? Sounds like the reasonable thing to do, no? So many innocent souls endangered by this evil function...

It's broken by design if you use it to shoot yourself in the foot. If you want however to normalize an abstract path, an absolutely reasonable thing to do, it does the right and very useful thing. Because, well, the filesystem isn't the only thing that has paths and other things don't have symlinks. Also, this lib is called pathlib, not fspathlib, *and* someone had the foresight of separating filesystem paths from abstract paths. Quite a strange series of coincidences, no?

Let me quote the initial comment for this issue, which apparently no one read:

> On PurePath its usefulness is obvious, but it's debatable for Path, as it would yield different results from .resolve() in case of symlinks (which resolves them before normalizing).

@brettcannon
Member

While I understand you're disappointed, do realize that the tone of your response isn't necessary. I'm going to assume you didn't mean for it to come off as confrontational and still provide a reply.

> you do realise there are no symlinks to resolve on PurePaths, right?

Yes.

> Why don't you deprecate it then?

Because the amount of code that would break for those who are willing to deal with its drawbacks is way too vast. But just because we keep that function around even with its drawbacks doesn't mean we want to propagate that in newer code.

> Let me quote the initial comment for this issue, which apparently no one read

We read it, but as I said in response, "Path inherits from PurePath, so providing a normalize() method on the latter means it will end up on the former". Now I know you suggested putting in code to somehow hide it from Path, but we try to avoid being so magical in the stdlib, especially when it would require some careful explanation in the docs that for some reason a method on an inherited class wasn't available.

Please note you can also use your own subclass or function to get the functionality you are after. There is nothing special to what you are asking for that requires inclusion in the stdlib.
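One possible reading of the subclass suggestion, as a sketch only: `NormalizablePosixPath` and its `normalize()` method are hypothetical names, and the behaviour is simply delegated to `posixpath.normpath`, so the symlink caveats discussed in this thread apply if the result is later used against a real filesystem.

```python
import posixpath
from pathlib import PurePosixPath


class NormalizablePosixPath(PurePosixPath):
    """Hypothetical pure-path subclass with a lexical normalize() method."""

    def normalize(self):
        # Collapse '.' and '..' lexically and rebuild an instance of
        # the same class, so chained operations keep the subclass.
        return type(self)(posixpath.normpath(self))


p = NormalizablePosixPath("foo/bar/bzz") / "../.."
print(p.normalize())  # foo
```

Because the `/` operator returns an instance of the subclass, `normalize()` stays available after joining segments.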

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022