gh-99029: Fix handling of PureWindowsPath('C:\<blah>').relative_to('C:') by barneygale · Pull Request #99031 · python/cpython

barneygale

@barneygale

…e_to('C:')`

relative_to() now treats naked drive paths as relative. This brings its behaviour in line with other parts of pathlib, and with ntpath.relpath(), and so allows us to factor out the pathlib-specific implementation.
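As a rough illustration of what "naked drive paths are relative" means elsewhere in pathlib, the sketch below uses only `PureWindowsPath`, which works on any platform. The variable names are made up for the example. The rooted case from the PR title (`'C:/foo'` against `'C:'`) is deliberately left out because its result depends on whether the running Python includes this change.

```python
from pathlib import PureWindowsPath

naked_drive = PureWindowsPath("C:")   # drive only, no root
drive_root = PureWindowsPath("C:/")   # drive plus root

# pathlib reports a naked drive as relative: it has a drive but no root.
print(naked_drive.is_absolute())  # False
print(drive_root.is_absolute())   # True

# A drive-relative path can be expressed relative to its naked drive.
drive_relative = PureWindowsPath("C:foo/bar").relative_to("C:")
print(drive_relative)             # foo\bar
```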

This was referenced

Nov 2, 2022

@domragusa

The implementation of os.path.relpath accesses the filesystem: it makes both paths absolute and then calculates the common path. Using that function to implement a PurePath method is wrong because the method should work without any file or directory being there (PurePath is defined as the "Base class for manipulating paths without I/O"). The use case it breaks is when the files don't exist on disk; for example, I use it to create the structure of a zip archive.

Removing tests is not a good idea.

Mixing relative and absolute paths should raise an exception as you stated and should be fixed.

@barneygale

I didn't remove any tests. Please share an example of where this breaks.

@barneygale

I told a lie - I actually did remove two assertions that supply strings to the method. Will fix!

@barneygale

@domragusa

I thought that os.path.relpath accessed the filesystem, but taking a better look at it I guess it doesn't.

I don't have a Windows machine, so I can't provide a concrete example, sorry. In any case, it calls os.path.abspath, and following the chain we arrive at the Windows API call GetFullPathNameW. That's where I expect PurePath to break: it should be platform independent, and this change would make it depend on the result of a WinAPI call.
Can you test it with names and characters that are reserved on Windows but allowed on POSIX systems?

I would have just tweaked the fail condition to make sure that we raise the exception for the bug you noticed.

@barneygale

👍 thanks, I'll try that out!

It's true that relpath() can call os.getcwd(). However, the working directory only contributes to the result if we pass one relative path and one absolute. And that's specifically disallowed in pathlib's relative_to(). In other cases, the two matching prefixes cancel each other out in commonpath().

The fact that relpath() needlessly calls abspath() when given two relative paths is regrettable, and probably worth fixing. But it shouldn't have an observable effect here. I'll try to find a counterexample.
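The cancellation argument can be sketched with a simplified model of a POSIX-style relpath() that takes the working directory as an explicit argument. `relpath_with_cwd` is a hypothetical helper written for this illustration, not the standard library implementation, and it only models the POSIX flavour.

```python
import posixpath


def relpath_with_cwd(path, start, cwd):
    """Simplified model of relpath() with an explicit working directory."""
    # Make both inputs absolute against the same cwd, as abspath() would.
    abs_path = posixpath.normpath(posixpath.join(cwd, path))
    abs_start = posixpath.normpath(posixpath.join(cwd, start))
    path_parts = [p for p in abs_path.split("/") if p]
    start_parts = [p for p in abs_start.split("/") if p]

    # Count the leading components shared by both paths.
    common = 0
    for a, b in zip(start_parts, path_parts):
        if a != b:
            break
        common += 1

    rel = [".."] * (len(start_parts) - common) + path_parts[common:]
    return "/".join(rel) if rel else "."


# Two relative inputs: the cwd prefix is shared, so it cancels out.
both_relative_a = relpath_with_cwd("a/b/c", "a", cwd="/home/user")
both_relative_b = relpath_with_cwd("a/b/c", "a", cwd="/somewhere/else")
print(both_relative_a, both_relative_b)  # b/c b/c

# One absolute and one relative input: the cwd leaks into the result.
mixed_a = relpath_with_cwd("/a/b", "a", cwd="/home/user")
mixed_b = relpath_with_cwd("/a/b", "a", cwd="/y")
print(mixed_a, mixed_b)                  # ../../../a/b ../../a/b
```

The mixed case is the one relative_to() rejects, which is why the working directory should not be observable through it under this model.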

@domragusa

It's not just getcwd. For example, "c:" is a valid filename on Linux: on Windows, abspath would give you the current directory on the C drive, while on Linux it's the file named "c:" in the current working directory, so the calculations that follow would be wrong. There are also reserved names on Windows, like COM or CON, and characters that are not allowed.
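A small sketch of the "c:" point, using only the pure string-parsing functions so it runs on any platform: the same text is a drive under Windows rules and an ordinary filename under POSIX rules. The variable names are chosen for the example only.

```python
import ntpath
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

name = "c:"

# Windows rules: the whole string is a drive, with nothing after it.
windows_split = ntpath.splitdrive(name)
print(windows_split)                 # ('c:', '')
print(PureWindowsPath(name).drive)   # c:

# POSIX rules: there are no drives, so "c:" is just a filename.
posix_split = posixpath.splitdrive(name)
print(posix_split)                   # ('', 'c:')
print(PurePosixPath(name).name)      # c:
```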

@barneygale

I don't think that matters - os.getcwd() can return something totally nonsensical and we wouldn't care, as long as it returns the same thing twice in a row, so that the shared prefix can be eliminated. Right?

@domragusa

You're referencing os.getcwd() because you're looking at the implementation of os.path.abspath in posixpath, but on Windows os.path resolves to ntpath, and in that module abspath calls _getfullpathname(), which calls the Windows API.

@barneygale

Again, I don't think it matters. If you think it does, could you please provide a reproduction case. Thank you.

@barneygale

I've logged #99199 to cover relpath() needlessly calling abspath() when the paths' anchors match.

@barneygale

Differences in handling of '..' parts still make these functions fundamentally incompatible.
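The thread does not spell out which '..' differences are meant, so the following is only one plausible illustration, assuming the POSIX flavours of both APIs: the os.path functions collapse and generate '..' segments, while pure paths keep '..' as an ordinary part and relative_to() (with default arguments) refuses to walk upwards.

```python
import posixpath
from pathlib import PurePosixPath

# normpath() collapses '..' lexically; pathlib keeps it as a path part.
collapsed = posixpath.normpath("a/../b")
kept_parts = PurePosixPath("a/../b").parts
print(collapsed)    # b
print(kept_parts)   # ('a', '..', 'b')

# relpath() happily produces '..' to climb out of the start directory...
climbed = posixpath.relpath("a", "a/b")
print(climbed)      # ..

# ...whereas relative_to() raises when the other path is not a prefix.
try:
    PurePosixPath("a").relative_to("a/b")
    raised = False
except ValueError:
    raised = True
print(raised)       # True
```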

@barneygale

Turns out there's another incompatibility related to leading ../ segments - see #99199 (comment)

I've revised the implementation to not use relpath() :)

@barneygale

brettcannon

I believe this is in line with the result of the discussion in the issue, so the only remaining change is to use f-strings!

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase `I have made the requested changes; please review again`. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@barneygale

@barneygale

I have made the requested changes; please review again.

@bedevere-bot

Thanks for making the requested changes!

@brettcannon: please review the changes made to this pull request.

brettcannon

@brettcannon

@barneygale

@lazka lazka mentioned this pull request

Nov 19, 2024