bpo-36305: Fixes to path handling and parsing in pathlib by kmaork · Pull Request #12361 · python/cpython
The bugs
This PR fixes three bugs with path parsing in `pathlib`.
The following examples show these bugs (when the cwd is `C:\d`):
- `WindowsPath('C:a').absolute()` should return `WindowsPath('C:\\d\\a')` but returns `WindowsPath('C:a')`.
  This is caused by flawed logic in the `parse_parts` method of the `_Flavour` class.
- `WindowsPath('./b:a').absolute()` should return `WindowsPath('C:\\d\\b:a')` but returns `WindowsPath('b:a')`.
  This is caused by the limited interface of `parse_parts`, and affects the `Path.absolute`, `Path.expanduser` and `Path.__rtruediv__` methods.
- `WindowsPath('./b:a').resolve()` should return `WindowsPath('C:\\d\\b:a')` but returns `WindowsPath('b:a')`.
  This is caused by missing logic in the `resolve` method and in `Path.__str__`.
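The expected result for the first bug follows from how drive-relative paths combine with a directory on the same drive. The sketch below is only an illustration, not code from this PR: it uses `PureWindowsPath` and `ntpath` so that it runs on any platform, and it hardcodes the example cwd `C:\d` instead of reading the real working directory.

```python
import ntpath
from pathlib import PureWindowsPath

# Assumption: the working directory from the example above, hardcoded
# so the sketch does not depend on the machine it runs on.
cwd = PureWindowsPath('C:\\d')

# 'C:a' names a drive but has no root, so it is drive-relative.
drive_relative = PureWindowsPath('C:a')
print(drive_relative.drive, repr(drive_relative.root), drive_relative.is_absolute())

# Joining onto a directory on the same drive keeps that directory,
# which is the result absolute() is expected to give for 'C:a'.
expected = cwd / drive_relative
print(expected)

# ntpath applies the same rule to plain strings.
joined = ntpath.join('C:\\d', 'C:a')
print(joined)
```

Both the pure-path join and `ntpath.join` treat `C:a` as relative to the current directory of drive `C:`, which is why returning `WindowsPath('C:a')` unchanged from `absolute()` is a bug.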
The fixes
- To fix the first bug, I fixed a flaw in the `parse_parts` method.
- The second one was more complicated, as with the current interface of `parse_parts` (called by `_parse_args`), the bug can't be fixed. Let's take a simple example: `WindowsPath(WindowsPath('./a:b'))` and `WindowsPath('a:b')`. Before the bugfix, they are equal. That happens because in both cases, `parse_parts` is called with `['a:b']`. This part can be interpreted in two ways: either as the relative path `'b'` with the drive `'a:'`, or as a file `'a'` with the NTFS data-stream `'b'`.
  That means that in some cases, passing the flattened `_parts` of a path to `parse_parts` is lossy. Therefore we have to modify `parse_parts`'s interface to enable passing more detailed information about the given parts. What I decided to do was allow passing tuples in addition to strings, thus supporting the old interface. The tuples would contain the drive, root and path parts, which is enough information to determine the correct parsing of path parts with data-streams, and possibly of other edge cases in the future.
  After modifying `parse_parts`'s interface, I changed `_parse_args` to use it and made `Path.absolute`, `Path.expanduser` and `Path.__rtruediv__` pass `Path` objects to `_parse_args` instead of path parts, to preserve the path information.
- To solve the third bug I had to make small changes to both the `resolve` method and to `Path.__str__`.
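The ambiguity behind the second bug, and the idea of accepting tuples alongside strings, can be sketched as follows. This is a simplified illustration and not the PR's actual implementation: `parse_part` is a hypothetical helper, and the `(drive, root, tail)` tuple shape is an assumption that uses a single string where the PR passes the path parts.

```python
import ntpath

def parse_part(part):
    """Hypothetical helper: return (drive, root, tail) for one path part.

    A str is parsed from scratch. A (drive, root, tail) tuple is treated
    as already parsed, so a tail such as 'a:b' is not split again.
    """
    if isinstance(part, tuple):
        drive, root, tail = part
        return drive, root, tail
    drive, rest = ntpath.splitdrive(part)
    root = '\\' if rest[:1] in ('\\', '/') else ''
    return drive, root, rest.lstrip('\\/')

# Flattened string: 'a:b' is read as drive 'a:' plus the relative path 'b'.
from_string = parse_part('a:b')

# Pre-parsed tuple: no drive, no root, a file 'a' with the data stream 'b'.
from_tuple = parse_part(('', '', 'a:b'))

print(from_string)
print(from_tuple)
```

The string form loses the fact that the original path had no drive, while the tuple form carries it through. Accepting both keeps existing string callers working, which matches the backward-compatibility goal described above.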
Notes
In addition to the changes in the code, I've added regression tests and modified old incorrect tests.
Details about drive-relative paths can be found here.