msg348055 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2019-07-17 11:26 |
Windows Python includes UNC shares such as "//server/spam" in its definition of a drive. This is natural because Windows supports setting a UNC path as the working directory and handles the share component as the working drive when resolving rooted paths such as "/eggs". For the sake of generality when working with \\?\ extended paths, Python should expand its definition of a UNC drive to include "UNC" device paths.

A practical example is calling glob.glob with a "//?/UNC" device path.

>>> import os, sys, glob
>>> sys.addaudithook(lambda s,a: print('#', a[0]) if s == 'glob.glob' else None)

regular UNC path:

>>> glob.glob('//localhost/C$/Sys*')
# //localhost/C$/Sys*
['//localhost/C$/System Volume Information']

"UNC" device path:

>>> glob.glob('//?/UNC/localhost/C$/Sys*')
# //?/UNC/localhost/C$/Sys*
# //?/UNC/localhost/C$
# //?/UNC/localhost
# //?/UNC/
[]

Since the magic character "?" is in the path (IMO, the drive should be excluded from this check, but that's a separate issue), the internal function glob._iglob calls itself recursively until it reaches the base case of dirname == pathname, where dirname is from os.path.split(pathname). The problem here is that ntpath.split doesn't stop at the proper base case of "//?/UNC/localhost/C$". This is due to ntpath.splitdrive. For example:

>>> os.path.splitdrive('//?/UNC/localhost/C$/Sys*')
('//?/UNC', '/localhost/C$/Sys*')
>>> os.path.splitdrive('//./UNC/localhost/C$/Sys*')
('//./UNC', '/localhost/C$/Sys*')

The results should be "//?/UNC/localhost/C$" and "//./UNC/localhost/C$". In other cases, returning a device as the drive is fine, if not exactly meaningful (e.g. "//./NUL"). I don't think this needs to change. |
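To make the expected result concrete, here is a minimal sketch of the split described above. The helper name split_unc_device_drive is hypothetical and is not part of ntpath. It only handles "UNC" device paths, does not tolerate repeated separators, and returns an empty drive for everything else (including devices such as "//./NUL"), so it is an illustration of the desired drive boundary rather than a replacement for splitdrive.

```python
def split_unc_device_drive(path):
    """Hypothetical helper: split a "UNC" device path into (drive, rest).

    The drive is meant to cover the prefix, server and share, e.g.
    '//?/UNC/localhost/C$'. Any other path yields ('', path).
    """
    # Work on a copy with forward slashes so both separators are accepted.
    norm = path.replace('\\', '/')
    # The prefix is '//?/UNC/' or '//./UNC/'; "UNC" is case-insensitive.
    if norm[:4] not in ('//?/', '//./') or norm[4:8].upper() != 'UNC/':
        return '', path
    server_end = norm.find('/', 8)
    if server_end in (-1, 8):
        # No separator after the server, or an empty server component.
        return '', path
    share_end = norm.find('/', server_end + 1)
    if share_end == -1:
        share_end = len(norm)
    if share_end == server_end + 1:
        # Empty share component: not treated as a drive in this sketch.
        return '', path
    # Slice the original string so the caller's separators are preserved.
    return path[:share_end], path[share_end:]
```

With this sketch, '//?/UNC/localhost/C$/Sys*' splits at the share, which is the base case that the recursive dirname walk needs to reach.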
|
|
msg348099 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-07-18 06:26 |
Do you want to create a PR, Eryk? |
|
|
msg348206 - (view) |
Author: Ngalim Siregar (nsiregar) * |
Date: 2019-07-20 01:23 |
I was unsure about the implementation in the patch. Do you have a UNC format specification? |
|
|
msg352110 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2019-09-12 11:12 |
For clarity, given Eryk's examples above, both "\\?\UNC\" and "//?/UNC/" are okay (as are any combination of forward and backslashes in the prefix, as normalization will be applied for any except the "\\?\" version). "UNC" is also case-insensitive. |
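As a rough illustration of the prefix rules above, the following sketch accepts any mix of forward slashes and backslashes and compares "UNC" case-insensitively. The function name has_unc_device_prefix is hypothetical, and accepting the "//./" form alongside "//?/" is an assumption based on the earlier splitdrive examples in this thread.

```python
def has_unc_device_prefix(path):
    """Hypothetical check for a '//?/UNC/' or '//./UNC/' style prefix."""
    # Only the first eight characters matter; normalise their separators.
    prefix = path[:8].replace('\\', '/')
    return prefix[:4] in ('//?/', '//./') and prefix[4:].upper() == 'UNC/'
```

This only classifies the prefix. It says nothing about whether Windows itself would normalise the rest of the path.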
|
|
msg352355 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2019-09-13 16:05 |
Please consult the attached file "splitdrive.py". I redesigned splitdrive() to support "UNC" and "GLOBAL" junctions in device paths. I relaxed the design to allow repeated separators everywhere except for the UNC root. IIRC, Windows has supported this since XP. For example:

>>> print(nt._getfullpathname('//server///share'))
\\server\share
>>> print(nt._getfullpathname(r'\\server\\\share'))
\\server\share

There are also a couple of minor behavior changes in the new implementation.

The old implementation would split "//server/" as ('//server/', ''). Since there's no share, this should not count as a drive. The new implementation splits it as ('', '//server/'). Similarly it splits '//?/UNC/server/' as ('', '//?/UNC/server/').

The old implementation also allowed any character as a drive 'letter'. For example, it would split '/:/spam' as ('/:', '/spam'). The new implementation ensures that the drive letter in a DOS drive is alphabetic.

I also extended test_splitdrive to use a list of test cases in order to avoid having to define each case twice. It calls tester() a second time for each case, with slash and backslash swapped. |
|
|
msg379780 - (view) |
Author: Eryk Sun (eryksun) *  |
Date: 2020-10-27 17:15 |
I'm attaching a rewrite of splitdrive() from . This version uses an internal _next() function to get the indices of the next path component, ignoring repeated separators. It also flattens the nested structure of the previous implementation by adding multiple return statements. |
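For readers without the attachment, a component scanner of the kind described might look like the sketch below. It is an assumption about the general shape only: the name next_component, its parameters and its return convention are hypothetical and are not taken from the attached file.

```python
def next_component(path, start=0, seps='\\/'):
    """Return (begin, end) indices of the next path component.

    Scanning starts at `start` and skips any run of separators first,
    so repeated separators are ignored. When no component remains,
    begin == end == len(path).
    """
    n = len(path)
    begin = start
    # Skip one or more leading separators.
    while begin < n and path[begin] in seps:
        begin += 1
    end = begin
    # Consume characters up to the next separator or the end of the string.
    while end < n and path[end] not in seps:
        end += 1
    return begin, end
```

Calling it repeatedly with the previous end index walks the components one at a time, which is what allows early returns instead of nested conditionals.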
|
|
msg390375 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2021-04-06 20:58 |
Once is merged, I've got a fairly simple implementation for this using the (new) nt._path_splitroot native method, as well as improved tests that cover both the native and emulated calculations. |
|
|
msg414604 - (view) |
Author: Barney Gale (barneygale) * |
Date: 2022-03-06 02:36 |
I'd like to pick this up, as it would allow us to remove a duplicate implementation in pathlib with its own shortcomings. If using native functionality is difficult to get right, could I put @eryksun's splitdrive.py implementation up for review? |
|
|
msg414621 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2022-03-06 18:47 |
If you can build this on top of nt._path_splitroot then it could save a decent amount of work, though at the same time I think it's worthwhile having a pure Python implementation which is cross-platform. Haven't looked at the PR yet (or Eryk's implementation recently), but at the very least it would be nice to have tests that verify consistency with nt._path_splitroot. That way at least we'll discover if the native version changes. |
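A consistency test of that kind could be built around a small comparison helper like the sketch below. Everything here is hypothetical: find_mismatches and the two stand-in split functions are for illustration only, and neither stand-in is the native or the ntpath implementation. On Windows the native function would be passed as the reference, assuming its result can be compared directly with the pure Python one, which this thread does not specify.

```python
def find_mismatches(reference, candidate, paths):
    """Return (path, reference_result, candidate_result) for each difference."""
    mismatches = []
    for path in paths:
        expected = reference(path)
        actual = candidate(path)
        if expected != actual:
            mismatches.append((path, expected, actual))
    return mismatches


# Stand-in "reference": accepts any character before ':' as a drive letter.
def reference_split(path):
    return (path[:2], path[2:]) if path[1:2] == ':' else ('', path)


# Stand-in "candidate": deliberately stricter, requires an alphabetic letter.
def candidate_split(path):
    if path[1:2] == ':' and path[0].isalpha():
        return path[:2], path[2:]
    return '', path
```

Reporting the differing inputs, rather than stopping at the first failure, makes it easier to see whether a change in one implementation is a regression or an intended divergence.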
|
|