Issue 37609: support "UNC" device paths in ntpath.splitdrive

Created on 2019-07-17 11:26 by eryksun, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
splitdrive.py eryksun,2020-10-27 17:15
Pull Requests
URL Status Linked Edit
PR 14841 open nsiregar,2019-07-18 14:37
PR 25261 closed steve.dower,2021-04-07 19:18
PR 31702 open barneygale,2022-03-06 07:09
Messages (9)
msg348055 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-07-17 11:26
Windows Python includes UNC shares such as "//server/spam" in its definition of a drive. This is natural because Windows supports setting a UNC path as the working directory and handles the share component as the working drive when resolving rooted paths such as "/eggs". For the sake of generality when working with \\?\ extended paths, Python should expand its definition of a UNC drive to include "UNC" device paths.

A practical example is calling glob.glob with a "//?/UNC" device path.

    >>> import os, sys, glob
    >>> sys.addaudithook(lambda s,a: print('#', a[0]) if s == 'glob.glob' else None)

regular UNC path:

    >>> glob.glob('//localhost/C$/Sys*')
    # //localhost/C$/Sys*
    ['//localhost/C$/System Volume Information']

"UNC" device path:

    >>> glob.glob('//?/UNC/localhost/C$/Sys*')
    # //?/UNC/localhost/C$/Sys*
    # //?/UNC/localhost/C$
    # //?/UNC/localhost
    # //?/UNC/
    []

Since the magic character "?" is in the path (IMO, the drive should be excluded from this check, but that's a separate issue), the internal function glob._iglob calls itself recursively until it reaches the base case of dirname == pathname, where dirname is from os.path.split(pathname). The problem here is that ntpath.split doesn't stop at the proper base case of "//?/UNC/localhost/C$". This is due to ntpath.splitdrive. For example:

    >>> os.path.splitdrive('//?/UNC/localhost/C$/Sys*')
    ('//?/UNC', '/localhost/C$/Sys*')
    >>> os.path.splitdrive('//./UNC/localhost/C$/Sys*')
    ('//./UNC', '/localhost/C$/Sys*')

The results should be "//?/UNC/localhost/C$" and "//./UNC/localhost/C$".

In other cases, returning a device as the drive is fine, if not exactly meaningful (e.g. "//./NUL"). I don't think this needs to change.
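The recursion described above can be reproduced without glob. The following is a minimal sketch, not code from this issue: split_chain is a hypothetical helper that applies ntpath.split repeatedly until the head stops changing, which is the same base case glob._iglob relies on. The exact chain depends on how the running interpreter's ntpath.splitdrive treats "UNC" device paths, so no output is shown.

```python
import ntpath

def split_chain(path):
    """Return the successive heads produced by ntpath.split.

    The chain ends at the first path whose head equals itself, i.e. the
    dirname == pathname base case mentioned above. (Hypothetical helper,
    for illustration only.)
    """
    chain = [path]
    while True:
        head, _tail = ntpath.split(chain[-1])
        if head == chain[-1]:
            return chain
        chain.append(head)

if __name__ == '__main__':
    # ntpath is importable on every platform, so this also runs off Windows.
    for step in split_chain('//?/UNC/localhost/C$/Sys*'):
        print(step)
```

With the splitdrive behavior reported in this message, the chain walks past the share and ends at "//?/UNC/". A splitdrive that treats "//?/UNC/localhost/C$" as the drive would stop at the share root instead.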
msg348099 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-18 06:26
Do you want to create a PR Eryk?
msg348206 - (view) Author: Ngalim Siregar (nsiregar) * Date: 2019-07-20 01:23
I was unsure about the implementation in the patch. Do you have a UNC format specification?
msg352110 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-09-12 11:12
For clarity, given Eryk's examples above, both "\\?\UNC\" and "//?/UNC/" are okay (as is any combination of forward and backslashes in the prefix, as normalization will be applied for any except the "\\?\" version). "UNC" is also case-insensitive.
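As a rough illustration of the matching rule above, the sketch below recognizes the "UNC" device prefix with either separator and in any letter case. The function name is hypothetical, and accepting the "//./UNC/" form as well is an assumption taken from Eryk's examples rather than from this message.

```python
def is_unc_device_prefix(path):
    """Return True if path starts with a "UNC" device prefix.

    Hypothetical helper: separators are unified and the comparison is
    case-insensitive, so "//?/UNC/", "\\\\?\\unc\\" and mixed forms match.
    """
    normalized = path.replace('/', '\\').upper()
    return normalized.startswith(('\\\\?\\UNC\\', '\\\\.\\UNC\\'))

if __name__ == '__main__':
    for candidate in ('//?/UNC/localhost/C$', r'\\?\unc\server\share',
                      '//server/share'):
        print(candidate, is_unc_device_prefix(candidate))
```

This only covers recognition on the Python side. It says nothing about which forms Windows itself normalizes.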
msg352355 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-09-13 16:05
Please consult the attached file "splitdrive.py". I redesigned splitdrive() to support "UNC" and "GLOBAL" junctions in device paths. I relaxed the design to allow repeated separators everywhere except for the UNC root. IIRC, Windows has supported this since XP. For example:

    >>> print(nt._getfullpathname('//server///share'))
    \\server\share
    >>> print(nt._getfullpathname(r'\\server\\\share'))
    \\server\share

There are also a couple of minor behavior changes in the new implementation.

The old implementation would split "//server/" as ('//server/', ''). Since there's no share, this should not count as a drive. The new implementation splits it as ('', '//server/'). Similarly it splits '//?/UNC/server/' as ('', '//?/UNC/server/').

The old implementation also allowed any character as a drive 'letter'. For example, it would split '/:/spam' as ('/:', '/spam'). The new implementation ensures that the drive letter in a DOS drive is alphabetic.

I also extended test_splitdrive to use a list of test cases in order to avoid having to define each case twice. It calls tester() a second time for each case, with slash and backslash swapped.
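The table-driven testing idea in the last paragraph can be sketched as follows. This is not the test code from the patch: CASES, swap_seps and check_splitdrive are hypothetical names, and the cases are limited to inputs whose results are not in dispute in this issue.

```python
import ntpath

# Each case is defined once, using forward slashes only.
CASES = [
    ('c:/dir', ('c:', '/dir')),
    ('//server/share/dir', ('//server/share', '/dir')),
    ('dir/file', ('', 'dir/file')),
]

def swap_seps(text):
    """Swap every slash with a backslash and vice versa."""
    return text.translate(str.maketrans('/\\', '\\/'))

def check_splitdrive(splitdrive, cases=CASES):
    """Run each case as written and again with separators swapped.

    Returns a list of (path, result, expected) tuples for failures.
    """
    failures = []
    for path, expected in cases:
        swapped = (swap_seps(path), tuple(swap_seps(part) for part in expected))
        for p, e in ((path, expected), swapped):
            result = splitdrive(p)
            if result != e:
                failures.append((p, result, e))
    return failures

if __name__ == '__main__':
    print(check_splitdrive(ntpath.splitdrive))
```

Passing the function under test as an argument lets the same case list be reused for an alternative implementation.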
msg379780 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-10-27 17:15
I'm attaching a rewrite of splitdrive() from . This version uses an internal _next() function to get the indices of the next path component, ignoring repeated separators. It also flattens the nested structure of the previous implementation by adding multiple return statements.
msg390375 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-04-06 20:58
Once is merged, I've got a fairly simple implementation for this using the (new) nt._path_splitroot native method, as well as improved tests that cover both the native and emulated calculations.
msg414604 - (view) Author: Barney Gale (barneygale) * Date: 2022-03-06 02:36
I'd like to pick this up, as it would allow us to remove a duplicate implementation in pathlib with its own shortcomings. If using native functionality is difficult to get right, could I put @eryksun's splitdrive.py implementation up for review?
msg414621 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-03-06 18:47
If you can build this on top of nt._path_splitroot then it could save a decent amount of work, though at the same time I think it's worthwhile having a pure Python implementation which is cross-platform. Haven't looked at the PR yet (or Eryk's implementation recently), but at the very least it would be nice to have tests that verify consistency with nt._path_splitroot. That way at least we'll discover if the native version changes.
History
Date User Action Args
2022-04-11 14:59:18 admin set github: 81790
2022-03-06 18:47:25 steve.dower set messages: +
2022-03-06 11:02:32 eryksun set messages: -
2022-03-06 07:09:23 barneygale set pull_requests: + pull_request29822
2022-03-06 02:36:38 barneygale set nosy: + barneygale; messages: +
2021-04-09 17:58:00 steve.dower set assignee: steve.dower ->
2021-04-07 19:18:46 steve.dower set pull_requests: + pull_request23997
2021-04-07 00:26:39 eryksun set messages: +
2021-04-06 20:58:54 steve.dower set assignee: steve.dower; messages: +; versions: + Python 3.10, - Python 3.8, Python 3.9
2021-03-28 02:09:36 eryksun link issue38948 dependencies
2021-02-25 15:56:06 eryksun set files: - splitdrive.py
2020-10-27 17:15:45 eryksun set files: + splitdrive.py; messages: +
2020-10-27 14:18:18 eryksun link issue42170 superseder
2019-09-13 16:05:15 eryksun set files: + splitdrive.py; messages: +
2019-09-12 11:12:24 steve.dower set messages: +
2019-07-20 01:23:47 nsiregar set nosy: + nsiregar; messages: +
2019-07-18 14:37:15 nsiregar set keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request14632
2019-07-18 06:26:06 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2019-07-17 11:26:34 eryksun create