msg129616 - (view) |
Author: blokeley (blokeley) |
Date: 2011-02-27 08:26 |
It is a common need to find the grandparent or great-grandparent (etc.) of a given directory, which results in this:

>>> from os.path import dirname
>>> mydir = dirname(dirname(dirname(path)))

Could a "height" parameter be added to os.path.dirname so it becomes:

>>> def dirname(path, height=1):

Then we could have usage like:

>>> path = '/ggparent/gparent/parent/myfile.txt'
>>> from os.path import dirname
>>> dirname(path)
/ggparent/gparent/parent
>>> dirname(path, 2)
/ggparent/gparent
>>> dirname(path, 3)
/ggparent

Perhaps we should throw ValueErrors for invalid height values:

>>> dirname(path, 10)
ValueError
>>> dirname(path, -1)
ValueError

Perhaps a height of 0 should do nothing:

>>> dirname(path, 0)
/ggparent/gparent/parent/myfile.txt

I can supply patches, unit tests and docs if you like. |
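For readers who want the requested behaviour today, here is a minimal sketch of a wrapper that applies dirname() repeatedly. The name dirname_n is hypothetical, it is not part of os.path, and the rule used to detect an out-of-range height (stop when dirname() no longer changes the path) is an assumption, since the request does not say how "invalid" should be decided. posixpath is used so the result does not depend on the platform.

```python
import posixpath


def dirname_n(path, height=1):
    """Hypothetical helper: apply posixpath.dirname() `height` times.

    Assumption: a height is "too large" once dirname() stops changing
    the path (it has reached '/' or '').
    """
    if height < 0:
        raise ValueError("height must be non-negative")
    for _ in range(height):
        parent = posixpath.dirname(path)
        if parent == path:
            # dirname() reached a fixed point, so there is nothing left to drop.
            raise ValueError("height exceeds the depth of the path")
        path = parent
    return path


path = '/ggparent/gparent/parent/myfile.txt'
print(dirname_n(path))     # one level up
print(dirname_n(path, 3))  # three levels up
print(dirname_n(path, 0))  # unchanged
```

With this rule, a height of 4 still succeeds for the example path (it returns the root), and only a fifth step raises ValueError.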
|
|
msg129635 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-02-27 15:47 |
I'm -1 on this feature request. I think it is an unnecessary complication of the API, especially since dirname corresponds to the unix shell 'dirname' command, which doesn't have such a feature. If you need this feature in a particular application, it is easy to write a function to provide it. |
|
|
msg129636 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-02-27 15:51 |
Well, on the other hand, it *is* a common need. (and I don't think mimicking the shell is a design principle for Python) |
|
|
msg129640 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-02-27 17:45 |
No, it isn't a design principle. My point was that unix hasn't found it useful to add a level option to the dirname API. I don't know that I personally have ever had occasion to peel off more than one directory level without also wanting to do something with the intermediate results, so perhaps I am not a good judge of how useful this would be. |
|
|
msg130078 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2011-03-04 21:49 |
I am inclined to -1 also.

a. The proposed behavior is anti-obvious to me: the higher the height, the shorter the result. Calling param 'drop' would be better.

b. Not every one-liner should be wrapped.

>>> path.rsplit('/',0)[0]
'/ggparent/gparent/parent/myfile.txt'
>>> path.rsplit('/',1)[0]
'/ggparent/gparent/parent'
>>> path.rsplit('/',2)[0]
'/ggparent/gparent'
>>> path.rsplit('/',3)[0]
'/ggparent'

Note: above gives '' for maxsplit out of range, easily converted to exception in function wrapper. |
|
|
msg130079 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-03-04 22:00 |
Except that dirname() isn't a one-liner, so you are giving rather bad advice here. |
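To make the objection concrete, the sketch below compares the rsplit one-liner with posixpath.dirname() on a few POSIX-style inputs. The helper name rsplit_parent is hypothetical and exists only for the comparison. The two agree on the simple example from the thread but differ for a file directly under the root and for paths with repeated separators.

```python
import posixpath


def rsplit_parent(path, drop=1):
    # The rsplit one-liner from the previous message, wrapped for comparison.
    return path.rsplit('/', drop)[0]


cases = [
    '/ggparent/gparent/parent/myfile.txt',  # both approaches agree
    '/myfile.txt',                          # dirname keeps the root
    '/gparent//parent',                     # dirname strips the doubled slash
]
for p in cases:
    print(repr(p), repr(posixpath.dirname(p)), repr(rsplit_parent(p)))
```

Windows paths (drive letters, backslashes) add further cases that a split on '/' does not handle.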
|
|
msg130081 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-03-04 22:04 |
As for use cases, I have used it quite commonly in test scripts in order to find out the base directory of the source tree (and/or other resources such as data files), e.g.:

basepath = os.path.dirname(os.path.dirname(os.path.dirname(__file__))) |
|
|
msg130107 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2011-03-05 07:58 |
> My point was that unix hasn't found it useful
> to add a level option to the dirname API.

ISTM, that is a strong indication that this isn't needed in the form it has been proposed.

> I don't know that I personally have ever had
> occasion to peel off more than one directory level
> without also wanting to do something with the
> intermediate results, so perhaps I am not a good
> judge of how useful this would be.

I think this only arises when a known directory structure has been attached at some arbitrary point on a tree, so you might use a relative path like ../../bin/command.py in the shell.

To serve that use case, it would be better to have a function that splits all the components of the path into a list that's easily manipulated:

>>> oldpath = os.path.splitpath('/ggparent/gparent/parent/')
>>> newpath = oldpath[:-2] + ['bin', 'command.py']
>>> os.path.join(*newpath)
'/ggparent/bin/command.py' |
|
|
msg130119 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-03-05 13:26 |
Ah, yes, splitpath is a function I've occasionally wanted. I also remember being surprised that os.path.split didn't return such a list. |
|
|
msg130344 - (view) |
Author: blokeley (blokeley) |
Date: 2011-03-08 17:47 |
os.path.splitpath() as described by rhettinger would solve the problem. If I wrote the patches, tests and docs, what are the chances of it being accepted? |
|
|
msg130345 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-03-08 17:53 |
> If I wrote the patches, tests and docs, what are the chances of it
> being accepted?

Rather high as far as I'm concerned. Be careful with semantics and implementation under Windows, though (you should probably take a look at existing functions in ntpath.py as a guideline). |
|
|
msg130373 - (view) |
Author: blokeley (blokeley) |
Date: 2011-03-08 20:54 |
I started writing the patch against py2.7 but realised that 2.7 could be the last in the 2.x series. I'll write the patch against default tip. |
|
|
msg134740 - (view) |
Author: blokeley (blokeley) |
Date: 2011-04-29 08:59 |
The unit tests on the cpython tip revision fail even before applying my patches and I'm afraid I haven't got the time to debug the threading module or existing unit tests.

The traceback is:

C:\workspace\cpython\Lib\test> C:\Python32\python.exe test_ntpath.py
Traceback (most recent call last):
  File "test_ntpath.py", line 4, in <module>
    from test.support import TestFailed
  File "C:\workspace\cpython\Lib\test\support.py", line 14, in <module>
    import shutil
  File "C:\workspace\cpython\Lib\shutil.py", line 17, in <module>
    import bz2
  File "C:\workspace\cpython\Lib\bz2.py", line 13, in <module>
    import threading
  File "C:\workspace\cpython\Lib\threading.py", line 34, in <module>
    _info = _thread.info
AttributeError: 'module' object has no attribute 'info'

It happens with cpython hg rev 8eb794bbb967.

If there's a quick fix for this, please advise and I'll get working. If not, I'll probably not have the time to fix it myself and then write the os.path.splitpath patches as well, which would be a pity. |
|
|
msg134749 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2011-04-29 11:32 |
Did you try a make distclean/configure/make? _thread.info is a new attribute introduced by a relatively recent patch. |
|
|
msg134770 - (view) |
Author: blokeley (blokeley) |
Date: 2011-04-29 14:23 |
My runtime came from the Python32 Windows installer and I don't have a C compiler on this machine. Therefore I updated to the 3.2 branch in hg and worked on that. This patch is pretty simple so should work on 3.3 without modifications.

I have attached my first iteration of the patch (patched against hg rev 56c187b81d2b).

Disclaimers and suspected issues:

* A path given as a byte array is converted to a string so splitpath() only returns lists of strings and never lists of byte arrays. I don't know if splitpath() should return a list of byte arrays if the path was a byte array. The way split() is tested implies not. Please advise.
* We might need more tests to cover more path variations on Windows.
* I haven't implemented splitpath() in os2emxpath.py because I couldn't find test/test_os2emxpath.py or the equivalent. Please advise if there is one or if I should create one.

Feedback and patches most welcome. |
|
|
msg134873 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2011-04-30 15:55 |
To clarify one point: Python does not try to mimic the shell, but the os module exposes system calls as they are. (Unrelated remark: pkgutil.get_data can replace a lot of uses of dirname(dirname(__file__))) |
|
|
msg140271 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2011-07-13 15:27 |
I’m not sure this is correct for POSIX:

splitpath('/gparent/parent/') returns ['gparent', 'parent']

/ is a real directory, it should be the ultimate parent of any path IIUC.

On a related note, using “parent” for the leaf file looks strange to me, I think something like this would make more sense:

/gparent/parent/somedir/ |
|
|
msg176781 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-12-02 10:51 |
splitpath() should be equivalent to the following code (but non-recursive and more efficient):

def splitpath(path):
    head, tail = split(path)
    if head == path:
        return [head]
    else:
        return splitpath(head) + [tail] |
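As an illustration only, the recursive definition above can be unrolled into a loop. This is a sketch for POSIX paths built on posixpath.split(); it is not the attached patch, and splitpath remains a hypothetical name because the function was never added to os.path.

```python
import posixpath


def splitpath(path):
    """Iterative sketch of the recursive definition (POSIX str paths only)."""
    parts = []
    while True:
        head, tail = posixpath.split(path)
        if head == path:
            # split() made no progress: head is the root (such as '/') or ''.
            parts.append(head)
            break
        parts.append(tail)
        path = head
    # Components were collected from the leaf upwards.
    parts.reverse()
    return parts


print(splitpath('/ggparent/gparent/parent/myfile.txt'))
print(splitpath('gparent/parent'))
print(splitpath('/gparent/parent/'))
```

Under this definition the first element is the root for an absolute path and an empty string for a relative one, and a trailing separator produces an empty final element.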
|
|
msg176807 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-12-02 20:45 |
The proposed patch adds efficient implementations of the algorithm mentioned above. splitpath() can then be used to implement relpath() and commonpath(). |
|
|
msg178608 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-12-30 19:44 |
Please review. This function is very important for many applications (and it is hard to get right). |
|
|
msg178610 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-12-30 19:51 |
> Please review. This function is very important for many applications
> (and it is hard to get right).

The pathlib module (PEP 428) has such functionality built-in. |
|
|
msg203196 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-11-17 15:59 |
The pathlib module is not in the stdlib yet, while a patch for splitpath() has been waiting for review for almost a year. |
|
|
msg203291 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2013-11-18 13:18 |
The ntpath.splitpath() version is easy to get lost in. It would probably help if you spelt out all the single-letter variable names, and explained that tri-state root/separator = None/True/False flag. Maybe there is a less convoluted way to write it too, I dunno.

Also, maybe it is worth clearly documenting a couple of special properties of the result:

* The first element is always the root component (for an absolute path), or an empty string (for a relative path)
* The last element is an empty string if the path name ended in a directory separator, except when the path is a root directory |
|
|
msg203310 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-11-18 17:29 |
Added examples and Martin's notes to the documentation. The ntpath implementation has been rewritten with regular expressions (it is now shorter and perhaps clearer). |
|
|
msg222967 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-07-13 19:13 |
Updated patch. Added a private general implementation in genericpath; the specialized implementations are now tested to return the same result as the general implementation. |
|
|
msg238564 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2015-03-19 21:55 |
pathlib is in the stdlib now (see previous comments), maybe this should be closed as obsolete. |
|
|
msg239657 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2015-03-31 02:38 |
I think my use cases of splitpath() could be fulfilled by using Path.parts, Path.anchor, Path.relative_to(), etc. I am a bit sad that this never made it in, but I agree it is redundant with pathlib, and the issue should probably be closed. |
|
|
msg239676 - (view) |
Author: Paul Moore (paul.moore) *  |
Date: 2015-03-31 09:33 |
Assuming new code should be using pathlib, I agree this should probably be closed now as obsolete. One proviso - pathlib objects don't take a bytestring path in the constructor. If there's a need for a low-level splitpath taking bytes objects, there may still be a benefit to this patch. Otherwise pathlib.Path(p).parts is a direct replacement AFAICT. |
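For reference, a short sketch of the pathlib equivalents mentioned in the last few messages. PurePosixPath is used so the results do not depend on the platform, the example path is the one from the original request, and the os.fsdecode() step is only one possible workaround for bytes input, not something proposed in this issue.

```python
import os
from pathlib import PurePosixPath

p = PurePosixPath('/ggparent/gparent/parent/myfile.txt')

components = p.parts         # tuple of components, root first
root = p.anchor              # '/' for an absolute POSIX path
grandparent = p.parents[1]   # like dirname(dirname(path))

# Rebuild a sibling path from the components, as in the earlier
# bin/command.py example.
rebuilt = PurePosixPath(*p.parts[:-3], 'bin', 'command.py')

# pathlib rejects bytes, so decode first if the path arrives as bytes.
from_bytes = PurePosixPath(os.fsdecode(b'/ggparent/gparent'))

print(components, root, grandparent, rebuilt, from_bytes)
```

p.parents[0] is the immediate parent, so an index of n - 1 corresponds to n nested dirname() calls.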
|
|
msg241625 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-20 09:04 |
I thought that splitpath() could be used in implementations of realpath(), relpath(), commonpath(), and in user code. But it looks as though realpath(), relpath() and commonpath() should use specialized inlined versions for efficiency, and user code can use the higher-level pathlib. For now there are no strong arguments for adding splitpath(). The patch is updated to the tip (in case it is needed in the future) and the issue is closed. |
|
|