msg153302 - (view) |
Author: Jason R. Coombs (jaraco) *  |
Date: 2012-02-13 20:57 |
When using a MANIFEST.in with only "include *.txt", on Windows, distutils grabs too many files. I set DISTUTILS_DEBUG=1 and ran ./setup.py sdist on the keyring library and it included this output: include *.txt include_pattern: applying regex r'^[^/]*\.txt\Z(?ms)' adding CHANGES.txt adding CONTRIBUTORS.txt adding .hg\last-message.txt adding keyring.egg-info\dependency_links.txt adding keyring.egg-info\requires.txt adding keyring.egg-info\SOURCES.txt adding keyring.egg-info\top_level.txt As you can see, this is not a recursive include, but it's matching several files not in the supplied pattern (files in .hg/ and keyring.egg-info/). It's obvious from the regular expression that the regex doesn't properly account for os.sep. |
|
|
msg153341 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-02-14 15:38 |
This code just got changed (#13193) to use '/' instead of os.path.join, as it is documented that paths in MANIFEST.in must use only '/'. Can you build an up-to-date Python 2.7 from Mercurial and check again? Your report comes timely, as there was some doubt about what the correct fix was. Even if your bug is now gone, I’ll want to add a regression test for it. |
|
|
msg153371 - (view) |
Author: Nadeem Vawda (nadeem.vawda) *  |
Date: 2012-02-14 21:32 |
I've been able to reproduce this on current builds of 2.7, 3.2 and 3.3. The problem seems to be that distutils.filelist pathnames using the OS directory separator, but the regexps used for matching paths are written to always assume "/"-based paths. I've been able to get the correct behaviour in this case by making the module build paths using "/" regardless of the OS directory separator: --- a/Lib/distutils/filelist.py +++ b/Lib/distutils/filelist.py @@ -55,10 +55,10 @@ def sort(self): # Not a strict lexical sort! - sortable_files = sorted(map(os.path.split, self.files)) + sortable_files = sorted(f.split("/") for f in self.files) self.files = [] for sort_tuple in sortable_files: - self.files.append(os.path.join(*sort_tuple)) + self.files.append("/".join(sort_tuple)) # -- Other miscellaneous utility methods --------------------------- @@ -258,7 +258,7 @@ for name in names: if dir != os.curdir: # avoid the dreaded "./" syndrome - fullname = os.path.join(dir, name) + fullname = dir + "/" + name else: fullname = name I'm not entirely comfortable with this fix, though - it seems like it would be better to produce paths using the OS directory separator at the end of the process. Maybe it's possible to add a translation step after the file list is built and de-duplicated? I don't know much about how the rest of distutils uses this module. (Random aside: do we support any platforms that don't support "/" as a directory separator (even if it isn't the preferred one)? Hmm...) |
|
|
msg153378 - (view) |
Author: Jason R. Coombs (jaraco) *  |
Date: 2012-02-15 00:04 |
The bug is indeed fixed in the latest 2.7 tip: PS C:\Users\jaraco\projects\public\keyring> C:\Users\jaraco\projects\public\cpython\PCbuild\amd64\python.exe setup.py sdist 2> NULL | findstr .hg (produces no output) |
|
|
msg153379 - (view) |
Author: Jason R. Coombs (jaraco) *  |
Date: 2012-02-15 00:06 |
If always using a posix path separator, I recommend using posixpath.join instead of hard-coding the path separator. |
|
|
msg153391 - (view) |
Author: Nadeem Vawda (nadeem.vawda) *  |
Date: 2012-02-15 07:24 |
> The bug is indeed fixed in the latest 2.7 tip: > > PS C:\Users\jaraco\projects\public\keyring> C:\Users\jaraco\projects\public\cpython\PCbuild\amd64\python.exe setup.py sdist 2> NULL | findstr .hg > (produces no output) I suspect you forgot to set DISTUTILS_DEBUG before running this - otherwise that command should at least output a line like this: exclude_pattern: applying regex r'(^ |
/ |
\\)(RCS |
msg153403 - (view) |
Author: Jason R. Coombs (jaraco) *  |
Date: 2012-02-15 15:34 |
It's not that I forgot to set DISTUTILS_DEBUG, I simply did not set it. If the bug were still present, I would have seen a line indicating that .hg/last-message.txt was being copied. For completeness, here's the output with the DEBUG setting: PS C:\Users\jaraco\projects\public\keyring> $env:DISTUTILS_DEBUG=1 PS C:\Users\jaraco\projects\public\keyring> ..\cpython\PCbuild\amd64\python.exe setup.py sdist 2> NULL | findstr hg adding .hg\last-message.txt exclude_pattern: applying regex r'(^ |
/ |
\\)(RCS |
msg153404 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-02-15 15:42 |
So it looks like that even if the exclusion of .hg removes .hg/last-message.txt, it should not have been added in the first place, as the command was include and not recursive-include. At first glance, Nadeem’s proposed fix is not right: paths in MANIFEST.in use '/', but then filelist produces paths using os.sep, so that the MANIFEST file and other operations done by the sdist command use native paths. So even though the currently supported OSes all accept '/', I think the right thing is to use os.sep. (About posixpath: It is always available and can be used e.g. to manipulate URI paths.) |
|
|
msg153428 - (view) |
Author: Nadeem Vawda (nadeem.vawda) *  |
Date: 2012-02-15 19:26 |
> At first glance, Nadeem’s proposed fix is not right: paths in MANIFEST.in use '/', but then filelist produces paths using os.sep, so that the MANIFEST file and other operations done by the sdist command use native paths. So even though the currently supported OSes all accept '/', I think the right thing is to use os.sep. Yes, that sounds like a better solution. So the solution then is to fix the regexps to use os.sep instead of "/" (and ensure that FileList uses native paths throughout). As an aside, it seems that the failing test from issue 13193 was actually correct, and that the library itself was broken. I suppose all of the tests will need to be changed to use native paths when this issue is fixed. |
|
|
msg153624 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-02-18 01:49 |
This regex bug was actually already known. Please follow up on the other report. > As an aside, it seems that the failing test from issue 13193 was > actually correct, and that the library itself was broken. Yes. filelist operates in this way: it builds a list of all files in a tree (creating allfiles, which thus uses native path separators), then edits this list according to the MANIFEST.in commands (where the patterns must be transformed to a regex that uses native separators). So the test was right in using os.path.join, and the regex was not right to always use '/'. |
|
|
msg154258 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-02-25 15:14 |
New changeset 47788c90f80b by Éric Araujo in branch '2.7': Fix long-standing bugs with MANIFEST.in parsing on Windows (#6884). http://hg.python.org/cpython/rev/47788c90f80b |
|
|
msg154262 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-02-25 15:32 |
New changeset 73aa4c9305b3 by Éric Araujo in branch '3.2': Fix long-standing bugs with MANIFEST.in parsing on Windows (#6884). http://hg.python.org/cpython/rev/73aa4c9305b3 |
|
|
msg154457 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-02-27 11:29 |
New changeset 4d6a9f287543 by Éric Araujo in branch 'default': Fix bugs with MANIFEST.in parsing on Windows (#6884). http://hg.python.org/distutils2/rev/4d6a9f287543 |
|
|
msg156266 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-03-18 19:40 |
New changeset edcdef70c44e by Éric Araujo in branch '3.2': Fix long-standing bugs with MANIFEST.in parsing on Windows (#6884). http://hg.python.org/cpython/rev/edcdef70c44e |
|
|