Issue 10639: reindent.py should not convert newlines (original) (raw)
Created on 2010-12-06 16:36 by jaraco, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (34)
Author: Jason R. Coombs (jaraco) *
Date: 2010-12-06 16:36
When reindent.py runs, it will convert the line endings for each file it converts to the default line ending for the platform on which reindent.py runs. It would be better if reindent.py would retain line endings of the source file. Attached is a patch that addresses this issue.
Author: Jason R. Coombs (jaraco) *
Date: 2010-12-06 16:42
Merged the patch with the latest trunk.
Author: Éric Araujo (eric.araujo) *
Date: 2010-12-06 16:51
Looks good.
Author: Éric Araujo (eric.araujo) *
Date: 2010-12-19 23:43
If nobody objects, I will commit this when py3k is unfrozen.
Author: Éric Araujo (eric.araujo) *
Date: 2011-01-09 01:11
I did a bit of testing on demo files. The first one I tested with had mixed EOLs, in which case the newlines attribute will be a tuple and open(..., newline=r.newlines) will fail. For reference: http://docs.python.org/dev/library/io#io.TextIOBase.newlines
What would be the best behavior in such cases? Refuse the temptation to guess and exit?
Author: Senthil Kumaran (orsenthil) *
Date: 2011-01-24 03:21
Why is reindent.py translating the whole file to platform default newline such a bad thing? Could not it be considered as helpful behavior, especially in the case where the user has mixed EOLs by mistake?
Author: Éric Araujo (eric.araujo) *
Date: 2011-01-29 17:38
I’m not opposed to reindent normalizing EOLs, given that our VCS hooks will translate them to the right thing (for example, making sure that files specific to the Windows build use CRLF, not LF, even if edited on POSIX).
Author: Jason R. Coombs (jaraco) *
Date: 2011-01-29 20:04
That's great that reindent works in your environment, but that's no help in my environment.
Here's the use case: We have a software package developed in Unix and with Unix LF line endings. The code base may also contain files with CRLF line endings (I can't be certain).
I'm developing in Windows (which includes reindent.py installed by default). I'd like to use that tool to reindent the code base without munging the line endings. I can't do this without first modifying reindent.py or using another tool to convert the line endings afterword (at which point I don't know if the Windows line endings were created by reindent.py or were there natively).
If someone wants the line endings normalized on their code base, they should use another tool (or possibly an option to reindent.py), but there's definitely a case for using reindent.py without altering the line endings.
Author: Éric Araujo (eric.araujo) *
Date: 2011-01-29 20:09
That is a valid use case, so +1 on not changing behavior and adding a new option in 3.3 (in another bug report).
What is your reply to ?
Author: Jason R. Coombs (jaraco) *
Date: 2011-05-15 14:26
Re , I agree that the mixed EOLs is also a potential problem, though the proposed solution behaves better than the status quo, where the EOLs get converted without warning.
It would be desirable for the patch to be more robust in that situation (providing a useful error message about why it fails). What might even be better is to retain newlines even if they're mixed. I'll explore a cross-platform technique to consistently retain newlines as encountered.
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-25 13:56
I looked into the possibility of retaining newline characters even for files with mixed newlines, but I've decided that's too intrusive an approach. The current implementation specifically takes measures to strip whitespaces from the ends of lines, so it seems outside the scope of this bug to alter that behavior.
So I've taken another stab at a more robust implementation that does newline detection but raises an error if the file contains mixed newlines.
Furthermore, it adds an option (--newline) to specify which newline character to use, bypassing the mixed-newline error and allowing the user to override newline detection.
Here's a demo run:
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py
.\foo.py: mixed newlines detected; cannot continue without --newline
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline CRLF .\foo.py
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline LF .\foo.py
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py
I've published this change as https://bitbucket.org/jaraco/cpython-issue10639/changeset/900df5732f93.
Please review. If this changeset is acceptable, I will push the revisions to the master repo. Please advise if I may also backport to Python 3.2 and 2.7.
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-26 12:33
Thanks for your work. The generated patch is for some reason unreadable, so I looked at the changeset on Bitbucket. It looks good, except that a new option would have to be 3.3-only, whereas the bug could be fixed in 2.7 and 3.2 too. If it’s too bothersome for you to split your current patch, then we can fix in 3.3 only.
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-26 12:41
No problem. I'll rebase and push the current patch as-is, and then backport to 3.2 and 2.7 without the option (just raising the error when mixed newlines are encountered).
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-26 12:46
If this changeset is acceptable, I will push the revisions to the master repo.
I'll rebase and push the current patch as-is,
What repo are you talking about, BTW? The developers list in the devguide does not contain your name, so I assume you’re not talking about hg.python.org/cpython?
and then backport to 3.2 and 2.7 without the option IMO it would be easier to make one bugfix changeset in 3.2, merge it in 3.3, and then add the new option.
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-26 14:16
I assume you’re not talking about hg.python.org/cpython?
My name appears as jason.coombs. But your note does remind me that I need to review the devguide, since it's been a while since I've pushed something. I may also look into linking/merging my two accounts.
I plan to do as you suggest, splitting the patch into the bugfix and the new capability.
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-26 14:24
Okay, I hadn’t seen you in http://docs.python.org/devguide/developers and I don’t recall the URI of the generated file with all names. I’ve asked that your Roundup profile be updated so that you get the Python icon and the possibility to be assigned bugs.
If you already have clones, the workflow should not be hard: go to a 3.2 checkout, fix things, run patchcheck and tests, commit. Go to 3.3, pull, merge, test, commit, push. The only tricky things that can make push hooks reject changesets are trailing whitespace, maybe EOLs, and new named branches. We also prefer to push one changeset with all changes, not a series of changesets with gradual fixes.
Author: Roundup Robot (python-dev)
Date: 2011-07-26 15:59
New changeset 070dc6e359fb by Jason R. Coombs in branch '3.2': Fixes #10639: reindent.py should not convert newlines http://hg.python.org/cpython/rev/070dc6e359fb
New changeset 826a0208d143 by Jason R. Coombs in branch 'default': Merge with 3.2 Issue #10639: reindent.py should not convert newlines. http://hg.python.org/cpython/rev/826a0208d143
New changeset 4feb889d3bed by Jason R. Coombs in branch 'default': Issue #10639: reindent.py tool now accepts a --newline option to specify the newline to be used in the output of converted files. http://hg.python.org/cpython/rev/4feb889d3bed
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-26 16:01
I've stripped the undesirable revisions and updated the bitbucket repo so it now contains three changesets for Python 3.2 and 3.3 and suggested.
I don't believe running the test suite is relevant, as I grepped the test suite for reindent, and there's no reference to the tool script.
It took several rounds of rebasing and stripping and importing patches to get everything right, but I believe these changes meet the desired goals.
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-26 16:09
I've stripped the undesirable revisions and updated the bitbucket repo so it now contains three changesets for Python 3.2 and 3.3 and suggested.
hg diff combined with hg import can help you flatten a series of changeset into one.
I don't believe running the test suite is relevant You’re right, I mentioned that step because I was talking about the workflow in general, but here it does not apply, we only have manual testing for most tools.
I believe these changes meet the desired goals. Me too! Are you comfortable with how to backport this to 2.7?
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-26 16:59
What do you think about this: https://bitbucket.org/jaraco/cpython-issue10639/changeset/ea63310dddce
The patch is a little more intrusive than the Python 3 patch because Python 2.7 doesn't allow specifying the newline to use when writing a file (afaict), but it abstracts that intrusiveness in the NewlineAdapter class.
I've tested the script on a file with mixed newlines, a file with non-native newlines, and a file with content but no newlines, and it performs as expected.
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-27 15:24
The patch is a little more intrusive than the Python 3 patch because Python 2.7 doesn't allow specifying the newline to use when writing a file (afaict)
Instead of writing a new class, what about using io.open instead of the builtin? Then you’ll be able to use the same patch than 3.2.
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-27 16:16
Instead of writing a new class, what about using io.open instead of the builtin? Then you’ll be able to use the same patch than 3.2.
Ooh. Excellent suggestion.
Author: Roundup Robot (python-dev)
Date: 2011-07-27 18:41
New changeset f9cf55bbe9b9 by Jason R. Coombs in branch '2.7': Fixes #10639: reindent.py should not convert newlines http://hg.python.org/cpython/rev/f9cf55bbe9b9
Author: Eli Bendersky (eli.bendersky) *
Date: 2011-07-29 03:55
It appears this breaks "make patchcheck" in trunk:
./python ./Tools/scripts/patchcheck.py Getting the list of files that have been added/changed ... 5 files Fixing whitespace ... Traceback (most recent call last): File "./Tools/scripts/patchcheck.py", line 147, in main() File "./Tools/scripts/patchcheck.py", line 129, in main normalize_whitespace(python_files) File "./Tools/scripts/patchcheck.py", line 22, in call_fxn result = fxn(*args, **kwargs) File "./Tools/scripts/patchcheck.py", line 66, in normalize_whitespace fixed = [path for path in file_paths if path.endswith('.py') and File "./Tools/scripts/patchcheck.py", line 67, in reindent.check(path)] File "/home/eliben/python-src/33/Tools/scripts/reindent.py", line 129, in check newline = spec_newline if spec_newline else r.newlines NameError: global name 'spec_newline' is not defined make: *** [patchcheck] Error 1
Author: Eli Bendersky (eli.bendersky) *
Date: 2011-07-29 04:34
This is a good example of why passing parameters into functions by means of globals sucks. In reindent.py, main() sets the spec_newline global which check() uses, but this was forgotten in patchcheck.py which also uses reindent.check()
IMHO reindent.py should be rewritten to ditch all globals
Author: Éric Araujo (eric.araujo) *
Date: 2011-07-29 12:42
I agree that using globals instead function arguments sucks, but let’s fix this bug without rewriting the whole file (unless this affects only the code new in 3.3).
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-29 13:16
The 'spec_newline' in particular is only in the trunk (not in the backports), as it's part of the new --newline option.
Author: Eli Bendersky (eli.bendersky) *
Date: 2011-07-29 13:18
Jason, one way or another, a prompt fix for trunk is required, since make patchcheck
is an important step for committing patches.
Author: Roundup Robot (python-dev)
Date: 2011-07-29 13:33
New changeset 2547f7965733 by Jason R. Coombs in branch 'default': Issue #10639: spec_newline wasn't defined globally unless main() was called; now spec_newline is set at module import/execution http://hg.python.org/cpython/rev/2547f7965733
Author: Jason R. Coombs (jaraco) *
Date: 2011-07-29 13:36
I hadn't realized that the other global variables were defined at the module level or I would have implemented this originally. Please let me know if this patch doesn't correct the issue.
Author: Eli Bendersky (eli.bendersky) *
Date: 2011-07-30 03:52
"make patchcheck" is working again. Thanks
Author: Scott Dial (scott.dial)
Date: 2011-08-02 07:08
I haven't seen anyone use a side-effect-less statement (a string) as a comment before, but I doubt that is an approved style for the CPython codebase. Please change the string preceeding the spec_line definition into a proper comment.
Author: R. David Murray (r.david.murray) *
Date: 2011-08-02 10:58
Well, this is actually blessed by http://www.python.org/dev/peps/pep-0257/, but if that convention were actually followed the docstring would go after the assignment. But I agree that it is rarely used, and as far as I know is not used in the stdlib at all.
Author: Roundup Robot (python-dev)
Date: 2011-08-02 12:20
New changeset dc96af0e7f60 by Jason R. Coombs in branch 'default': Corrected attribute docstring per pep-257 (reference #10639) http://hg.python.org/cpython/rev/dc96af0e7f60
History
Date
User
Action
Args
2022-04-11 14:57:09
admin
set
github: 54848
2011-08-02 12:20:52
python-dev
set
messages: +
2011-08-02 10:58:34
r.david.murray
set
nosy: + r.david.murray
messages: +
2011-08-02 07:08:19
scott.dial
set
nosy: + scott.dial
messages: +
2011-07-30 13:32:29
eric.araujo
set
files: - unnamed
2011-07-30 03:52:29
eli.bendersky
set
files: + unnamed
messages: +
2011-07-29 13:38:51
jaraco
set
status: open -> closed
stage: needs patch -> commit review
2011-07-29 13:36:52
jaraco
set
messages: +
2011-07-29 13:33:49
python-dev
set
messages: +
2011-07-29 13🔞54
eli.bendersky
set
messages: +
2011-07-29 13:16:56
jaraco
set
messages: +
2011-07-29 12:42:43
eric.araujo
set
messages: +
2011-07-29 04:34:36
eli.bendersky
set
messages: +
2011-07-29 04:31:42
eli.bendersky
set
priority: normal -> high
stage: resolved -> needs patch
2011-07-29 03:55:43
eli.bendersky
set
status: closed -> open
nosy: + eli.bendersky
messages: +
2011-07-28 13:44:21
eric.araujo
set
stage: commit review -> resolved
2011-07-27 18:42:24
jaraco
set
status: open -> closed
resolution: fixed
2011-07-27 18:41:10
python-dev
set
messages: +
2011-07-27 16:16:28
jaraco
set
messages: +
2011-07-27 15:24:21
eric.araujo
set
messages: +
2011-07-26 20:34:12
pitrou
set
nosy: - pitrou
2011-07-26 16:59:41
jaraco
set
messages: +
2011-07-26 16:09:44
eric.araujo
set
messages: +
2011-07-26 16:01:07
jaraco
set
messages: +
2011-07-26 15:59:25
python-dev
set
nosy: + python-dev
messages: +
2011-07-26 15:41:24
eric.araujo
set
assignee: eric.araujo -> jaraco
stage: patch review -> commit review
2011-07-26 14:24:59
eric.araujo
set
messages: +
2011-07-26 14:16:54
jaraco
set
messages: +
2011-07-26 12:46:37
eric.araujo
set
messages: +
2011-07-26 12:41:58
jaraco
set
messages: +
2011-07-26 12:40:34
ezio.melotti
set
files: + 900df5732f93.diff
2011-07-26 12:40:09
ezio.melotti
set
files: - 900df5732f93.diff
2011-07-26 12:33:43
eric.araujo
set
title: reindent.py converts newlines to platform default -> reindent.py should not convert newlines
messages: +
versions: + Python 3.3, - Python 3.1
2011-07-26 12:26:38
eric.araujo
set
files: + 900df5732f93.diff
2011-07-25 13:56:41
jaraco
set
hgrepos: + hgrepo45
messages: +
2011-05-15 14:26:37
jaraco
set
messages: +
2011-01-29 20:09:15
eric.araujo
set
nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +
2011-01-29 20:04:40
jaraco
set
nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +
2011-01-29 17:38:11
eric.araujo
set
nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +
2011-01-24 03:21:48
orsenthil
set
nosy: + orsenthil
messages: +
2011-01-09 01:11:17
eric.araujo
set
status: pending -> open
nosy: + pitrou
messages: +
2010-12-19 23:43:03
eric.araujo
set
status: open -> pending
messages: +
assignee: eric.araujo
2010-12-06 16:51:03
eric.araujo
set
versions: + Python 2.7
nosy: + eric.araujo
messages: +
type: behavior
stage: patch review
2010-12-06 16:42:42
jaraco
set
files: + reindent-autonewline.patch
messages: +
2010-12-06 16:40:38
jaraco
set
files: - reindent-autonewline.patch
2010-12-06 16:36:51
jaraco
create