Issue 10639: reindent.py should not convert newlines (original) (raw)

Created on 2010-12-06 16:36 by jaraco, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (34)

msg123476 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2010-12-06 16:36

When reindent.py runs, it will convert the line endings for each file it converts to the default line ending for the platform on which reindent.py runs. It would be better if reindent.py would retain line endings of the source file. Attached is a patch that addresses this issue.

msg123477 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2010-12-06 16:42

Merged the patch with the latest trunk.

msg123478 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2010-12-06 16:51

Looks good.

msg124369 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2010-12-19 23:43

If nobody objects, I will commit this when py3k is unfrozen.

msg125818 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-01-09 01:11

I did a bit of testing on demo files. The first one I tested with had mixed EOLs, in which case the newlines attribute will be a tuple and open(..., newline=r.newlines) will fail. For reference: http://docs.python.org/dev/library/io#io.TextIOBase.newlines

What would be the best behavior in such cases? Refuse the temptation to guess and exit?

msg126915 - (view)

Author: Senthil Kumaran (orsenthil) * (Python committer)

Date: 2011-01-24 03:21

Why is reindent.py translating the whole file to platform default newline such a bad thing? Could not it be considered as helpful behavior, especially in the case where the user has mixed EOLs by mistake?

msg127449 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-01-29 17:38

I’m not opposed to reindent normalizing EOLs, given that our VCS hooks will translate them to the right thing (for example, making sure that files specific to the Windows build use CRLF, not LF, even if edited on POSIX).

msg127481 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-01-29 20:04

That's great that reindent works in your environment, but that's no help in my environment.

Here's the use case: We have a software package developed in Unix and with Unix LF line endings. The code base may also contain files with CRLF line endings (I can't be certain).

I'm developing in Windows (which includes reindent.py installed by default). I'd like to use that tool to reindent the code base without munging the line endings. I can't do this without first modifying reindent.py or using another tool to convert the line endings afterword (at which point I don't know if the Windows line endings were created by reindent.py or were there natively).

If someone wants the line endings normalized on their code base, they should use another tool (or possibly an option to reindent.py), but there's definitely a case for using reindent.py without altering the line endings.

msg127482 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-01-29 20:09

That is a valid use case, so +1 on not changing behavior and adding a new option in 3.3 (in another bug report).

What is your reply to ?

msg136030 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-05-15 14:26

Re , I agree that the mixed EOLs is also a potential problem, though the proposed solution behaves better than the status quo, where the EOLs get converted without warning.

It would be desirable for the patch to be more robust in that situation (providing a useful error message about why it fails). What might even be better is to retain newlines even if they're mixed. I'll explore a cross-platform technique to consistently retain newlines as encountered.

msg141092 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-25 13:56

I looked into the possibility of retaining newline characters even for files with mixed newlines, but I've decided that's too intrusive an approach. The current implementation specifically takes measures to strip whitespaces from the ends of lines, so it seems outside the scope of this bug to alter that behavior.

So I've taken another stab at a more robust implementation that does newline detection but raises an error if the file contains mixed newlines.

Furthermore, it adds an option (--newline) to specify which newline character to use, bypassing the mixed-newline error and allowing the user to override newline detection.

Here's a demo run:

PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py
.\foo.py: mixed newlines detected; cannot continue without --newline
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline CRLF .\foo.py
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py --newline LF .\foo.py
PS C:\cpython-issue10639> python .\Tools\scripts\reindent.py .\foo.py

I've published this change as https://bitbucket.org/jaraco/cpython-issue10639/changeset/900df5732f93.

Please review. If this changeset is acceptable, I will push the revisions to the master repo. Please advise if I may also backport to Python 3.2 and 2.7.

msg141151 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-26 12:33

Thanks for your work. The generated patch is for some reason unreadable, so I looked at the changeset on Bitbucket. It looks good, except that a new option would have to be 3.3-only, whereas the bug could be fixed in 2.7 and 3.2 too. If it’s too bothersome for you to split your current patch, then we can fix in 3.3 only.

msg141153 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-26 12:41

No problem. I'll rebase and push the current patch as-is, and then backport to 3.2 and 2.7 without the option (just raising the error when mixed newlines are encountered).

msg141154 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-26 12:46

If this changeset is acceptable, I will push the revisions to the master repo.

I'll rebase and push the current patch as-is,

What repo are you talking about, BTW? The developers list in the devguide does not contain your name, so I assume you’re not talking about hg.python.org/cpython?

and then backport to 3.2 and 2.7 without the option IMO it would be easier to make one bugfix changeset in 3.2, merge it in 3.3, and then add the new option.

msg141159 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-26 14:16

I assume you’re not talking about hg.python.org/cpython?

My name appears as jason.coombs. But your note does remind me that I need to review the devguide, since it's been a while since I've pushed something. I may also look into linking/merging my two accounts.

I plan to do as you suggest, splitting the patch into the bugfix and the new capability.

msg141160 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-26 14:24

Okay, I hadn’t seen you in http://docs.python.org/devguide/developers and I don’t recall the URI of the generated file with all names. I’ve asked that your Roundup profile be updated so that you get the Python icon and the possibility to be assigned bugs.

If you already have clones, the workflow should not be hard: go to a 3.2 checkout, fix things, run patchcheck and tests, commit. Go to 3.3, pull, merge, test, commit, push. The only tricky things that can make push hooks reject changesets are trailing whitespace, maybe EOLs, and new named branches. We also prefer to push one changeset with all changes, not a series of changesets with gradual fixes.

msg141163 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2011-07-26 15:59

New changeset 070dc6e359fb by Jason R. Coombs in branch '3.2': Fixes #10639: reindent.py should not convert newlines http://hg.python.org/cpython/rev/070dc6e359fb

New changeset 826a0208d143 by Jason R. Coombs in branch 'default': Merge with 3.2 Issue #10639: reindent.py should not convert newlines. http://hg.python.org/cpython/rev/826a0208d143

New changeset 4feb889d3bed by Jason R. Coombs in branch 'default': Issue #10639: reindent.py tool now accepts a --newline option to specify the newline to be used in the output of converted files. http://hg.python.org/cpython/rev/4feb889d3bed

msg141164 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-26 16:01

I've stripped the undesirable revisions and updated the bitbucket repo so it now contains three changesets for Python 3.2 and 3.3 and suggested.

I don't believe running the test suite is relevant, as I grepped the test suite for reindent, and there's no reference to the tool script.

It took several rounds of rebasing and stripping and importing patches to get everything right, but I believe these changes meet the desired goals.

msg141165 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-26 16:09

I've stripped the undesirable revisions and updated the bitbucket repo so it now contains three changesets for Python 3.2 and 3.3 and suggested.

hg diff combined with hg import can help you flatten a series of changeset into one.

I don't believe running the test suite is relevant You’re right, I mentioned that step because I was talking about the workflow in general, but here it does not apply, we only have manual testing for most tools.

I believe these changes meet the desired goals. Me too! Are you comfortable with how to backport this to 2.7?

msg141171 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-26 16:59

What do you think about this: https://bitbucket.org/jaraco/cpython-issue10639/changeset/ea63310dddce

The patch is a little more intrusive than the Python 3 patch because Python 2.7 doesn't allow specifying the newline to use when writing a file (afaict), but it abstracts that intrusiveness in the NewlineAdapter class.

I've tested the script on a file with mixed newlines, a file with non-native newlines, and a file with content but no newlines, and it performs as expected.

msg141233 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-27 15:24

The patch is a little more intrusive than the Python 3 patch because Python 2.7 doesn't allow specifying the newline to use when writing a file (afaict)

Instead of writing a new class, what about using io.open instead of the builtin? Then you’ll be able to use the same patch than 3.2.

msg141239 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-27 16:16

Instead of writing a new class, what about using io.open instead of the builtin? Then you’ll be able to use the same patch than 3.2.

Ooh. Excellent suggestion.

msg141261 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2011-07-27 18:41

New changeset f9cf55bbe9b9 by Jason R. Coombs in branch '2.7': Fixes #10639: reindent.py should not convert newlines http://hg.python.org/cpython/rev/f9cf55bbe9b9

msg141325 - (view)

Author: Eli Bendersky (eli.bendersky) * (Python committer)

Date: 2011-07-29 03:55

It appears this breaks "make patchcheck" in trunk:

./python ./Tools/scripts/patchcheck.py Getting the list of files that have been added/changed ... 5 files Fixing whitespace ... Traceback (most recent call last): File "./Tools/scripts/patchcheck.py", line 147, in main() File "./Tools/scripts/patchcheck.py", line 129, in main normalize_whitespace(python_files) File "./Tools/scripts/patchcheck.py", line 22, in call_fxn result = fxn(*args, **kwargs) File "./Tools/scripts/patchcheck.py", line 66, in normalize_whitespace fixed = [path for path in file_paths if path.endswith('.py') and File "./Tools/scripts/patchcheck.py", line 67, in reindent.check(path)] File "/home/eliben/python-src/33/Tools/scripts/reindent.py", line 129, in check newline = spec_newline if spec_newline else r.newlines NameError: global name 'spec_newline' is not defined make: *** [patchcheck] Error 1

msg141328 - (view)

Author: Eli Bendersky (eli.bendersky) * (Python committer)

Date: 2011-07-29 04:34

This is a good example of why passing parameters into functions by means of globals sucks. In reindent.py, main() sets the spec_newline global which check() uses, but this was forgotten in patchcheck.py which also uses reindent.check()

IMHO reindent.py should be rewritten to ditch all globals

msg141361 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2011-07-29 12:42

I agree that using globals instead function arguments sucks, but let’s fix this bug without rewriting the whole file (unless this affects only the code new in 3.3).

msg141370 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-29 13:16

The 'spec_newline' in particular is only in the trunk (not in the backports), as it's part of the new --newline option.

msg141371 - (view)

Author: Eli Bendersky (eli.bendersky) * (Python committer)

Date: 2011-07-29 13:18

Jason, one way or another, a prompt fix for trunk is required, since make patchcheck is an important step for committing patches.

msg141373 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2011-07-29 13:33

New changeset 2547f7965733 by Jason R. Coombs in branch 'default': Issue #10639: spec_newline wasn't defined globally unless main() was called; now spec_newline is set at module import/execution http://hg.python.org/cpython/rev/2547f7965733

msg141375 - (view)

Author: Jason R. Coombs (jaraco) * (Python committer)

Date: 2011-07-29 13:36

I hadn't realized that the other global variables were defined at the module level or I would have implemented this originally. Please let me know if this patch doesn't correct the issue.

msg141418 - (view)

Author: Eli Bendersky (eli.bendersky) * (Python committer)

Date: 2011-07-30 03:52

"make patchcheck" is working again. Thanks

msg141547 - (view)

Author: Scott Dial (scott.dial)

Date: 2011-08-02 07:08

I haven't seen anyone use a side-effect-less statement (a string) as a comment before, but I doubt that is an approved style for the CPython codebase. Please change the string preceeding the spec_line definition into a proper comment.

msg141558 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2011-08-02 10:58

Well, this is actually blessed by http://www.python.org/dev/peps/pep-0257/, but if that convention were actually followed the docstring would go after the assignment. But I agree that it is rarely used, and as far as I know is not used in the stdlib at all.

msg141563 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2011-08-02 12:20

New changeset dc96af0e7f60 by Jason R. Coombs in branch 'default': Corrected attribute docstring per pep-257 (reference #10639) http://hg.python.org/cpython/rev/dc96af0e7f60

History

Date

User

Action

Args

2022-04-11 14:57:09

admin

set

github: 54848

2011-08-02 12:20:52

python-dev

set

messages: +

2011-08-02 10:58:34

r.david.murray

set

nosy: + r.david.murray
messages: +

2011-08-02 07:08:19

scott.dial

set

nosy: + scott.dial
messages: +

2011-07-30 13:32:29

eric.araujo

set

files: - unnamed

2011-07-30 03:52:29

eli.bendersky

set

files: + unnamed

messages: +

2011-07-29 13:38:51

jaraco

set

status: open -> closed
stage: needs patch -> commit review

2011-07-29 13:36:52

jaraco

set

messages: +

2011-07-29 13:33:49

python-dev

set

messages: +

2011-07-29 13🔞54

eli.bendersky

set

messages: +

2011-07-29 13:16:56

jaraco

set

messages: +

2011-07-29 12:42:43

eric.araujo

set

messages: +

2011-07-29 04:34:36

eli.bendersky

set

messages: +

2011-07-29 04:31:42

eli.bendersky

set

priority: normal -> high
stage: resolved -> needs patch

2011-07-29 03:55:43

eli.bendersky

set

status: closed -> open
nosy: + eli.bendersky
messages: +

2011-07-28 13:44:21

eric.araujo

set

stage: commit review -> resolved

2011-07-27 18:42:24

jaraco

set

status: open -> closed
resolution: fixed

2011-07-27 18:41:10

python-dev

set

messages: +

2011-07-27 16:16:28

jaraco

set

messages: +

2011-07-27 15:24:21

eric.araujo

set

messages: +

2011-07-26 20:34:12

pitrou

set

nosy: - pitrou

2011-07-26 16:59:41

jaraco

set

messages: +

2011-07-26 16:09:44

eric.araujo

set

messages: +

2011-07-26 16:01:07

jaraco

set

messages: +

2011-07-26 15:59:25

python-dev

set

nosy: + python-dev
messages: +

2011-07-26 15:41:24

eric.araujo

set

assignee: eric.araujo -> jaraco
stage: patch review -> commit review

2011-07-26 14:24:59

eric.araujo

set

messages: +

2011-07-26 14:16:54

jaraco

set

messages: +

2011-07-26 12:46:37

eric.araujo

set

messages: +

2011-07-26 12:41:58

jaraco

set

messages: +

2011-07-26 12:40:34

ezio.melotti

set

files: + 900df5732f93.diff

2011-07-26 12:40:09

ezio.melotti

set

files: - 900df5732f93.diff

2011-07-26 12:33:43

eric.araujo

set

title: reindent.py converts newlines to platform default -> reindent.py should not convert newlines
messages: +
versions: + Python 3.3, - Python 3.1

2011-07-26 12:26:38

eric.araujo

set

files: + 900df5732f93.diff

2011-07-25 13:56:41

jaraco

set

hgrepos: + hgrepo45
messages: +

2011-05-15 14:26:37

jaraco

set

messages: +

2011-01-29 20:09:15

eric.araujo

set

nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +

2011-01-29 20:04:40

jaraco

set

nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +

2011-01-29 17:38:11

eric.araujo

set

nosy:jaraco, orsenthil, pitrou, eric.araujo
messages: +

2011-01-24 03:21:48

orsenthil

set

nosy: + orsenthil
messages: +

2011-01-09 01:11:17

eric.araujo

set

status: pending -> open
nosy: + pitrou
messages: +

2010-12-19 23:43:03

eric.araujo

set

status: open -> pending

messages: +
assignee: eric.araujo

2010-12-06 16:51:03

eric.araujo

set

versions: + Python 2.7
nosy: + eric.araujo

messages: +

type: behavior
stage: patch review

2010-12-06 16:42:42

jaraco

set

files: + reindent-autonewline.patch

messages: +

2010-12-06 16:40:38

jaraco

set

files: - reindent-autonewline.patch

2010-12-06 16:36:51

jaraco

create