msg118804 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-10-15 16:54 |
Tools/scripts/reindent.py -d Lib/test/encoded_modules/module_koi8_r.py
Traceback (most recent call last):
  File "Tools/scripts/reindent.py", line 310, in <module>
    main()
  File "Tools/scripts/reindent.py", line 93, in main
    check(arg)
  File "Tools/scripts/reindent.py", line 114, in check
    r = Reindenter(f)
  File "Tools/scripts/reindent.py", line 162, in __init__
    self.raw = f.readlines()
  File "Lib/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf0 in position 59: invalid continuation byte

Attached patch fixes this issue. |
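For readers following along, here is a minimal sketch of one way to read a source file using its declared encoding instead of assuming UTF-8. It is not the attached patch; the helper name `read_source` is hypothetical, and the only library call relied on is `tokenize.detect_encoding()`.

```python
import tokenize


def read_source(path):
    """Return (encoding, lines) for a Python source file.

    Hypothetical helper: the encoding comes from the file's BOM or
    coding cookie, as reported by tokenize.detect_encoding().
    """
    # detect_encoding() needs a readline callable that returns bytes.
    with open(path, "rb") as f:
        encoding, _ = tokenize.detect_encoding(f.readline)
    # Reopen in text mode with the detected encoding; newline="" keeps
    # the original line endings untouched.
    with open(path, encoding=encoding, newline="") as f:
        return encoding, f.readlines()
```

Writing the reindented lines back with the same `encoding` value would keep the file's encoding unchanged.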
|
|
msg118810 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-10-15 17:45 |
+1. |
|
|
msg118812 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2010-10-15 17:53 |
LGTM. |
|
|
msg119026 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-10-18 14:48 |
Committed in r85695. Leaving open to discuss whether anything can/should be done for the case when reindent acts as an stdin to stdout filter. Also, what is the policy on backporting Tools' bug fixes? |
|
|
msg119276 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2010-10-21 11:44 |
When working as a filter, reindent should use sys.{stdin,stdout}.encoding (defaulting to sys.getdefaultencoding()) for reading and writing, respectively. Detecting encoding on streams is not worth it IMO. People can set PYTHONIOENCODING for baroque needs. |
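A rough sketch of the lookup described above, assuming a hypothetical helper named `stream_encoding`: take the stream's own `encoding` attribute and fall back to `sys.getdefaultencoding()` when it is missing or `None`.

```python
import sys


def stream_encoding(stream):
    """Return the stream's encoding, or the default encoding if unset.

    Hypothetical helper, not part of reindent.py.
    """
    # Some text streams (io.StringIO, for example) report None here.
    return getattr(stream, "encoding", None) or sys.getdefaultencoding()


input_encoding = stream_encoding(sys.stdin)
output_encoding = stream_encoding(sys.stdout)
```

With this approach, setting PYTHONIOENCODING before starting the interpreter changes what `sys.stdin.encoding` and `sys.stdout.encoding` report.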
|
|
msg139967 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2011-07-07 10:50 |
> Leaving open to discuss whether anything can/should be done
> for the case when reindent acts as an stdin

sys.stdin.buffer and sys.stdout.buffer should be used with tokenize.detect_encoding(). We could first read stdin into a BytesIO object so that we can rewind after detect_encoding. Something like:

    import io
    import sys
    from tokenize import detect_encoding

    content = sys.stdin.buffer.read()
    raw = io.BytesIO(content)
    buffer = io.BufferedReader(raw)
    encoding, _ = detect_encoding(buffer.readline)
    buffer.seek(0)
    text = io.TextIOWrapper(buffer, encoding)
    # use text |
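The snippet above covers only the reading side. As a sketch of the full round trip under the same idea, the hypothetical helper `filter_source` below takes raw bytes, detects the declared encoding, applies a text transformation, and re-encodes the result with the same encoding. The `transform` parameter is an assumption standing in for the reindenting step.

```python
import io
from tokenize import detect_encoding


def filter_source(data, transform):
    """Decode source bytes with their declared encoding, transform, re-encode.

    Hypothetical helper: `transform` is any str -> str callable.
    """
    # detect_encoding() reads at most two lines through the readline callable.
    encoding, _ = detect_encoding(io.BytesIO(data).readline)
    text = data.decode(encoding)
    return transform(text).encode(encoding)
```

In a filter this could be wired up as `sys.stdout.buffer.write(filter_source(sys.stdin.buffer.read(), transform))`, so that both ends stay in binary mode and the locale encoding is never involved.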
|
|
msg140001 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2011-07-07 23:25 |
reindent_coding.py: patch fixing reindent.py when using pipes (stdin and stdout). |
|
|
msg140003 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2011-07-07 23:43 |
This is a lot more code than what I’d have expected. What is your opinion on my previous message? |
|
|
msg140005 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2011-07-07 23:47 |
> When working as a filter, reindent should use sys.{stdin,stdout}.encoding
> (defaulting to sys.getdefaultencoding()) for reading and writing,
> respectively.

It just doesn't work: you cannot read an ISO-8859-1 file as UTF-8 (if your locale encoding is UTF-8). |
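A small illustration of the failure mode described above. The sample string is made up for the example: non-ASCII text encoded as ISO-8859-1 is generally not valid UTF-8, so decoding it with the wrong codec raises an error.

```python
# "café" in ISO-8859-1 ends with the single byte 0xE9.
data = "caf\u00e9".encode("iso-8859-1")

try:
    data.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    # 0xE9 starts a multi-byte UTF-8 sequence that is never completed.
    error = exc

# Decoding with the encoding the bytes were written in works.
text = data.decode("iso-8859-1")
```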
|
|
msg140021 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2011-07-08 11:19 |
Even with PYTHONIOENCODING? |
|
|
msg315607 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-04-22 11:15 |
I concur with Éric. Standard input and output are text streams in Python 3. The user can control their encoding by setting the locale or PYTHONIOENCODING. I think this issue can be closed now unless somebody wants to backport the fix to 2.7. |
|
|
msg377111 - (view) |
Author: Irit Katriel (iritkatriel) *  |
Date: 2020-09-18 12:04 |
Since there won't be a Python 2.7 backport, should this issue be closed? |
|
|
msg377114 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2020-09-18 12:58 |
> Committed in r85695. Leaving open to discuss whether anything can/should be done for the case when reindent acts as an stdin to stdout filter. Also, what is the policy on backporting Tools' bug fixes?

This is the commit:

commit 4a98e3b6d06e5477e5d62f18e85056cbb7253f98
Author: Alexander Belopolsky <alexander.belopolsky@gmail.com>
Date:   Mon Oct 18 14:43:38 2010 +0000

    Issue #10117: Tools/scripts/reindent.py now accepts source files that
    use encoding other than ASCII or UTF-8. Source encoding is preserved
    when reindented code is written to a file.

> Since there won't be a Python 2.7 backport, should this issue be closed?

Right, the 2.7 branch is closed. I'm closing the issue. |
|
|