[Python-Dev] [python] Re: New lines, carriage returns, and Windows
Terry Reedy tjreedy at udel.edu
Sat Sep 29 20:30:59 CEST 2007
"Michael Foord" <fuzzyman at voidspace.org.uk> wrote in message news:46FE6F92.40601 at voidspace.org.uk... | Guido van Rossum wrote:
[snip first part of nice summary of Python i/o model]
| > The other translation deals with line endings. Upon input, any of
| > \r\n, \r, or \n is translated to a single \n by default (this is nhe [sic]
| > "universal newlines" algorithm from Python 2.x). This can be tweaked
| > or disabled. Upon output, \n is translated into a platform specific
| > string chosen from \r\n, \r, or \n. This can also be disabled or
| > overridden. Note that \r, when written, is never treated specially; if
| > you want special processing for \r on output, you can write your own
| > translation layer.
| So the question is, that when a string containing '\r\n' is written to a
| file in text mode on a Windows platform, should it be written with the
| encoded representation of '\r\n' or '\r\r\n'?
I think Guido pretty clearly said that on output, the default behavior is that \r is nothing special. If you want a special case exception, write a special case translator. +1 from me.
To propose otherwise is to propose that the default semantic meaning of Python text objects depend on the platform that it might be output-translated for. I believe the point of universal newline support was to get away from this.
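For illustration, here is a minimal sketch of that default, using the `newline` argument of `io.TextIOWrapper` as it later shipped in Python 3. Passing `newline="\r\n"` explicitly stands in for a Windows platform so the result is the same everywhere; the helper name `written_bytes` is made up for this example.

```python
import io

def written_bytes(text, newline):
    # Write `text` through a text-mode wrapper and return the raw bytes.
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the underlying buffer alone
    return data

# Only "\n" is translated on output; an existing "\r" passes through as data.
print(written_bytes("a\r\nb\n", newline="\r\n"))  # b'a\r\r\nb\r\n'
```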
| Purity would dictate the latter and practicality the former (IMO)...
I disagree. Special case exceptions complicate both learnability and code readability and maintainability. Simplicity is practicality. The symmetry of 'platform-line-endings =input> \n =output> platform-line-endings' is both pure and practical.
| However, that would mean that round tripping a string would change it
| ('\r\n' would be written as '\r\n' and then read as '\n')
Whereas \r\r\n would be read back as \r\n, which is what should happen. Round-trip-ability is practical to me.
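The round-trip argument can be sketched with plain string replacement. This is a simplified model of the symmetric translation described above, assuming a platform line ending of \r\n; it is not the actual I/O implementation, and all three function names are hypothetical.

```python
import re

def to_platform(text, linesep="\r\n"):
    # Default output rule: only "\n" is translated; "\r" is ordinary data.
    return text.replace("\n", linesep)

def from_platform(data, linesep="\r\n"):
    # Matching input rule: each platform line ending becomes "\n".
    return data.replace(linesep, "\n")

def to_platform_special_case(text, linesep="\r\n"):
    # The proposed exception: a "\n" already preceded by "\r" is left alone.
    return re.sub(r"(?<!\r)\n", linesep, text)

text = "a\r\nb\n"
print(repr(to_platform(text)))                                    # 'a\r\r\nb\r\n'
print(from_platform(to_platform(text)) == text)                   # True
print(repr(from_platform(to_platform_special_case(text))))        # 'a\nb\n'
```

Under this model the default rule is invertible, while the special case silently drops the \r on the way back in.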
| - on the other
| hand (particularly given that we are treating the data as text and not a
| binary blob) I don't see how writing '\r\r\n' would ever actually be
| useful in text.
There are two normal ways for internal Python text to have \r\n:
- Read from a file with \r\r\n. Then \r\r\n is correct output (on the same platform).
- Intentionally put there by a programmer. If s/he also chooses default \n translation on output, \r<translation of \n> is correct.
That leaves
- Bugs due to ignorance or accident. These should be repaired.
- Other special situations, which can be handled by disabling, overriding, and layering the defaults. This seems enough flexibility to me.
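As a rough sketch of those escape hatches, again in terms of the `newline` argument that Python 3's `io.TextIOWrapper` ended up with: `newline=""` disables output translation, and a small wrapper can add special handling of \r before the default rule runs. The `CollapseCRLF` class and the `demo` helper are hypothetical, written only for this example.

```python
import io

class CollapseCRLF:
    # Hypothetical translation layer: fold an existing CR LF pair into a
    # bare LF before handing the text to the underlying stream.
    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        return self._stream.write(text.replace("\r\n", "\n"))

    def flush(self):
        self._stream.flush()

def demo(newline, layered=False):
    # Write a fixed sample string and return the bytes that reach the buffer.
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    target = CollapseCRLF(stream) if layered else stream
    target.write("a\r\nb\n")
    target.flush()
    data = raw.getvalue()
    stream.detach()
    return data

print(demo(newline=""))                    # b'a\r\nb\n'      translation disabled
print(demo(newline="\r\n"))                # b'a\r\r\nb\r\n'  default rule, CRLF platform
print(demo(newline="\r\n", layered=True))  # b'a\r\nb\r\n'    custom layer on top
```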
Terry Jan Reedy