[Python-Dev] Where are universal newlines handled in the parser/compiler?

Benjamin Peterson musiccomposition at gmail.com
Sun Aug 17 03:39:50 CEST 2008


On Sat, Aug 16, 2008 at 8:34 PM, Brett Cannon <brett at python.org> wrote:

If you import a module that uses '\r\n' line endings, Python does the right thing. But if you read in the bytes for the same file and then pass them to compile(), you get an unhelpful syntax error pointing at a blank line.

Normally I would say one should just open the source file as 'r' instead of 'rb', but with source code that does not work well, as there can be a source encoding set. Lib/test/test_pep263.py is the perfect example of this: Windows newlines with a koi8-r encoding. What I would like to do is get compile() to work properly with a bytes stream, just as if Python itself were handling the compilation through import and from a file directly. But before I try to dig into the parser to figure out where the translation of newlines occurs (or where the translation option is set), I thought I would ask to see if anyone just happened to know. (I have already spent a few hours figuring out why Latin-1 encodings were not working with compile(), so I don't want to go diving into the maze of function calls in the parser again.)
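For illustration, one way to get text-mode newline handling while still honoring a PEP 263 coding cookie is to detect the encoding from the bytes first and then decode with universal newlines. This is only a sketch for a current Python 3; decode_source_bytes is a hypothetical helper name, not something from this thread, and it says nothing about where the tokenizer itself performs the translation.

```python
import io
import tokenize


def decode_source_bytes(source: bytes) -> str:
    """Decode source bytes using their declared encoding and '\n' newlines.

    Hypothetical helper for illustration only.
    """
    # detect_encoding() looks for a BOM or a PEP 263 coding cookie in the
    # first two lines and falls back to UTF-8 if neither is present.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
    # newline=None turns on universal newlines: '\r\n' and '\r' become '\n'.
    reader = io.TextIOWrapper(io.BytesIO(source), encoding=encoding, newline=None)
    return reader.read()


# Windows newlines plus a koi8-r cookie, similar in spirit to test_pep263.py.
raw = "# -*- coding: koi8-r -*-\r\ns = '\u041f\u0438\u0442\u043e\u043d'\r\n".encode("koi8-r")
code = compile(decode_source_bytes(raw), "<example>", "exec")
```

Because the result is a str, compile() no longer needs to interpret the coding cookie itself; the cost is that the decoding step is duplicated outside the parser.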

Have a look at tok_nextc in Parser/tokenizer.c.
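As a stopgap that does not depend on the tokenizer internals, the bytes can also be normalized before they reach compile(). A rough sketch, assuming an ASCII-compatible source encoding (so the bytes 0x0D and 0x0A only ever mean CR and LF); compile_bytes is a made-up helper name, not an existing API:

```python
def compile_bytes(source: bytes, filename: str = "<bytes>", mode: str = "exec"):
    """Compile source bytes after rewriting '\r\n' and bare '\r' as '\n'.

    Hypothetical helper; assumes an ASCII-compatible source encoding.
    """
    # Replace CRLF first so that a CRLF pair does not turn into two newlines.
    normalized = source.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # The bytes are passed through unchanged otherwise, so compile() still
    # sees any coding cookie and does the decoding itself.
    return compile(normalized, filename, mode)


source = b"# -*- coding: koi8-r -*-\r\nx = 1\r\ny = x + 1\r\n"
namespace = {}
exec(compile_bytes(source), namespace)
```

This keeps the decoding inside compile(), at the price of touching the raw bytes by hand.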

-Brett



--
Cheers,
Benjamin Peterson
"There's no place like 127.0.0.1."


