msg8674 - (view) |
Author: David Bolen (db3l) * |
Date: 2002-01-10 04:01 |
If you have a module that you wish to compile using the builtin compile() function (in 'exec' mode), it will fail with a SyntaxError if that module does not have a newline as its final token. The same module can be executed directly by the interpreter, or imported by another module, and Python can properly compile and save a pyc for the module. I believe the difference is rooted in the fact that the tokenizer (tokenizer.c, in tok_nextc()) will "fake" a newline at the end of a file if it doesn't find one, but it will not do so when tokenizing a string buffer. What I'm not certain of is whether faking such a token for strings as well won't break something else (such as when parsing a string for an expression rather than a full module). But without such a change, you have a state where a module that works (and compiles) in other circumstances cannot be read into memory and compiled with the compile() builtin. This came up while tracking down a problem with failures using Gordon McMillan's Installer package which compiles modules using compile() before including them in the archive. I believe this is true for all releases since at least 1.5.2. -- David |
|
|
msg8675 - (view) |
Author: Neil Schemenauer (nascheme) *  |
Date: 2002-03-22 22:07 |
Logged In: YES user_id=35752 I ran into this bug myself when writing the PTL compiler. Here's a test case:

code = "def foo():\n pass"
open("bug.py", "w").write(code)
import bug # works
compile(code, "", "exec") # doesn't work

I traced this bug to tok_nextc. If the input is coming from a file and the last bit of input doesn't end with a newline then one is faked. This doesn't happen if the input is coming from a string. I spent time trying to figure out how to fix it but the tok_nextc code is hairy and whatever I tried broke something else. |
|
|
msg8677 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2002-03-22 22:20 |
Logged In: YES user_id=6380
> the tok_nextc code is hairy and whatever
> I tried broke something else.
That's exactly what happened to me when I tried to fix this myself long ago. :-( The workaround is simple enough: whoever calls compile() should append a newline to the string just to be sure. |
|
|
msg8678 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2002-03-22 22:41 |
Logged In: YES user_id=31435 Would it make sense for builtin_compile() to append a newline itself (say, if one weren't already present)? |
|
|
msg8679 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2002-03-22 22:46 |
Logged In: YES user_id=6380 Probably, unless the start symbol is "expr" (which doesn't need a newline). But it would mean copying a potentially huge string -- we can't append the \n in place. |
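The workaround discussed above can be sketched on the caller's side as follows. This is a minimal illustration, not what builtin_compile() does: `ensure_trailing_newline` is a hypothetical helper name, and the rule of skipping 'eval' mode is taken from the remark that an expression does not need the newline. Because Python strings are immutable, the concatenation builds a new string, which is the copy mentioned in the message.

```python
def ensure_trailing_newline(source, mode="exec"):
    """Hypothetical helper: return source terminated by a newline.

    Expressions compiled in 'eval' mode are returned unchanged,
    since they do not need the terminating newline.
    """
    if mode != "eval" and not source.endswith("\n"):
        # str is immutable, so this creates a copy of the whole source.
        return source + "\n"
    return source


# A module-style source string with no final newline.
code = "def foo():\n    pass"
namespace = {}
exec(compile(ensure_trailing_newline(code), "<string>", "exec"), namespace)

# An expression is passed through untouched.
expr = ensure_trailing_newline("1 + 2", "eval")
value = eval(compile(expr, "<string>", "eval"))
```

On an interpreter that already accepts unterminated strings, the extra newline should be harmless, so the helper can stay in place across versions.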
|
|
msg8680 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2002-03-22 23:01 |
Logged In: YES user_id=31435 Well, the user can't append an '\n' in place either. The question is whether we do that for them, or let it blow up. OTOH, codeop.py has a lot of fun now trying to compile as-is, then with one '\n' tacked on, then with two of 'em. It would take me a long time to figure out exactly why it's doing all that, and to guess exactly how it would break. |
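For readers unfamiliar with codeop, the sketch below shows only its public behaviour: `codeop.compile_command()` returns a code object for complete input, `None` for input that could still be continued, and raises for input that can never be valid. The `classify` wrapper is a hypothetical name used for illustration, and the example makes no claim about how a given Python version implements the check internally (the retry-with-newlines strategy described in the message may not match current releases).

```python
import codeop


def classify(source):
    """Hypothetical wrapper: label interactive input by its completeness."""
    try:
        code = codeop.compile_command(source, "<input>", "single")
    except (SyntaxError, ValueError, OverflowError):
        return "invalid"
    # None means "might become valid if more lines are entered".
    return "incomplete" if code is None else "complete"


print(classify("x = 1"))        # a full statement
print(classify("def foo():"))   # a block header still waiting for its body
print(classify("a b"))          # can never be valid
```

This distinction between "incomplete" and "invalid" is what an interactive prompt needs, and it is why codeop is sensitive to how trailing newlines are treated.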
|
|
msg8681 - (view) |
Author: David Bolen (db3l) * |
Date: 2002-03-22 23:06 |
Logged In: YES user_id=53196 If compile() is being used in exec mode with a non-terminated multi-line string, it's not going to work unless the application generates that copy itself in any event. So without an interpreter fix, I'd think the string copy is inevitable, and it might simplify things to have the builtin function take care of it. It's something easy to overlook at the application level and could thus be fixed in one place rather than at each point of use. On the other hand, I also noticed something I overlooked when first encountering the problem - the 2.2 docs added some text to compile() talking about this need for termination. So it could be argued that it's now a documented restriction, and should the newline append (with any requisite string duplication) be needed, it leaves it to the individual applications rather than forcing it in the builtin. Not to mention a documentation solution could thus be declared already done. |
|
|
msg8682 - (view) |
Author: Neil Schemenauer (nascheme) *  |
Date: 2002-03-22 23:14 |
Logged In: YES user_id=35752 I'm +1 on builtin_compile adding the newline. It's the lazy way out and it's better than every person hacking with the parser stumbling into it and coming up with their own work around. Guido? |
|
|
msg8683 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2002-03-23 00:35 |
Logged In: YES user_id=6380 Hm, adding it to builtin_compile isn't enough. We'd have to add it to exec as well. I think the lexer and/or parser should take care of this -- just as it should take care of accepting \r as well as \n as well as \r\n. Yes, it's hard to find. But there's got to be a way. |
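Until the lexer handles all line-ending styles itself, a caller can normalize them before compiling. The following is a minimal sketch under that assumption; `normalize_newlines` is a hypothetical helper, and it rewrites every \r it finds, including any raw carriage returns that happen to sit inside string literals in the source.

```python
def normalize_newlines(source):
    """Hypothetical helper: convert \\r\\n and bare \\r line endings to \\n."""
    # Replace \r\n first so it is not turned into two newlines.
    return source.replace("\r\n", "\n").replace("\r", "\n")


# Source text mixing Windows (\r\n) and old Mac (\r) line endings.
mixed = "a = 1\r\nb = 2\rc = a + b"
namespace = {}
exec(compile(normalize_newlines(mixed), "<string>", "exec"), namespace)
```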
|
|
msg8684 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2003-02-13 22:21 |
Logged In: YES user_id=6380 Fixed. This was simpler than I thought (only 4 lines in parsetok.c) but also harder (codeop.py depends on the broken behavior!). Most of the changes checked in had to do with adding a flag to enable the old behavior. :-( I note that Tim's second comment is visionary in this respect. |
|
|
msg8685 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2003-02-21 07:43 |
Logged In: YES user_id=80475 Backport? |
|
|
msg8686 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2003-02-21 12:43 |
Logged In: YES user_id=6380 No, too much code (have you seen the number of associated checkins?) |
|
|