Issue 11343: Make errors due to full parser stack identifiable (original) (raw)

Created on 2011-02-26 22:39 by Trundle, last changed 2022-04-11 14:57 by admin.

Messages (29)

msg129604 - (view)

Author: Andreas Stührk (Trundle) *

Date: 2011-02-26 22:39

Currently, if the parser's internal stack is full (as it happens when the parser tries to parse a deeply nested structure), the parser writes an error message to stderr and a bare MemoryError is raised. That way, it is really hard for REPLs do handle input that causes the parser to exhaust its stack correctly (see e.g. python -c "print '(' * 100 + ')' * 100" | python -c "import code; code.interact()").

In my opinion, a SyntaxError would be great because that way one can easily see where in the source code the error is caused. One reason against that (stated by MvL in ) is that it's really syntactically correct Python and only the parser can't parse it. But then, the same is also true when there are too many indentation levels in which case an IndentationError is raised (which is a subclass of SyntaxError).

I have attached two patches: The first still raises a MemoryError (for the sake of backward compatibility) but adds a message ("too many opening parens") to it. The second raises a SyntaxError instead.

msg130076 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2011-03-04 21:30

I agree that compile-time stack exhaustion is different from runtime object-heap exhaustion and could/should have a different error.

I agree with Martin (from 2000) that SyntaxError is not right either. Perhaps a new ParseError subclass thereof. I believe IndentationError was added since 2000. Its doc is "Base class for syntax errors related to incorrect indentation." and it has TabError as a subclass, so its use for too many indents (not really 'incorrect') is not obvious. But having the position marked (if it would be) would be a plus.

I presume REPL == read-eval-print-loop (from Google). Would a new error help such programs (like code.interact, or IDLE)?

msg132063 - (view)

Author: Andreas Stührk (Trundle) *

Date: 2011-03-25 01:05

On Fri, Mar 4, 2011 at 9:30 PM, Terry J. Reedy <report@bugs.python.org> wrote:

I agree with Martin (from 2000) that SyntaxError is not right either. Perhaps a new ParseError subclass thereof.

I added a new ParserError that inherits from SyntaxError.

I presume REPL == read-eval-print-loop (from Google). Would a new error help such programs (like code.interact, or IDLE)?

Yes, I meant a read-eval-print-loop. A new error would help and if it's a subclass of SyntaxError, they most likely need no changes at all to handle that new error, as they have to deal with SyntaxErrors anyway.

msg134778 - (view)

Author: Andreas Stührk (Trundle) *

Date: 2011-04-29 16:04

FWIW, this also affects ast.literal_eval().

msg140456 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2011-07-15 18:53

This looks like a rather good idea, although I'm not sure it deserves a new global exception type.

msg140469 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-15 19:47

Just make it a SyntaxError.

msg140470 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-15 19:48

Oh, I see Martin disagrees. SyntaxError is raised for anything which Python won't compile. For example, too many arguments gets a SyntaxError.

msg140479 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-16 07:56

+1 for a new exception type to indicate "this may technically be legal Python, but this Python implementation cannot handle it correctly"

Whatever exception type we add, it would be nice to also be able to use it if someone eventually fixes the compiler recursion crasher, so I'd like to paint this particular bikeshed as the more general "SyntaxLimitError".

Possible docs phrasing:

exception:: SyntaxLimitError

Raised when the compilation and code generation process encounters an error due to internal limitations of the current implementation (e.g. the CPython parser has an upper limit on the number of nested sets of parentheses it can handle in a single expression, but no such limitation exists in the Python language reference). This is a subclass of :exc:SyntaxError.

msg140480 - (view)

Author: Georg Brandl (georg.brandl) * (Python committer)

Date: 2011-07-16 08:24

I like SyntaxLimitError much better than ParserError.

msg140496 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-16 14:43

2011/7/16 Nick Coghlan <report@bugs.python.org>:

Nick Coghlan <ncoghlan@gmail.com> added the comment:

+1 for a new exception type to indicate "this may technically be legal Python, but this Python implementation cannot handle it correctly"

Whatever exception type we add, it would be nice to also be able to use it if someone eventually fixes the compiler recursion crasher, so I'd like to paint this particular bikeshed as the more general "SyntaxLimitError".

What is the point of a new exception? When would you ever want to catch it as opposed to a regular SyntaxError? You're going to have to change the code either way.

Moreover, all Python implementations are going to have to place some limits on stack depth and recursion, so it's not really an implementation detail.

The Python language doesn't make an mention of limited memory, but no one is going to suggest a MemoryLimitError, which is an "implementation detail".

msg140498 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-16 15:10

It's important to remember that other implementations treat CPython as the "gold standard" for compatibility purposes. If we declare something to be an ordinary SyntaxError, then that carries strong implications for what other implementations should do.

Some kinds of errors are inherently implementation specific (MemoryError and SystemError spring to mind). No sane implementor is going to try to match CPython like-for-like when it comes to those. SyntaxError, however, is typically defined by the language definition, not the CPython implementation of it. By introducing a new exception type, we're explicitly telling other implementations "Look, this is our problem that we don't plan to fix as it doesn't typically arise in real code, but please don't deliberately cripple your own implementation just to match this behaviour". I'm strongly with MvL on this one - SyntaxError itself should never be raised for legal Python code just because the CPython bytecode generation toolchain isn't able to handle it properly.

msg140499 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-16 15:16

2011/7/16 Nick Coghlan <report@bugs.python.org>:

Nick Coghlan <ncoghlan@gmail.com> added the comment:

It's important to remember that other implementations treat CPython as the "gold standard" for compatibility purposes. If we declare something to be an ordinary SyntaxError, then that carries strong implications for what other implementations should do.

Some kinds of errors are inherently implementation specific (MemoryError and SystemError spring to mind). No sane implementor is going to try to match CPython like-for-like when it comes to those. SyntaxError, however, is typically defined by the language definition, not the CPython implementation of it. By introducing a new exception type, we're explicitly telling other implementations "Look, this is our problem that we don't plan to fix as it doesn't typically arise in real code, but please don't deliberately cripple your own implementation just to match this behaviour". I'm strongly with MvL on this one - SyntaxError itself should never be raised for legal Python code just because the CPython bytecode generation toolchain isn't able to handle it properly.

Implementations are quite good at figuring this stuff out themselves. We don't need a whole new exception (new docs, one more thing in the builtin namespace) to "send a message" to other implementations. If some implementation is not part of the language, then its test just just be marked cpython_only. It shouldn't be brought into public API and the language.

msg140500 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-16 15:22

It also makes it clear to users whether they've just run up against a limitation of the implementation they're using or whether what they've written is genuinely illegal code. They are NOT the same thing. Attempting to conflate them into one exception for the mere sake of not adding a new exception type is a false economy. Take it to python-dev if you want, but I will strongly oppose this check going in with it raising an ordinary SyntaxError instead of a new subclass.

msg140501 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-16 15:33

2011/7/16 Nick Coghlan <report@bugs.python.org>:

Nick Coghlan <ncoghlan@gmail.com> added the comment:

It also makes it clear to users whether they've just run up against a limitation of the implementation they're using or whether what they've written is genuinely illegal code. They are NOT the same thing. Attempting to conflate them into one exception for the mere sake of not adding a new exception type is a false economy. Take it to python-dev if you want, but I will strongly oppose this check going in with it raising an ordinary SyntaxError instead of a new subclass.

Why would it matter to users what variety of limitation they're running up against? As I said, they're still going to have to change their code.

msg140502 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-16 15:53

It matters, because Python "users" are programmers, and most programmers want to know why they're being told something is wrong. Raising MemoryError is technically incorrect at this point, but at least gives the right flavour of the user not doing anything specifically wrong, the program just ran out of resources trying to do as they asked. I see SyntaxError as significantly worse, however, as instead of telling the developer "sorry, that's a reasonable request, but I can't handle it", the interpreter is now telling them "you did that wrong, change it". The semantics of that are all wrong. There's nothing wrong with the syntax of the user's code, it's just that the compiler can't handle it.

Similarly, the eventual fix to the recursive crash in the compiler itself is likely to be an arbitrary limit on expression nesting. Again, nothing wrong with their syntax, but they'll need to adjust their (legal!) code to work around implementation limitations.

msg140504 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2011-07-16 16:50

2011/7/16 Nick Coghlan <report@bugs.python.org>:

Nick Coghlan <ncoghlan@gmail.com> added the comment:

It matters, because Python "users" are programmers, and most programmers want to know why they're being told something is wrong. Raising MemoryError is technically incorrect at this point, but at least gives the right flavour of the user not doing anything specifically wrong, the program just ran out of resources trying to do as they asked. I see SyntaxError as significantly worse, however, as instead of telling the developer "sorry, that's a reasonable request, but I can't handle it", the interpreter is now telling them "you did that wrong, change it". The semantics of that are all wrong. There's nothing wrong with the syntax of the user's code, it's just that the compiler can't handle it.

I content that in normal code, it is so extremely rare as to be unheard of, to get exceptions about the parser stack overflowing or segfault the compiler by too deep nesting. People who are doing this (generally to prove the point about limitations of the compiler) are smart enough to know exactly what "parser stack overflowed" means and separate that from Python as the language. Adding a new exception is solving a non-existent problem.

msg140506 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2011-07-16 22:28

I content that in normal code, it is so extremely rare as to be unheard of, to get exceptions about the parser stack overflowing or segfault the compiler by too deep nesting. People who are doing this (generally to prove the point about limitations of the compiler) are smart enough to know exactly what "parser stack overflowed" means and separate that from Python as the language. Adding a new exception is solving a non-existent problem.

Agreed with Benjamin. Also, "why something is wrong" can simply be told in the exception message.

msg140507 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2011-07-16 23:13

This is a purity versus practicality argument. I agree with both sides :)

msg140517 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-17 06:32

After further reflection, I (eventually) came to the same conclusion as Benjamin and Antoine - using SyntaxError for this is fine.

However, the exception message should make it clear that this is an implementation limit rather than a language limit.

msg140631 - (view)

Author: Senthil Kumaran (orsenthil) * (Python committer)

Date: 2011-07-18 23:28

Oh, I thought we never rely on exception "message" for anything important. However this seems to be an exception for that exception. :-)

msg140635 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2011-07-18 23:34

Oh, I thought we never rely on exception "message" for anything important. However this seems to be an exception for that exception. :-)

I think you're missing the point. People usually don't catch SyntaxError and run fallback code in this case. And even if they do, they are unlikely to care about whether the ErreurDeSyntaxe is due to a genuinely faulty piece of code, or a parser limitation ;)

msg140646 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2011-07-19 02:11

Yeah, at the level of code the origin doesn't really matter, so re-using the exception type is actually OK. However, for people seeing the stack trace, it may be useful to know what's genuinely illegal syntax and what's a limitation of the current implementation. The exception message is a good place for that additional detail.

msg396883 - (view)

Author: Irit Katriel (iritkatriel) * (Python committer)

Date: 2021-07-02 23:10

The patch relates to the old parser.

With the new parser the 100*(+100*) example works. If we go to 1000 instead of 100 we get "SyntaxError: too many nested parentheses".

From the discussion it seems that the idea of a new exception type for implementation-limits vs "real" SyntaxErrors was rejected as not useful.

Is there anything left to do on this issue?

msg397125 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2021-07-08 03:41

I retract my original comment as I now agree with SyntaxError + distinct message.

Given Irit's report, I think the remaining question is whether

A. The implication that 'too many nested parentheses' is compiler specific is strong enough, and any thing further should be left to 3rd parties to explain, and this issue should be closed as 'out of date'.

B. We should be explicit and add something like 'for this compiler'.

Since Pablo has been working on improving error messages, I would give his opinion high weight.

To find the actual limit, I did the following with Win 10, 3.10.0b3, IDLE:

def f(n): ... print(eval('('*n + ')'*n")) ... And after some trials, found f(200) () f(201) Traceback (most recent call last): File "<pyshell#56>", line 1, in f(201) File "<pyshell#42>", line 2, in f print(eval('('*n + ')'*n)) File "", line 1 ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((())))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))) ^ SyntaxError: too many nested parentheses

The error is detected in 'string' line 1, 1-based offset 201, at the final '('. There are 200 spaces before the caret but they are not obvious in this box, which does not wrap spaces properly. The end_lineno and end_offset are None, None rather than the 1,202 I expected.

msg397133 - (view)

Author: Andre Roberge (aroberge) *

Date: 2021-07-08 09:45

For information: I created an actual .py file from the last example with 200 parens and got the following error message:

python example.py s_push: parser stack overflow MemoryError

For consistency and greater clarity, I think that having messages such as

and possibly others (too many arguments?), would be preferable, with the possibility of simply having

if the exact cause cannot be identified at the parsing stage.

msg397151 - (view)

Author: Pablo Galindo Salgado (pablogsal) * (Python committer)

Date: 2021-07-08 16:38

python example.py s_push: parser stack overflow MemoryError

That error can only happen with the old parser, so you must be using Python3.8 or 3.9 with the old parser. Can you confirm?

msg397152 - (view)

Author: Andre Roberge (aroberge) *

Date: 2021-07-08 16:43

@pablo: Sincere apologies ... I tested this with the wrong virtual environment active locally. With 3.10, it is indeed SyntaxError: too many nested parentheses

msg397153 - (view)

Author: Pablo Galindo Salgado (pablogsal) * (Python committer)

Date: 2021-07-08 16:43

As a piece of extra information, this limit has been there for a while:

https://github.com/python/cpython/blob/91a8f8c16ca9a7e2466a8241d9b41769ef97d094/Parser/tokenizer.h#L14

the problem is that the old parser died before reaching it in many situations.

msg397154 - (view)

Author: Pablo Galindo Salgado (pablogsal) * (Python committer)

Date: 2021-07-08 16:46

Regarding what to do, what I would love to optimize is that the message is clear. One of the challenges people may have with too specific messages is that they may not know what "a compiler" or "a parser" means (the first may be especially confusing since many people think Python is not "compiled".

On the other hand, is more informative to say "too many nested parentheses" than "parser stack overflow" since many people don't know what that means, especially in Python.

I

History

Date

User

Action

Args

2022-04-11 14:57:13

admin

set

github: 55552

2021-07-08 16:46:57

pablogsal

set

messages: +

2021-07-08 16:43:38

pablogsal

set

messages: +

2021-07-08 16:43:04

aroberge

set

messages: +

2021-07-08 16:38:49

pablogsal

set

messages: +

2021-07-08 09:45:22

aroberge

set

messages: +

2021-07-08 03:41:02

terry.reedy

set

versions: + Python 3.11, - Python 3.4
nosy: + aroberge, pablogsal

messages: +

stage: patch review -> needs patch

2021-07-02 23:10:59

iritkatriel

set

nosy: + iritkatriel
messages: +

2013-06-30 05:58:25

terry.reedy

set

versions: + Python 3.4, - Python 3.3

2011-07-19 02:11:21

ncoghlan

set

messages: +

2011-07-18 23:34:38

pitrou

set

messages: +

2011-07-18 23:28:02

orsenthil

set

nosy: + orsenthil
messages: +

2011-07-17 06:32:24

ncoghlan

set

messages: +

2011-07-16 23:13:20

r.david.murray

set

nosy: + r.david.murray
messages: +

2011-07-16 22:28:55

pitrou

set

messages: +

2011-07-16 16:50:34

benjamin.peterson

set

messages: +

2011-07-16 15:53:44

ncoghlan

set

messages: +

2011-07-16 15:52:27

Arfrever

set

nosy: + Arfrever

2011-07-16 15:33:25

benjamin.peterson

set

messages: +

2011-07-16 15:22:45

ncoghlan

set

messages: +

2011-07-16 15:16:04

benjamin.peterson

set

messages: +

2011-07-16 15:10:41

ncoghlan

set

messages: +

2011-07-16 14:43:34

benjamin.peterson

set

messages: +

2011-07-16 08:24:36

georg.brandl

set

nosy: + georg.brandl
messages: +

2011-07-16 07:56:41

ncoghlan

set

messages: +

2011-07-15 20:38:39

eric.snow

set

nosy: + eric.snow

2011-07-15 19:48:27

benjamin.peterson

set

messages: +

2011-07-15 19:47:05

benjamin.peterson

set

messages: +

2011-07-15 18:53:11

pitrou

set

nosy: + pitrou, benjamin.peterson, ncoghlan

messages: +
stage: patch review

2011-04-29 16:04:17

Trundle

set

messages: +

2011-03-25 01:07:09

Trundle

set

files: - parser_nested_MemoryError.patch

2011-03-25 01:07:03

Trundle

set

files: - parser_nested_SyntaxError.patch

2011-03-25 01:05:27

Trundle

set

files: + issue11343.patch

messages: +

2011-03-04 21:30:10

terry.reedy

set

nosy: + loewis, terry.reedy
messages: +

2011-02-26 22:40:29

Trundle

set

files: + parser_nested_SyntaxError.patch
nosy:marienz, Trundle

2011-02-26 22:39:50

Trundle

create