[Python-3000] Invalid \U escape in source code give hard-to-trace error
Kurt B. Kaiser kbk at shore.net
Wed Jul 18 08:04:13 CEST 2007
"Guido van Rossum" <guido at python.org> writes:
When a source file contains a string literal with an out-of-range \U escape (e.g. "\U12345678"), instead of a syntax error pointing to the offending literal, I get this, without any indication of the file or line:
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 0-9: illegal Unicode character
This is quite hard to track down. (Both the location of the bad literal in the source file, and the origin of the error in the parser. :-) Can someone come up with a fix? I note that raw escapes show a slightly different error. I also note that the same issue exists for u"..." literals in Python 2.5.
For what it's worth, I posted a patch to ast.c against the 2.6 trunk which massages the unicode exception into a SyntaxError showing the location.
That approach lets unicodeobject.c handle the gory details while ast.c handles the SyntaxError generation. It might serve as a stopgap until something deeper, along the lines of Martin's thoughts, is developed.
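To illustrate the idea only (this is not the ast.c patch itself, which is C code against the interpreter internals), here is a rough Python sketch of the same division of labour: the codec does the decoding, and a thin wrapper turns its exception into a SyntaxError that carries the location. The helper name decode_literal and its parameters are hypothetical, and the sketch assumes the literal body is plain ASCII.

```python
def decode_literal(body, filename, lineno, offset, line):
    """Decode backslash escapes in a string literal body.

    Hypothetical helper for illustration: 'body' is the text between the
    quotes, and the remaining arguments describe where the literal sits
    in the source file.
    """
    try:
        # Let the codec handle the gory details of escape processing.
        # Assumption: the body is pure ASCII.
        return body.encode("ascii").decode("unicode_escape")
    except UnicodeDecodeError as exc:
        # Massage the codec error into a SyntaxError that records the
        # file, line, and column of the offending literal.
        msg = "(unicode error) %s" % exc
        raise SyntaxError(msg, (filename, lineno, offset, line)) from None


if __name__ == "__main__":
    try:
        decode_literal(r"\U12345678", "example.py", 3, 5, 's = "\\U12345678"\n')
    except SyntaxError as err:
        # The error now names the file and line of the bad literal.
        print(err.filename, err.lineno, err.msg)
```

The wrapper adds no decoding logic of its own. It only attaches the source position that the bare UnicodeDecodeError lacks.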
I don't think that any reference adjustments are needed, but someone should check the patch.
-- KBK