Issue 1447633: "reindent.py" exposes bug in tokenize
I use up-to-date Debian unstable (i386 port) on a PC with an AMD Athlon64 3500+ chip. I compile my own copy of Python, which I keep in /usr/local.
Here is a small Python program called "fixnames.py":
#! /usr/bin/env python
"""Rename files that contain unpleasant characters.

Modify this code as needed. """

import os, sys, optparse

usage = 'USAGE: ./fixnames.py [-h] '
parser = optparse.OptionParser(usage=usage)
options, args = parser.parse_args()
if len(args) != 1:
    parser.print_help()
    sys.exit('an argument is required'))

# The input is a list of files to be renamed.
for name in open(args[0]), 'r'):
    # Modify these as needed.
    newname = name.replace(' ', '')
    newname = newname.replace('@', 'at')
    newname = newname.replace('%20', '')
    newname = newname.replace("'", '')
    os.rename(name, newname)
If I run
python /usr/local/src/Python-2.4.2/Tools/scripts/reindent.py fixnames.py
I get

Traceback (most recent call last):
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 293, in ?
    main()
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 83, in main
    check(arg)
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 108, in check
    if r.run():
  File "/usr/local/src/Python-2.4.2/Tools/scripts/reindent.py", line 166, in run
    tokenize.tokenize(self.getline, self.tokeneater)
  File "/usr/local/lib/python2.4/tokenize.py", line 153, in tokenize
    tokenize_loop(readline, tokeneater)
  File "/usr/local/lib/python2.4/tokenize.py", line 159, in tokenize_loop
    for token_info in generate_tokens(readline):
  File "/usr/local/lib/python2.4/tokenize.py", line 236, in generate_tokens
    raise TokenError, ("EOF in multi-line statement", (lnum, 0))
tokenize.TokenError: ('EOF in multi-line statement', (24, 0))
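For reference, reindent.py is only relaying what tokenize raises. A hedged minimal reproduction, using the Python 2.4-era tokenize.tokenize(readline, tokeneater) API shown in the traceback and a made-up no-op token eater, triggers the same error without reindent.py:

import tokenize

def eat(*args):
    # No-op token eater; we only care whether tokenize finishes.
    pass

f = open('fixnames.py')
try:
    tokenize.tokenize(f.readline, eat)   # raises tokenize.TokenError at EOF
finally:
    f.close()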
Logged In: YES user_id=31435
What do you think the bug is? That is, what did you expect to happen? tokenize.py isn't a syntax checker, so this looks like a case of garbage-in, garbage-out to me. There are two lines in the sample program that contain a right parenthesis that shouldn't be there, and if those syntax errors are repaired then tokenize.py is happy with the program. As is, because of the unbalanced parentheses, the net paren level isn't 0 when tokenize reaches the end of the file, so something is wrong with the file, and "EOF in multi-line statement" is just tokenize's heuristic guess at the most likely cause.
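A rough sketch of the paren-level bookkeeping described above (illustrative only, not tokenize's actual code; the helper name is made up, and it ignores brackets inside strings and comments):

def net_paren_level(source):
    """Return the net '([{' vs ')]}' nesting level of source.

    tokenize keeps a similar counter; if it isn't back to 0 when the
    file ends, the last statement appears to continue past EOF and
    "EOF in multi-line statement" is reported.
    """
    level = 0
    for ch in source:
        if ch in '([{':
            level += 1
        elif ch in ')]}':
            level -= 1
    return level

print(net_paren_level("sys.exit('an argument is required'))"))  # -1: stray ')'
print(net_paren_level("sys.exit('an argument is required')"))   # 0: balanced

In fixnames.py the two stray closing parentheses leave that counter below zero by the end of the file, so the level is nonzero at EOF and the heuristic message fires.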