[Python-Dev] Small tweak to tokenize.py?
Guido van Rossum guido at python.org
Sat Dec 2 19:06:50 CET 2006
On 12/2/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
Guido van Rossum wrote:
>> it would be a good thing if it could, optionally, be made to report
>> horizontal whitespace as well.
>
> It's remarkably easy to get this out of the existing API

sure, but it would be even easier if I didn't have to write that code
myself (last time I did that, I needed a couple of tries before the
parser handled all cases correctly...).

but maybe this could simply be handled by a helper generator in the
tokenizer module, that simply wraps the standard tokenizer generator
and inserts whitespace tokens where necessary?
A helper sounds like a promising idea. Anyone interested in volunteering a patch?
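[A minimal sketch of what such a helper might look like, assuming Python 3's
tokenize.generate_tokens(); the tokens_with_whitespace name and the 'WS'
marker are made up for illustration and are not part of tokenize or of the
patch under discussion:]

import io
import tokenize

WS = 'WS'  # hypothetical marker for synthetic whitespace entries

def tokens_with_whitespace(readline):
    """Wrap tokenize.generate_tokens() and yield a (WS, text, start, end,
    line) entry for any gap between consecutive tokens on the same line."""
    prev_end = (1, 0)
    for tok in tokenize.generate_tokens(readline):
        tok_type, tok_string, start, end, line = tok
        if start[0] == prev_end[0] and start[1] > prev_end[1]:
            # Same physical line: the gap is horizontal whitespace, sliced
            # straight out of the line text by column position.
            yield (WS, line[prev_end[1]:start[1]], prev_end, start, line)
        # Cross-line gaps (e.g. leading indentation that has no INDENT
        # token) are not handled in this simplified sketch.
        yield tok
        prev_end = end

src = "x = 1  +  2\n"
for tok in tokens_with_whitespace(io.StringIO(src).readline):
    kind = tokenize.tok_name.get(tok[0], tok[0])   # WS entries fall through
    print(kind, repr(tok[1]))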
> keep track of the end position returned by the previous call, and if
> it's different from the start position returned by the next call,
> slice the line text from the column positions, assuming the line
> numbers are the same. If the line numbers differ, something has been
> eating \n tokens; this shouldn't happen any more with my patch.
you'll still have to deal with multiline strings, right?
No, they are returned as a single token whose start and end positions correctly reflect the line/column of the beginning and end of the token. My current code (based on the second patch I gave in this thread and the algorithm described above) doesn't have to special-case anything except the ENDMARKER token (to break out of its loop :-).
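[For illustration, not part of the thread: a quick check with Python 3's
tokenize module showing a triple-quoted string coming back as one STRING
token whose start/end positions span the whole literal, which is why no
special case is needed:]

import io
import tokenize

src = 'x = """first\nsecond"""\ny = 1\n'
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    # Print the token name with its (row, col) start and end positions.
    print(tokenize.tok_name[tok.type], tok.start, tok.end, repr(tok.string))

[This prints the STRING token with start (1, 4) and end (2, 9), so a
whitespace-filling wrapper only ever has to compare positions.]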
--
--Guido van Rossum (home page: http://www.python.org/~guido/)