
We've been talking this week about ideas for speeding up the parsing of longs coming out of files or off the network.  The use case is a large string with embedded longs that need to be parsed into real longs.  One approach would be to use a simple slice:

long(mystring[x:y])

That is an expensive operation in a tight loop, since every slice allocates a temporary string.  The proposed solution is to add further keyword arguments to long(), such as:

long(mystring, base=10, start=x, end=y)

The start/end would allow for negative indexes, as slices do, but otherwise simply limit the scope of the parsing.  There are other solutions, using buffer-like objects and such, but this seems like a simple win for anyone parsing a lot of text.  I implemented it in a branch, runar-longslice-branch, but it would need to be updated with Tim's latest improvements to long.  Then you may ask, why not do it for everything else that parses from a string?  To which I say: it should be.  Thoughts?
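
To make the intended semantics concrete, here is a rough pure-Python sketch.  It is not the code from the branch; parse_long is a hypothetical name used only for illustration, and the sketch assumes a base between 2 and 36, an optional leading sign, and no whitespace or base-prefix handling.  It walks the characters in place instead of building s[start:end] first.

```python
def parse_long(s, base=10, start=None, end=None):
    """Parse the digits of s[start:end] without creating the slice.

    Hypothetical helper that illustrates the proposed start/end
    keywords; assumes 2 <= base <= 36.
    """
    # slice.indices() clamps the bounds and resolves negative indexes
    # the same way s[start:end] would.
    lo, hi, _ = slice(start, end).indices(len(s))
    if lo >= hi:
        raise ValueError("empty range")

    # Optional leading sign.
    sign = 1
    if s[lo] in "+-":
        if s[lo] == "-":
            sign = -1
        lo += 1
        if lo >= hi:
            raise ValueError("sign without digits")

    value = 0
    for i in range(lo, hi):
        # int() on a single character raises ValueError for a bad digit.
        value = value * base + int(s[i], base)
    return sign * value
```

With this sketch, parse_long("id=12345;", start=3, end=8) gives 12345, and a negative start such as parse_long("abc-42", start=-3) counts from the end of the string, as a slice would.  A real speedup would of course come from doing this in C inside long() itself; the Python loop above only shows the behaviour.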


--
Runar Petursson
Betur
runar@betur.net -- http://betur.net