[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Thu Jun 5 03:01:38 CEST 2014


Hello,

On Thu, 05 Jun 2014 12:03:17 +1200 Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Serhiy Storchaka wrote:
> > html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize
> > don't use iterators. They use indices, str.find and/or regular
> > expressions. Common use case is quickly find substring starting
> > from current position using str.find or re.search, process found
> > token, advance position and repeat.
>
> For that kind of thing, you don't need an actual character index,
> just some way of referring to a place in a string. Instead of an
> integer, str.find() etc. could return a StringPosition,

That's braver than I had in mind, but it definitely shows what alternative implementations have in store to fight back if performance problems are actually detected. My own thought, for example, in response to people who (quoting) "slice strings for living", was some form of "extended slicing" like str[(0, 4, 6, 8, 15)].
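To make the "extended slicing" idea concrete, here is a minimal sketch. The message does not spell out what str[(0, 4, 6, 8, 15)] would return, so the semantics below are an assumption: the tuple is read as a list of consecutive boundaries, and the result is the pieces between them. Since str does not accept a tuple index today, the sketch uses a hypothetical helper, multi_slice, instead of real subscript syntax.

```python
def multi_slice(s, bounds):
    """Hypothetical helper emulating "extended slicing".

    Assumed semantics (not specified in the thread): for boundaries
    (b0, b1, ..., bn), return [s[b0:b1], s[b1:b2], ..., s[b(n-1):bn]].
    """
    # Pair each boundary with the next one and take the slice between them.
    return [s[a:b] for a, b in zip(bounds, bounds[1:])]


parts = multi_slice("0123456789abcdef", (0, 4, 6, 8, 15))
print(parts)
```

The possible appeal for an implementation with a variable-width internal encoding is that all boundaries are handed over in one call, so the string could be walked once rather than once per slice. This pure-Python emulation does not demonstrate that gain; it only illustrates the interface.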

But I really think that providing an iterator interface for common string operations would cover most real-world cases, and would actually be beneficial for the Python language in general.
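As a rough illustration of what an iterator interface for such operations might look like, here is a sketch of two generators. The names iter_find and iter_split are hypothetical and do not exist on str; they are emulated here on top of the ordinary index-based str.find, which is exactly the part a native implementation would be free to replace with internal stepping.

```python
def iter_find(s, sub):
    """Hypothetical iterator counterpart of str.find.

    Yields the start of each non-overlapping occurrence of sub in s.
    """
    if not sub:
        raise ValueError("empty substring")
    pos = s.find(sub)
    while pos != -1:
        yield pos
        # Resume the search just past the match that was found.
        pos = s.find(sub, pos + len(sub))


def iter_split(s, sep):
    """Hypothetical lazy counterpart of str.split(sep)."""
    start = 0
    for pos in iter_find(s, sep):
        yield s[start:pos]
        start = pos + len(sep)
    # Whatever follows the last separator is the final field.
    yield s[start:]


for field in iter_split("a,b,,c", ","):
    print(repr(field))
```

The "find token, process it, advance, repeat" loop described in the quoted text then becomes a plain for loop, and the caller never handles a character index directly. For regular expressions the standard library already offers re.finditer in this style.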

> --
> Greg

--
Best regards,
 Paul                          mailto:pmiscml at gmail.com


