[Python-3000] Making more effective use of slice objects in Py3k

Guido van Rossum guido at python.org
Sun Aug 27 17:50:39 CEST 2006


On 8/26/06, Jim Jewett <jimjjewett at gmail.com> wrote:

> > As I understand it, Nick is suggesting that slice
> > objects be used as a sequence (not just string)
> > view.

> I have a hard time parsing this sentence. A slice is
> an object with three immutable attributes -- start,
> stop, step. How does this double as a string view?

Poor wording on my part; it is (the application of a slice to a specific sequence) that could act as a copyless view.

For example, you wanted to keep the rarely used optional arguments to find because of efficiency.

I don't believe they are rarely used. They are (currently) essential for code that searches a long string for a short substring repeatedly. If you believe that is a rare use case, why bother coming up with a whole new language feature to support it?
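A minimal sketch of the repeated-search case described above. The helper name find_all is hypothetical and not from this thread; it only illustrates how the optional start argument lets each search resume where the previous one ended without slicing off a copy of the remainder.

```python
def find_all(s, sub):
    """Return the start index of every non-overlapping occurrence of sub in s."""
    if not sub:
        raise ValueError("sub must be non-empty")
    positions = []
    pos = s.find(sub)
    while pos != -1:
        positions.append(pos)
        # Resume just past this match; no slice of s is ever created.
        pos = s.find(sub, pos + len(sub))
    return positions


print(find_all("abcabcabc", "bc"))  # [1, 4, 7]
```

The alternative of writing s = s[pos:] before each search would copy the rest of the string on every iteration, which is the cost the optional arguments avoid.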

s.find(prefix, start, stop)

does not copy.

That's still really poor wording. If you want to make your case you should take more time explaining it right.

If slices were less eager at copying, this could be rewritten as

    view = slice(start, stop, 1)
    view(s).find(prefix)

Now you're postulating that calling a slice will take a slice of an object? Any object? And how is that supposed to work for arbitrary objects? I would think that it ought to be a method on the string object -- surely a view on a string will have to be a different type of object than a view on a list, and that ought to be different again from a view on a unicode string. Also, you're postulating that the slice object somehow has the same methods as the thing it slices? How are you expecting to implement that? (Don't tell me that you haven't thought about implementation yet. Without an implementation plan there is no feature.)
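For reference, a short sketch of how a slice object behaves in current CPython; the variable names are illustrative only. It shows the three read-only attributes mentioned earlier, that the object is not callable, and that applying it through indexing produces an ordinary copied substring rather than a view.

```python
s = "abcdefgh"
view = slice(2, 5, 1)

# A slice only records start, stop and step; it knows nothing about s.
attrs = (view.start, view.stop, view.step)

# The attributes are read-only.
try:
    view.start = 0
    mutable = True
except AttributeError:
    mutable = False

# A slice is applied by indexing, not by calling it, and the result
# for a str is a new string object.
is_callable = callable(view)
copied = s[view]

print(attrs, mutable, is_callable, copied)
```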

or perhaps even as

s[start:stop].find(prefix)

That will never fly. NumPy may get away with non-copying slices, but for built-in objects this would be too big a departure from current practice. (If you don't stop about this I'll have to add it to PEP 3099. :-)
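To make the copying versus non-copying distinction concrete, here is a sketch using memoryview from present-day Python 3. This is an assumption brought in for illustration, not something proposed in this thread, and it works on bytes-like objects rather than str. Slicing the memoryview shares the underlying buffer, while slicing the bytearray directly makes an independent copy.

```python
buf = bytearray(b"hello world")

view = memoryview(buf)[6:11]  # shares buf's memory; nothing is copied
copy = buf[6:11]              # ordinary slice: an independent bytearray

# Change the underlying buffer after both slices were taken.
buf[6] = ord("W")

seen_through_view = bytes(view)  # reflects the change
seen_through_copy = bytes(copy)  # still the old contents

# The view holds a reference to the whole original buffer.
keeps_original_alive = view.obj is buf

print(seen_through_view, seen_through_copy, keeps_original_alive)
```

The last line also shows the tradeoff of any view design: a small view keeps the entire underlying object alive.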

I'm not sure these look better, but they are less surprising, because they don't depend on optional arguments that most people have forgotten about.

Because they're not that important except to the few people who really need the optimization. Also they're easily looked up.

> Maybe the idea is that instead of
>     pos = s.find(t, pos)
> we would write
>     pos += stringview(s)[pos:].find(t)
> ???

With stringviews, you wouldn't need to be reindexing from the start of the original string. The idiom would instead be a generalization of "for line in file:"

    while data:
        chunk, sep, data = data.partition()

but the partition call would not need to copy the entire string; it could simply return three views.

That depends. I can imagine situations where the indices are needed regardless of how you code it.
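A sketch contrasting the two styles, with hypothetical helper names (chunks_by_partition and chunks_with_offsets are not from this thread). The partition loop is compact but yields only the chunks; the find-based loop is what one would write when the offsets into the original string are needed as well.

```python
def chunks_by_partition(data, sep):
    """Split data on sep using the partition idiom; offsets are lost."""
    chunks = []
    while data:
        # With ordinary strings, the new data is a copy of the remainder.
        chunk, _, data = data.partition(sep)
        chunks.append(chunk)
    return chunks


def chunks_with_offsets(data, sep):
    """Yield (offset, chunk) pairs, where offset indexes into data."""
    if not sep:
        raise ValueError("sep must be non-empty")
    pos = 0
    while pos <= len(data):
        end = data.find(sep, pos)
        if end == -1:
            end = len(data)
        yield pos, data[pos:end]
        pos = end + len(sep)


print(chunks_by_partition("a,b,,c", ","))
print(list(chunks_with_offsets("a,b,,c", ",")))
```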

Yes, this does risk keeping all of data alive because one chunk was saved. This might be a reasonable tradeoff to avoid the copying. If not, perhaps the gc system could be augmented to shrink bloated views during idle moments.

Keep dreaming on. It really seems you have no clue about implementation issues; you just keep postulating random solutions whenever you're faced with an objection.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


