[Python-3000] Making more effective use of slice objects in Py3k

Guido van Rossum guido at python.org
Tue Aug 29 17:42:18 CEST 2006


On 8/28/06, Josiah Carlson <jcarlson at uci.edu> wrote:

"Guido van Rossum" <guido at python.org> wrote:
> Those are all microbenchmarks. It's easy to prove the superiority of an approach that way. But what about realistic applications? What if your views don't end up saving memory or time for an application, but still cost in terms of added complexity in all string operations?

At no point has anyone claimed that every operation on views will always be faster than on strings. Nor has anyone claimed that it will always reduce memory consumption. However, for a not insignificant number of operations, views can be faster, offer better memory use, etc.

I agree with Jean-Paul Calderone: "If the goal is to avoid speeding up Python programs because views are too complex or unpythonic or whatever, fine. But there isn't really any question as to whether or not this is a real optimization."

And without qualification that is as false as anything you've said.

"I don't think we see people overusing buffer() in ways which damage readability now, and buffer is even a builtin. Tossing something off into a module somewhere shouldn't really be a problem. To most people who don't actually know what they're doing, the idea to optimize code by reducing memory copying usually just doesn't come up."

Another "yes they do -- no they don't" argument. As I've said repeatedly before, optimizations are likely to be copied without being understood by newbies. The buffer() built-in has such a poor reputation and API that it doesn't get much play; but a new "views" feature that will magically make all your string processing go faster surely will.

While there are examples where views can be slower, this is no different than the cases where deque is slower than list; sometimes some data structures are more applicable to the problem than others. As we have given users the choice to use a structure that has been optimized for certain behaviors (set and deque being primary examples), this is just another structure that offers improved performance for some operations.

As long as it is very carefully presented as such I have much less of a problem with it.
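The deque/list comparison above can be made concrete with a small sketch. It is only an illustration, not something from the thread: the function names (drain_list, drain_deque, middle_sum) are hypothetical, and the timings depend entirely on the machine and on n. The only facts relied on are that list.pop(0) shifts every remaining element, deque.popleft() does not, and indexing into the middle of a deque is slower than indexing a list.

```python
from collections import deque
from timeit import timeit


def drain_list(n):
    """Consume a list from the front; each pop(0) shifts the remaining items."""
    items = list(range(n))
    out = []
    while items:
        out.append(items.pop(0))  # O(len(items)) per call
    return out


def drain_deque(n):
    """Consume a deque from the front; popleft() is O(1)."""
    items = deque(range(n))
    out = []
    while items:
        out.append(items.popleft())
    return out


def middle_sum(seq, repeats):
    """Index the middle element repeatedly; cheap for list, slower for deque."""
    mid = len(seq) // 2
    return sum(seq[mid] for _ in range(repeats))


if __name__ == "__main__":
    n = 20_000
    # Front-draining favours deque ...
    print("drain list :", timeit(lambda: drain_list(n), number=1))
    print("drain deque:", timeit(lambda: drain_deque(n), number=1))
    # ... while random access in the middle favours list.
    big_list = list(range(200_000))
    big_deque = deque(big_list)
    print("index list :", timeit(lambda: middle_sum(big_list, 10_000), number=1))
    print("index deque:", timeit(lambda: middle_sum(big_deque, 10_000), number=1))
```

Neither structure wins on both workloads, which is the sense in which a view type would be "just another structure" with its own trade-offs.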

Earlier proposals were implying that all string ops should return views whenever possible. That, I believe, is never going to fly, and that's where my main objection lies.

Having views in a library module alleviates many of my objections. While I still worry that it will be overused, deque doesn't seem to be overused, so perhaps I should relax.

> Then I ask you to make it so that string views are 99.999% indistinguishable from strings -- they have all the same methods, are usable everywhere else, etc.

For reference, I'm about 2 hours into it (including re-reading the documentation for Pyrex), and I've got [r]partition, [r]find, [r]index, [r|l]strip. I don't see significant difficulty implementing all other methods on views. Astute readers of the original implementation will note that I never check that the argument being passed in is a string; I use the buffer interface, so anything offering the buffer interface can be seen as a read-only view with string methods attached. Expect a full implementation later this week.

Good luck!
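For readers who want a picture of what "a read-only view with string methods attached" might look like, here is a minimal sketch. It is not the Pyrex implementation discussed above: it uses today's memoryview as a stand-in for the buffer interface, the class name BytesView is hypothetical, it assumes a byte-oriented buffer (bytes, bytearray or similar), and it implements only find and partition with a naive scan.

```python
class BytesView:
    """Hypothetical read-only view over a bytes-like object (assumption:
    the source exposes a one-dimensional buffer of single bytes)."""

    def __init__(self, source, start=0, stop=None):
        mv = memoryview(source)
        if stop is None:
            stop = len(mv)
        # Slicing a memoryview does not copy the underlying data.
        self._mv = mv[start:stop]

    def __len__(self):
        return len(self._mv)

    def tobytes(self):
        """Materialize a real bytes object (this is where the copy happens)."""
        return self._mv.tobytes()

    def find(self, sub, start=0):
        """Naive search; returns the lowest index of sub, or -1."""
        sub = bytes(sub)
        n, m = len(self._mv), len(sub)
        for i in range(start, n - m + 1):
            if self._mv[i:i + m] == sub:
                return i
        return -1

    def partition(self, sep):
        """Like bytes.partition, but the three parts are views, not copies."""
        sep = bytes(sep)
        i = self.find(sep)
        if i < 0:
            return self, BytesView(b""), BytesView(b"")
        j = i + len(sep)
        return BytesView(self._mv[:i]), BytesView(self._mv[i:j]), BytesView(self._mv[j:])


data = bytearray(b"key=value")
head, sep, tail = BytesView(data).partition(b"=")
print(head.tobytes(), sep.tobytes(), tail.tobytes())
```

Because the parts share storage with the original object, a later change to data is visible through head and tail. That sharing is the source of both the memory savings and the added complexity argued over in this thread.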

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


