[Python-3000] _heapq.c, etc. (was Re: Heaptypes)

Guido van Rossum guido at python.org
Fri Jul 20 16:44:09 CEST 2007


On 7/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:

> "Guido van Rossum" <guido at python.org> wrote:
> > On 7/19/07, Guido van Rossum <guido at python.org> wrote:
> > > How about instead you help with fixing pickling of datetime objects?
> > > This broke when I fixed test_pickle. Rolling back your changes to
> > > datetime pickling didn't seem to help.
> >
> > Never mind; this was shallow -- cPickle doesn't pickle bytes
> > correctly. I've decided to get rid of cPickle -- someone is writing a
> > replacement for the summer of code anyway. The new approach will be
> > that you always write "import pickle" and this transparently attempts
> > to use the C accelerator if it can be imported, like heapq.py and
> > _heapq.c.
>
> On a related note, since I had been supporting only Python 2.3 for
> quite a while, I didn't notice the fact that Python's heapq.c (in 2.4
> at least, I haven't tested on 2.5) only supported lists as containers,
> and not a list-like object with all methods that heapq calls (which was
> an issue for a pure-Python pair heap implementation I posted last
> December or so).
>
> What made it really annoying is that there was no way to tell the heapq
> module not to load the C version so that I could use a generic
> container. I ended up just commenting out the C module heapq import and
> moving on.
>
> I don't know if we want to make it possible to disable the loading of
> certain C modules that don't offer all of the same features, or if we
> want to limit the Python versions to what the C versions support, or
> even if we want to expand the C versions to handle all cases that the
> Python versions support.
>
> While the pickle/cPickle, StringIO/cStringIO, etc., naming can be a bit
> annoying, it does give me the choice whether I want it to be fast or
> flexible.

This was an example of a performance improvement that changed the specs of an API in an incompatible way. Breaking your code was an unintended side effect of the speedup.

We're going to do a few more of these in Py3k, and this time breaking the specs is the name of the game. I think going forward (post 3.0) we should be more careful to write specs that can easily be optimized without breaking existing usage, or to write speedups that can handle all the argument types that the original code supported.
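To illustrate the second option (a speedup that still handles every argument type the original code accepted), here is a minimal sketch. It is not how heapq is implemented: the wrapper `heappush`, the helper `_generic_heappush`, and the `ListLike` container are hypothetical names made up for this example. The fast path is taken only for exact lists; anything else goes through a pure-Python path that needs only `append`, `__len__`, `__getitem__` and `__setitem__`.

```python
import heapq


def _siftdown(heap, startpos, pos):
    """Move the item at pos up toward startpos until the heap invariant holds."""
    newitem = heap[pos]
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem


def _generic_heappush(heap, item):
    """Pure-Python push; works on any list-like object (hypothetical helper)."""
    heap.append(item)
    _siftdown(heap, 0, len(heap) - 1)


def heappush(heap, item):
    """Use the stdlib version for exact lists, the generic path otherwise."""
    if type(heap) is list:
        heapq.heappush(heap, item)
    else:
        _generic_heappush(heap, item)


class ListLike:
    """Hypothetical container that is list-like but not a list subclass."""

    def __init__(self):
        self._items = []

    def append(self, value):
        self._items.append(value)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value


if __name__ == "__main__":
    container = ListLike()
    for value in [5, 3, 8, 1]:
        heappush(container, value)
    print(container[0])  # smallest item sits at index 0
```

The type check costs one comparison per call, which is the price of keeping the wider contract while still speeding up the common case.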

I definitely don't want to continue the old habit of having a slow and a fast module with different names; the experience, especially with cPickle and cStringIO, is that everyone believes their code is performance critical and hence uses the C version if it exists, thereby repeating the same idiom over and over.
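For concreteness, a rough sketch of the two styles follows. The first part (in comments) is the caller-side idiom that gets repeated everywhere; the second part shows the fallback living inside a single module, in the spirit of heapq.py and _heapq.c. The pure-Python `heappush` here is a simplified stand-in written for this example, not the stdlib source, and the sketch assumes the accelerator exports a function of the same name.

```python
# Caller-side idiom that ends up copied into every program (Python 2 era):
#
#     try:
#         import cPickle as pickle
#     except ImportError:
#         import pickle
#
# Module-side alternative: the module defines the pure-Python version
# first, then lets the C accelerator override it if it can be imported.


def heappush(heap, item):
    """Simplified pure-Python reference implementation (stand-in)."""
    heap.append(item)
    pos = len(heap) - 1
    # Bubble the new item up until its parent is no larger than it.
    while pos > 0:
        parentpos = (pos - 1) >> 1
        if heap[pos] < heap[parentpos]:
            heap[pos], heap[parentpos] = heap[parentpos], heap[pos]
            pos = parentpos
        else:
            break


try:
    # If the C accelerator is available, its heappush replaces the one above.
    from _heapq import heappush
except ImportError:
    # No accelerator: callers silently get the pure-Python version.
    pass


if __name__ == "__main__":
    heap = []
    for value in [4, 2, 9]:
        heappush(heap, value)
    print(heap[0])  # smallest item sits at index 0
```

Callers then write a single import and never need to know which implementation they got, which only works if both versions honor the same spec.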

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
