[Python-Dev] heapq (original) (raw)

Agthorr agthorr@barsoom.org
Sat, 19 Apr 2003 15:41:11 -0700


Hello,

I'm new to this list, so I will begin by introducing myself. I'm a graduate student at the University of Oregon working towards my PhD. My primary area of research is in peer-to-peer networks. I use Python for a variety of purposes such as constructing web pages, rapid prototyping, and building test frameworks. I have been a Python user for at least two years.

I must confess that I have not lurked on the list much before making this post. I did search back in the list though, so in theory I won't be bringing up a rehashed topic...

Recently, I had need of a heap in Python. I didn't see one in the 2.2 distribution, so I went and implemented one. Afterwards, I wondered if this might be useful to others, so I decided to investigate if any work had been done to add a heap to Python's standard library.

Low and behold, in CVS there is a module called "heapq".

I compared my implementation with heapq, and I see some important differences. I'm not going to unilaterally state that mine is better, but I thought it would be worthwhile to raise the differences in this forum, so that an informed decision is made about The Best Way To Do Things.

Hopefully, it will not be too terribly controversial :)

The algorithms used are more or less identical, I'm primarily concerned with the differences in interface.

As written, heapq provides some functions to maintain the heap priority on a Python list. By contrast, I implemented the heap as an opaque class that maintains a list internally. By creating this layer of abstraction, it is possible to completely change the heap implementation later, without worrying about affecting user programs. For example, it would be possible to switch to using Fibonacci Heaps or the Pairing Heaps mentioned by Tim Peters in this message: http://mail.python.org/pipermail/python-dev/2002-August/027531.html

Another key difference is that my implementation supports the decrease_key() operation that is important for algorithms such as Dijkstra's. This requires a little extra bookkeeping, but it's just a small constant factor ;) For the API, my insert() function returns an opaque key that can later be used as a parameter to the adjust_key() function.

For those who like looking at source code, my implementation is here: http://www.cs.uoregon.edu/~agthorr/heap.py

-- Dan Stutzbach