[Python-Dev] deque alternative (was: Linked lists) (original) (raw)

Josiah Carlson jcarlson at uci.edu
Mon Dec 26 20:04:15 CET 2005


[restoring context and attributions lost in the original]

[Tim Peters] >>>> Like Phillip Eby, I use 2-tuples for this when I feel the need >>>> (usually during a backtracking graph search, to keep track of paths >>>> back to the root in a space-efficient way), and am happy with that. [Josiah Carlson] >>> Then there's the whole Python list with append() and pop(). It >>> takes a method resolution and function call, but at least in >>> Python 2.3, it is a hair faster... [Martin Blais] >> Josiah, that's beside the point, because it is not space-efficient, >> the list is actually an dynamic array/vector, and I would expect an >> efficient array implementation to grow exponentially. One of the >> reasons we're talking about lists is to avoid exactly this problem... [James Y Knight] > Okay, now that reason is a pretty crazy one. Consider what you're > saying here.

[Tim Peters]

I'm sure he did ;-) Consider a very simple graph, a skinny tree rooted at A that branches once "at the end", represented by a dict of adjacency lists:

Are you sure?

A "flat" list or tuple would certainly be more space-efficient up to this point. But when the graph branches, the 2-tuple representation allows sharing the common path suffix (which may be very long!): ... If there's an N-way branch, using 2-tuples allows recording the N new paths back to the root with incremental memory burden N * someconstant. You can't do this kind of sharing with a flat list or tuple, and the incremental memory burden at an N-way branch zooms to (N + someotherconstant) * (the amount of memory needed to record the path from the branch point back to the root), due to duplicating the back-to-the-root info N times. The potential memory saving from using 2-tuples instead is unbounded.

But one doesn't need to use flat lists. If one were to combine cons cells with Python lists, you can get the space efficiency of lists with the reusability of the desired cons cells (or tuples as the case may be). How? (i, l), where i is the prefix of list l which is the desired portion of whatever l represents. You toss one of those anywhere in your 'flat list', and you have your stored/saved path with a couple dozen bytes. If you are not careful, you end up storing/saving paths which would be stored more efficiently by copying the contents, but at that point we are talking about a constant factor blowup, not the (potentially) exponential blowup of storing all paths as copies, and we can always copy to be more efficient if necessary.

I have personally used this trick to construct a union-find data structure in which all unions are reversable regardless of find operation. [1]

> So, the list will generally be 1/8th of its size overallocated, or > 112 elements plus 9 words overhead is 121 words. Better than any cons- > linked list could be, and insanely better than a cons-linked list > would be in python.

You seem to be ignoring possiblities for sharing across lists, and such sharing is natural in many graph algorithms.

Not necessarily so as I have pointed out above. You said yourself; practicality beats purity. While using cons cells are pure, as us using only flat lists are pure, flat lists are practically smaller than cons cells in certain cases (by a factor of ~4), and non-flat lists can be smaller than cons cells in the rest of the cases.

[1] If you remember, the ammortized running time of O(n) unions and finds on a union-find data structure is O(nalpha(n,n)), where alpha(n,n) is never larger than 5 in practice. The space overhead of using non-flat lists as shared paths provides sufficient information to offer reversable union operations in ammortized O(nalpha(n,n)) space.



More information about the Python-Dev mailing list