[Python-Dev] deque alternative (was: Linked lists) (original) (raw)
Tim Peters tim.peters at gmail.com
Mon Dec 26 18:56:27 CET 2005
- Previous message: [Python-Dev] deque alternative (was: Linked lists)
- Next message: [Python-Dev] deque alternative (was: Linked lists)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[restoring context and attributions lost in the original]
[Tim Peters]
Like Phillip Eby, I use 2-tuples for this when I feel the need (usually during a backtracking graph search, to keep track of paths back to the root in a space-efficient way), and am happy with that.
[Josiah Carlson]
Then there's the whole Python list with append() and pop(). It takes a method resolution and function call, but at least in Python 2.3, it is a hair faster...
[Martin Blais]
Josiah, that's beside the point, because it is not space-efficient, the list is actually an dynamic array/vector, and I would expect an efficient array implementation to grow exponentially. One of the reasons we're talking about lists is to avoid exactly this problem...
[James Y Knight]
Okay, now that reason is a pretty crazy one. Consider what you're saying here.
I'm sure he did ;-) Consider a very simple graph, a skinny tree
rooted at A
that branches once "at the end", represented by a dict
of adjacency lists:
kids[A] = [B] kids[B] = [C] kids[C] = [D] ... kids[W] = [X] kids[X] = [Y, Z]
A natural representation of the path from B back to the root is:
Bpath = B, (A, None)
and from C back to the root:
Cpath = C, Bpath
and from D back to the root:
Dpath = D, Cpath
...
and from X back to the root:
Xpath = X, Wpath
A "flat" list or tuple would certainly be more space-efficient up to this point. But when the graph branches, the 2-tuple representation allows sharing the common path suffix (which may be very long!):
Ypath = Y, Xpath
and Zpath = Z, Xpath
If there's an N-way branch, using 2-tuples allows recording the N new paths back to the root with incremental memory burden N * some_constant. You can't do this kind of sharing with a flat list or tuple, and the incremental memory burden at an N-way branch zooms to (N + some_other_constant) * (the amount of memory needed to record the path from the branch point back to the root), due to duplicating the back-to-the-root info N times. The potential memory saving from using 2-tuples instead is unbounded.
...
Plus, in python, the overhead per object in the pyobject memory allocation system, which I don't know how to quantify.
For objects requiring a number of bytes divisible by 8, a few percent at worst. The per-object space overhead in obmalloc consists entirely of internal padding needed to reach a multiple of 8 bytes per allocation unit, and that's 0 when the # of bytes needed is divisible by 8. The only obmalloc space overheads then are due to per-pool and per-arena bookkeeping space, and those are small fractions of total pool and arena sizes.
...
So, the list will generally be 1/8th of its size overallocated, or 112 elements plus 9 words overhead is 121 words. Better than any cons- linked list could be, and insanely better than a cons-linked list would be in python.
You seem to be ignoring possiblities for sharing across lists, and such sharing is natural in many graph algorithms.
- Previous message: [Python-Dev] deque alternative (was: Linked lists)
- Next message: [Python-Dev] deque alternative (was: Linked lists)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]