[Python-3000] Two proposals for a new list-like type: one modest, one radical
Daniel Stutzbach daniel at stutzbachenterprises.com
Tue Apr 24 00:54:00 CEST 2007
- Previous message: [Python-3000] Two proposals for a new list-like type: one modest, one radical
- Next message: [Python-3000] Two proposals for a new list-like type: one modest, one radical
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 4/23/07, Mike Klaas <mike.klaas at gmail.com> wrote:
> The init-from-iterable case is especially interesting: a 20% speed increase for list comps is not to be scoffed at.
That one is because list()'s tp_new method is PyType_GenericNew instead of a wrapper around the more efficient PyList_New. Probably list comprehensions and the [] syntax already call PyList_New, but I haven't checked.
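As a rough empirical illustration (my own sketch, not the benchmark referenced above), the two construction paths can be compared from pure Python with timeit; the exact numbers will vary by CPython build and version:

```python
import timeit

# Compare constructing a list via the type's tp_new path (list(iterable))
# against the comprehension path. Numbers are illustrative only.
setup = "data = list(range(1000))"
t_call = timeit.timeit("list(iter(data))", setup=setup, number=2000)
t_comp = timeit.timeit("[x for x in data]", setup=setup, number=2000)
print(f"list(iterable): {t_call:.3f}s  comprehension: {t_comp:.3f}s")

# Both paths produce the same result, of course:
data = list(range(1000))
assert list(iter(data)) == [x for x in data]
```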
> Have you analyzed the memory consumption of the two data structures?
I haven't done any empirical studies, but I can give you a pretty good picture.
For small lists, a BList uses a fixed amount of memory rather than the "just large enough" allocation that regular Python lists aim for. So, for very small lists, BLists are currently rather wasteful. The BList code could be changed to do the "just large enough" allocation for small lists too, at the expense of a bit more code complexity.
For large lists, the worst case for BLists is around twice the memory requirements of a regular Python list (each BList node is required to be at least half-full). I know that sounds horrible, but hear me out. :-) The best case for BLists is when each node is completely full, in which case they use negligibly more memory than a regular Python list.
If a user creates a BList from whole cloth (e.g., BList(iterable)), they will get a best-case BList. This would probably be the common case for users who don't plan to modify their list much.
If a user is doing lots of inserts and deletes, the BList's nodes will typically be three-quarters full, translating into 33% extra memory above a regular Python list. However (and this is a big however!), these users will greatly appreciate the BList's additional speed for these operations.
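The per-pointer overhead follows directly from the fill fraction: a node that is a fraction f full stores 1/f pointer slots per element, so the extra memory relative to a packed list is 1/f - 1. A quick sanity check of the arithmetic (my own sketch, not code from BList):

```python
def blist_overhead(fill_fraction):
    """Extra pointer-array memory, relative to a fully packed list,
    when every node is `fill_fraction` full."""
    return 1.0 / fill_fraction - 1.0

assert abs(blist_overhead(0.5) - 1.0) < 1e-9    # half-full: ~2x memory (worst case)
assert abs(blist_overhead(0.75) - 1/3) < 1e-9   # three-quarters full: ~33% extra
assert blist_overhead(1.0) == 0.0               # completely full: best case
```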
I hasten to add that these memory estimates cover only the space required to store the list structure itself plus the pointers to the objects in the list. The objects themselves (e.g., integers, class instances, whatever) will take up substantially more memory in any case. For example, for the regular 32-bit integer type, a single object takes up 12 bytes. A regular Python list then spends an extra 4 bytes per integer (16 bytes total), while a worst-case BList spends an extra 8 bytes per integer (20 bytes total). So, the worst case is only a 25% increase.
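Plugging in the 32-bit figures above (a 12-byte int object, 4-byte pointers, and a worst-case 8 bytes of BList structure per element), the 25% figure checks out:

```python
INT_OBJ = 12      # bytes for a small int object on a 32-bit build
LIST_PTR = 4      # one pointer slot per element in a regular list
BLIST_PTR = 8     # worst-case per-element structure cost in a BList

regular_total = INT_OBJ + LIST_PTR   # 16 bytes per element
blist_total = INT_OBJ + BLIST_PTR    # 20 bytes per element
increase = blist_total / regular_total - 1
assert regular_total == 16
assert blist_total == 20
assert abs(increase - 0.25) < 1e-9   # 25% worst-case increase overall
```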
If the user makes use of operations that leverage the copy-on-write feature, then BLists are much more memory efficient than regular lists. getslice is probably the most commonly used operation in this category: a slice of a large BList uses only O(log n) memory in the worst case.
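The real BList shares whole subtrees of its B-tree, newly allocating only the O(log n) nodes along the slice boundaries. The sharing idea can be sketched in a flattened, one-level form (the names and chunk size here are my own, not from the BList code):

```python
CHUNK = 4  # hypothetical node capacity, much smaller than BList's real nodes

def make_chunked(values):
    """Store a list as fixed-size chunks (a one-level stand-in for leaf nodes)."""
    return [values[i:i + CHUNK] for i in range(0, len(values), CHUNK)]

def cow_slice(chunks, start, stop):
    """Slice without copying the bulk of the data: share interior chunks,
    copy only the two partial boundary chunks. A real BList applies the
    same idea to whole subtrees, so only O(log n) nodes are allocated."""
    out, pos = [], 0
    for chunk in chunks:
        end = pos + len(chunk)
        if end <= start or pos >= stop:
            pass                                  # chunk entirely outside slice
        elif start <= pos and end <= stop:
            out.append(chunk)                     # entirely inside: share it
        else:
            out.append(chunk[max(start - pos, 0):min(stop - pos, len(chunk))])
        pos = end
    return out

data = list(range(20))
chunks = make_chunked(data)
sliced = cow_slice(chunks, 2, 18)
assert [x for c in sliced for x in c] == data[2:18]
assert sliced[1] is chunks[1]  # interior chunk is shared, not copied
```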
-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC