[Python-Dev] Proposal: defaultdict (original) (raw)

Ian Bicking ianb at colorstudy.com
Fri Feb 17 18:58:30 CET 2006


Raymond Hettinger wrote:

Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like: * implications of a getitem succeeding while get(value, x) returns x (possibly different from the overall default) * implications of a getitem succeeding while contains would fail * whether to add this to the collections module (I would say yes) * whether to allow default functions as well as default values (so you could instantiate a new default list) * comparing all the existing recipes and third-party modules that have already done this * evaluating its fitness for common use cases (i.e. bags and dict of lists).

It doesn't seem that useful for bags, assuming we're talking about an {object: count} implementation of bags; bags should really have a more set-like interface than a dict-like interface.

A dict of lists typically means a multi-valued dict. In that case it seems like x[key_not_found] should return the empty list, as that means zero values; even though zero values also means that x.has_key(key_not_found) should return False as well. but getting x[key_not_found] does not (for a multi-valued dict) mean that suddently has_key should return true. I find the side-effect nature of getitem as proposed in default_dict to be rather confusing, and when reading code it will very much break my expectations. I assume that attribute access and [] access will not have side effects. Coming at it from that direction, I'm -1, though I'm +1 on dealing with the specific use case that started this (x.setdefault(key, []).append(value)).

An implementation targetted specifically at multi-valued dictionaries seems like it would be better. Incidentally, on Web-SIG we've discussed wsgiref, and it includes a mutli-values, ordered, case-insensitive dictionary. Such a dictionary(ish) object has clear applicability for HTTP headers, but certainly it is something I've used many times elsewhere. In a case-sensitive form it applies to URL variables. Really there's several combinations of features, each with different uses.

So we have now...

dicts: unordered, key:value (associative), single-value sets: unordered, not key:value, single-value lists: ordered, not key:value, multi-value

We don't have...

bags: unordered, not key:value, multi-value multi-dict: unordered, key:value, multi-value ordered-dict: ordered, key:value, single-value ordered-multi-dict: ordered, key:value, single-value

For all key:value collections, normalized keys can be useful. (Though notably the wsgiref Headers object does not have normalized keys, but instead does case-insensitive comparisons.) I don't know where dict-of-dict best fits in here.

-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org



More information about the Python-Dev mailing list