[Python-Dev] Proposal: defaultdict

Steve Holden steve at holdenweb.com
Sun Feb 19 04:44:37 CET 2006


Guido van Rossum wrote:

On 2/16/06, Guido van Rossum <guido at python.org> wrote:

Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor.

Thanks for all the constructive feedback. Here are some responses and a new proposal.

- Yes, I'd like to kill setdefault() in 3.0 if not sooner.

- It would indeed be nice if this was an optional feature of the standard dict type.

- I'm ignoring the request for other features (ordering, key transforms). If you want one of these, write a PEP!

- Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases).

One might think about calling it if it were callable, otherwise using it literally. Of course this would require jiggery-pokery in the cases where you actually wanted the default value to be a callable (you'd have to provide a callable to return the callable as a default).
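To make the "call it if callable, otherwise use it literally" idea concrete, here is a minimal sketch. The helper name resolve_default and the function shout are hypothetical, invented only for illustration; nothing like them is part of the proposal.

```python
def resolve_default(default):
    """Hypothetical helper: call `default` if it is callable, else use it as-is."""
    return default() if callable(default) else default


def shout(text):
    """An arbitrary callable that we would like to store as a default *value*."""
    return text.upper()


zero = resolve_default(0)          # not callable, so used literally
first = resolve_default(list)      # callable, so called: a fresh list
second = resolve_default(list)     # another call, another fresh list

# The jiggery-pokery: to get `shout` itself back as the default, it has to be
# wrapped in a second callable that returns it.
wrapped = resolve_default(lambda: shout)
```

Passing shout directly would make the helper try to call it with no arguments, which is the awkward case described above.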

- Some people seem to think that a subclass constructor signature must match the base class constructor signature. That's not so. The subclass constructor must just be careful to call the base class constructor with the correct arguments. Think of the subclass constructor as a factory function.

True, but then this does get in the way of treating the base dict and its defaulting subtype polymorphically. That might not be a big issue.
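A small sketch of the polymorphism concern, under stated assumptions: DefaultingDict is a hypothetical pure-Python subclass whose first constructor argument is the default, and rebuild stands in for any generic code that assumes every mapping type accepts dict's constructor arguments.

```python
class DefaultingDict(dict):
    """Hypothetical subclass: first argument is the default, the rest go to dict."""

    def __init__(self, default, *args, **kwargs):
        # Call the base class constructor with the arguments it expects.
        dict.__init__(self, *args, **kwargs)
        self.default = default


def rebuild(mapping):
    """Generic code assuming type(mapping) takes the same arguments as dict."""
    return type(mapping)(mapping.items())


plain = rebuild({"a": 1})        # works as intended for a plain dict

d = DefaultingDict(0, a=1)
rebuilt = rebuild(d)             # the items are swallowed as the "default"
```

Here the rebuilt object ends up empty, with the items view stored as its default, and no error is raised. Whether that matters in practice depends on how often such generic code is applied to the subclass.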

- There's a fundamental difference between associating the default value with the dict object, and associating it with the call. So proposals to invent a better name/signature for setdefault() don't compete. (As to one specific such proposal, adding an optional bool as the 3rd argument to get(), I believe I've explained enough times in the past that flag-like arguments that always get a constant passed in at the call site are a bad idea and should usually be refactored into two separate methods.)

- The inconsistency introduced by __getitem__() returning a value for keys while get(), __contains__(), and keys() etc. don't show it, cannot be resolved usefully. You'll just have to live with it. Modifying get() to do the same thing as __getitem__() doesn't seem useful -- it just takes away a potentially useful operation.

So here's a new proposal.

Let's add a generic missing-key handling method to the dict class, as well as a defaultfactory slot initialized to None. The implementation is like this (but in C):

    def onmissing(self, key):
        if self.defaultfactory is not None:
            value = self.defaultfactory()
            self[key] = value
            return value
        raise KeyError(key)

When __getitem__() (and only __getitem__()) finds that the requested key is not present in the dict, it calls self.onmissing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by onmissing().

The onmissing() method can be overridden to implement any semantics you want when the key isn't found: return a value without inserting it, insert a value without copying it, only do it for certain key types/values, make the default incorporate the key, etc.

But the default implementation is designed so that we can write

    d = {}
    d.defaultfactory = list

to create a dict that inserts a new list whenever a key is not found in __getitem__(), which is most useful in the original use case: implementing a multiset so that one can write

    d[key].append(value)

to add a new key/value to the multiset without having to handle the case separately where the key isn't in the dict yet. This also works for sets instead of lists:

    d = {}
    d.defaultfactory = set
    ...
    d[key].add(value)

This seems like a very good compromise.
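The proposed behaviour can be approximated today with a pure-Python subclass. This is only a sketch: the class name MissingHookDict is hypothetical, the proposal puts the hook on dict itself and in C, and the method and attribute names are spelled as they appear in this message.

```python
class MissingHookDict(dict):
    """Hypothetical pure-Python stand-in for a dict with the proposed hook."""

    defaultfactory = None  # the proposed slot; None means "no default"

    def onmissing(self, key):
        if self.defaultfactory is not None:
            value = self.defaultfactory()
            self[key] = value      # the default is inserted, not just returned
            return value
        raise KeyError(key)

    def __getitem__(self, key):
        # Only item lookup consults the hook; get() and `in` are untouched.
        if key in self:
            return dict.__getitem__(self, key)
        return self.onmissing(key)


d = MissingHookDict()
d.defaultfactory = list
d["a"].append(1)
d["a"].append(2)
d["b"].append(3)

seen_by_get = d.get("z")   # get() does not create an entry
has_z = "z" in d           # neither does a membership test
```

With defaultfactory left at None, looking up a missing key still raises KeyError, so existing code that relies on that exception keeps working.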

[non-functional alternatives ...]

regards
 Steve

Steve Holden  +44 150 684 7255  +1 800 494 3119
Holden Web LLC  www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/


