[Python-Dev] Proposal: defaultdict

Ian Bicking ianb at colorstudy.com
Sat Feb 18 01:32:23 CET 2006


Martin v. Löwis wrote:

Adam Olsen wrote:

Consider these two pieces of code:

    if key in d:
        dosomething(d[key])
    else:
        dosomethingelse()

    try:
        dosomething(d[key])
    except KeyError:
        dosomethingelse()

Before they were the same (assuming dosomething() won't raise KeyError). Now they would behave differently.

I personally think they should continue to do the same thing, i.e. "in" should return True if there is a default; in the current proposal, it should invoke the default factory.
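To make the divergence concrete, here is a minimal sketch that runs both quoted snippets against a plain dict and against collections.defaultdict. It assumes the behaviour of defaultdict as it later shipped in the standard library, which is only a stand-in for the proposal discussed in this thread and may differ from it. The helper names via_membership and via_exception are hypothetical.

```python
from collections import defaultdict

def via_membership(d, key):
    # First snippet: test membership before indexing.
    if key in d:
        return ("dosomething", d[key])
    return ("dosomethingelse", None)

def via_exception(d, key):
    # Second snippet: index first, fall back on KeyError.
    try:
        return ("dosomething", d[key])
    except KeyError:
        return ("dosomethingelse", None)

# Fresh containers for each call, since indexing a defaultdict mutates it.
plain_a, plain_b = {}, {}
dd_a, dd_b = defaultdict(list), defaultdict(list)

plain_results = (via_membership(plain_a, "k"), via_exception(plain_b, "k"))
default_results = (via_membership(dd_a, "k"), via_exception(dd_b, "k"))
```

With a plain dict both helpers take the dosomethingelse() branch. With the shipped defaultdict only the membership version does; the try/except version gets an empty list back and never sees KeyError.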

As I believe Fredrik implied, this would break the symmetry between "x in d" and "x in d.keys()" (unless d.keys() enumerates all possible keys), and either .get() would become useless, or it would also act in inconsistent ways. I think these broken expectations are much worse than what Adam's talking about.
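The expectations at stake can be checked in a few lines. This is only a sketch, and it again assumes collections.defaultdict as it eventually shipped rather than the exact proposal under discussion: there, membership tests and .get() leave the factory alone, and only indexing calls it.

```python
from collections import defaultdict

d = defaultdict(int)
d["seen"] += 1

in_dict = "missing" in d                 # membership test does not call the factory
in_keys = "missing" in d.keys()          # agrees with the line above
via_get = d.get("missing", "fallback")   # .get() still returns its own default
keys_before = sorted(d.keys())

via_index = d["missing"]                 # indexing calls the factory and stores the result
keys_after = sorted(d.keys())
```

Under that design "x in d" and "x in d.keys()" stay in agreement and .get() keeps its meaning, at the cost of the difference between the two snippets quoted earlier.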

But that's beside the point: Where is the real example where this difference would matter? (I'm not asking for a realistic example, I'm asking for a real one)

Well, here's a kind of an example: WSGI specifies that the environment must be a dictionary, and nothing but a dictionary. I think it would have to be updated to say that it must be a dictionary with default_factory not set, as default_factory would break the predictability that was the reason WSGI specified exactly a dictionary (and not a dictionary-like object or subclass). So there's something that becomes brokenish.

I think this is the primary kind of breakage -- dictionaries with default_factory set are not acceptable objects when a "plain" dictionary is expected. Of course, it can always be claimed that it's the fault of the person who passes in such a dictionary (they could have passed in None and it would probably also be broken). But now all of a sudden I have to say "x(a) takes a dictionary argument. Oh, and don't you dare use the default_factory feature!" where before I could just say "dictionary". And KeyError just... disappears. KeyError is one of those errors that you expect to happen (maybe the "Error" part is a misnomer); having it disappear is a major change.
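A small sketch of that breakage, using hypothetical names throughout: script_name stands for any consumer that relies on KeyError, and require_plain_dict for a guard that insists on exactly a dict. It assumes collections.defaultdict as later shipped. Note that the exact-type check only helps because that class is a subclass; if default_factory could be set on a dictionary itself, as this message assumes, such a check would not be enough.

```python
from collections import defaultdict

def script_name(environ):
    # Hypothetical consumer: treats a missing key as "not provided".
    try:
        return environ["SCRIPT_NAME"]
    except KeyError:
        return "<unset>"

def require_plain_dict(environ):
    # Hypothetical guard: accept exactly dict, rejecting subclasses.
    if type(environ) is not dict:
        raise TypeError("environ must be a plain dict")
    return environ

plain_answer = script_name({})

env = defaultdict(str)
default_answer = script_name(env)      # KeyError never fires; the factory result is returned
mutated = "SCRIPT_NAME" in env         # and the lookup has added the key

try:
    require_plain_dict(env)
    rejected = False
except TypeError:
    rejected = True
```

The consumer's fallback path is silently skipped, and the dictionary it was only supposed to read has been modified.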

Also, I believe there are two ways to handle thread safety, both of which are broken:

  1. d[key] gets the GIL, and thus while default_factory is being called the GIL is locked

  2. d[key] doesn't get the GIL and so d[key].append(1) may not actually lead to 1 being in d[key] if another thread is appending something to the same key at the same time, and the key is not yet present in d.

Admittedly I don't understand the ins and outs of the GIL, so the first case might not actually need to acquire the GIL.
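The second case can be illustrated without real threads by writing out one bad interleaving by hand. This is a deterministic simulation under the assumption that "look up, build default, store" is not atomic; whether an actual implementation allows this ordering depends on details not settled in this message. The function name simulate_lost_append is hypothetical.

```python
def simulate_lost_append():
    d = {}
    key = "k"

    # Both "threads" look the key up before either has stored a default.
    a_missing = key not in d
    b_missing = key not in d

    # Each builds its own default value (what default_factory would return).
    a_value = [] if a_missing else d[key]
    b_value = [] if b_missing else d[key]

    # Thread A stores its list and appends to it.
    d[key] = a_value
    a_value.append(1)

    # Thread B then stores its own list, replacing A's, and appends.
    d[key] = b_value
    b_value.append(2)

    return d[key]

result = simulate_lost_append()
```

In this ordering the list that received 1 is no longer the one stored under the key, so the append from the first thread is lost.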

-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org


