[Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result]

Alex Martelli python-list@python.org
Fri, 9 May 2003 09:49:33 +0200


Followups set to python-list since this is NOT an appropriate subject matter for python-dev. Please continue the discussion on python-list, thanks.

On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote: ...

    count = {}
    for word in wordlist:
        count.setdefault(word, 0) += 1

This, as I soon realized, didn't work, exactly because ints are immutable.

Actually it doesn't work because you cannot assign to a function call; the fact that ints are immutable doesn't enter the picture.
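A minimal sketch of that point, assuming a current Python 3 interpreter rather than the 2003-era one used in this thread: the statement is rejected when it is compiled, before any int object is involved. The `source` and `rejected` names are illustrative only.

```python
# Augmented assignment needs a name, attribute, or subscript as its target.
# A call expression is rejected at compile time, so immutability of ints
# never comes into play.
source = "count.setdefault(word, 0) += 1"

try:
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    rejected = True
    print("rejected at compile time:", exc.msg)
else:
    rejected = False
```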

    class Counter(int):
        def inc(self):
            # to be defined
            self += 1??

HERE is where the fact that ints are immutable will bite. If += mutated self, this would work -- but it doesn't because ints are immutable.

As you can see, I have a problem at the comment: how do I access the inherited int value??? I realized that this also wasn't going to work,

int(self) will "access the inherited int value" if I understand your meaning. But it doesn't help you here.
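A small sketch of both points, assuming Python 3 (the behaviour shown is the same for new-style classes in the Python of this thread): `self += 1` only rebinds the local name `self`, and `int(self)` reads the inherited value without making it changeable. The `value` method is a hypothetical helper added for illustration.

```python
class Counter(int):
    def inc(self):
        # int defines no in-place add, so this computes self + 1 and
        # rebinds the local name 'self'; the caller's object is untouched.
        self += 1
        return self

    def value(self):
        # int(self) yields the inherited value as a plain int.
        return int(self)


c = Counter(5)
result = c.inc()

print(c.value())     # the original object still holds 5
print(result)        # the new object produced by the addition
print(type(result))  # a plain int, not a Counter
```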

either. I finally used the perhaps idiomatic

    count = {}
    for word in wordlist:
        count[word] = count.get(word, 0) + 1

which of course is suboptimal, because the lookup is done twice. I decided

Yes.

not to implement a proper Counter class for memory efficiency reasons. The

__slots__ fix your memory efficiency issues: that's the REASON they exist. However, there's ANOTHER problem...:

code would have been simple:

    class Counter:
        def __init__(self):
            self.n = 0
        def inc(self):
            self.n += 1
        def get(self):
            return self.n

    count = {}
    for word in wordlist:
        count.setdefault(word, Counter()).inc()

But to restate the core question: can class Counter be written as a subclass of int?

No (not meaningfully).

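The performance tradeoff is tricky not because of memory considerations (which __slots__ fix) but because you're generating (and often throwing away) a Counter instance EVERY time: the argument Counter() is evaluated before setdefault is even called, whether or not the key is already present. Witness: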
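To make the throwaway instances visible, here is a rough sketch assuming Python 3. The `created` class attribute is a hypothetical tally added only for this illustration; it is not part of the class being timed in this message.

```python
class Cnt(object):
    __slots__ = ["n"]   # no per-instance __dict__, which is the memory saving
    created = 0         # class-level tally of constructions (illustration only)

    def __init__(self):
        Cnt.created += 1
        self.n = 0

    def inc(self):
        self.n += 1


words = "some are and some are not and some are irksome".split()
count = {}
for w in words:
    # Cnt() is built on every iteration; setdefault keeps it only when
    # the key is new and otherwise discards it immediately.
    count.setdefault(w, Cnt()).inc()

print(len(words), "words,", len(count), "distinct,", Cnt.created, "instances built")
```

With ten words and five distinct keys, half of the instances built here are discarded straight away.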

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
''' 'for w in words:' ' count[w]=count.get(w,0)+1'
100000 loops, best of 3: 11.6 usec per loop

versus:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
class Cnt(object):
    __slots__=["n"]
    def __init__(self): self.n=0
    def inc(self): self.n+=1
''' 'for w in words:' ' count.setdefault(w,Cnt()).inc()'
10000 loops, best of 3: 43.4 usec per loop

See? It's not a speedup, but a slowdown by about FOUR times in this example.
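For anyone wanting to repeat the comparison from inside a script, the following is a rough sketch using the standard `timeit` module on Python 3. It is not the exact measurement above: the helper functions `with_get` and `with_setdefault` are hypothetical names, each call starts from an empty dict instead of reusing one from the setup, and the absolute numbers will differ by machine and interpreter.

```python
import timeit

words = "some are and some are not and some are irksome".split()


class Cnt(object):
    __slots__ = ["n"]

    def __init__(self):
        self.n = 0

    def inc(self):
        self.n += 1


def with_get():
    # The idiomatic form: two dict lookups, no helper objects.
    count = {}
    for w in words:
        count[w] = count.get(w, 0) + 1
    return count


def with_setdefault():
    # One lookup per word, but a Cnt instance is built every iteration.
    count = {}
    for w in words:
        count.setdefault(w, Cnt()).inc()
    return count


# Best of 3 runs, as timeit.py reports it; timings are totals in seconds.
t_get = min(timeit.repeat(with_get, number=10000, repeat=3))
t_setdefault = min(timeit.repeat(with_setdefault, number=10000, repeat=3))
print("get:        %.4f s per 10000 loops" % t_get)
print("setdefault: %.4f s per 10000 loops" % t_setdefault)
```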

If you want speed, go for speed:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
import psyco
psyco.full()
''' 'for w in words:' ' count[w]=count.get(w,0)+1'
100000 loops, best of 3: 3.33 usec per loop

Now THIS is acceleration -- a speedup of over THREE times. And without any complication or abandonment of the idiomatic way of expression, too.

Beat Bolli (please CC: me on replys, I'm not on the list)

Done. But please use python-list for these discussions: python-dev is only for discussion about development of Python itself.

Alex