Issue 22766: collections.Counter's in-place operators should return NotImplemented for unsupported types (original) (raw)

Created on 2014-10-30 15:06 by Joshua.Chin, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (28)

msg230271 - (view)

Author: Joshua Chin (Joshua.Chin) *

Date: 2014-10-30 15:06

Currently, in-place operations on 'collections.Counter' with unsupported types raises an 'AttributeError'.

Example:

import collections counter = collections.Counter() counter += 1 Traceback (most recent call last): File "", line 1, in File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/collections/init.py", line 709, in iadd for elem, count in other.items(): AttributeError: 'int' object has no attribute 'items'

Instead, it should return 'NotImplemented' if 'other' is not a 'collections.Counter'

msg230272 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2014-10-30 15:56

That would prevent it from working with "work alike" (duck type) classes, though.

msg230273 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-10-30 16:00

As I noted in my review, the docstring specifically says "other Counter".

If we want to relax that we could check for an 'items' attribute and 'return NotImplemented' if it isn't there, but one or the other should definitely happen.

msg230277 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2014-10-30 17:50

'counter' in the docstrings is in lower case, so that says nothing dispositive. However, add does an ininstance check, so it is hard to see why iadd does not.

Personally I'd drop the isinstance checks and let the errors bubble up as they may. Why should subtract explicitly support other data types, but not sub/isub? But Raymond's opinion should hold the most weight here.

msg230278 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-10-30 18:00

Indeed -- we mostly discuss with each other to try and sway his opinion. :)

stdlib types should not let every error bubble up. Consider a dict:

--> d = {} --> d += 2 Traceback (most recent call last): File "", line 1, in TypeError: unsupported operand type(s) for +=: 'dict' and 'int'

now look at Counter

--> from collections import Counter --> c = Counter() --> c += 2 Traceback (most recent call last): File "", line 1, in File "/home/ethan/source/python/cpython/Lib/collections/init.py", line 709, in iadd for elem, count in other.items(): AttributeError: 'int' object has no attribute 'items'

Counter is not user-friendly in this case.

There are other areas of Counter that accept arbitrary mappings, so I would be fine the ixxx methods also accepting arbitrary mappings, but if the thing passed in will not work with Counter, then returning NotImplemented is the appropriate course of action.

msg230280 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2014-10-30 18:04

I also think returning NotImplemented would be the right thing here.

msg230389 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2014-10-31 21:12

Here is what other mutable types do:

s = [] s.iadd(1) Traceback (most recent call last): File "<pyshell#1>", line 1, in s.iadd(1) TypeError: 'int' object is not iterable

s = set() s.ior(1) NotImplemented

from collections import deque s = deque() s.iadd(1) Traceback (most recent call last): File "<pyshell#6>", line 1, in s.iadd(1) TypeError: 'int' object is not iterable

from array import array s = array('i') s.iadd(1) Traceback (most recent call last): File "<pyshell#10>", line 1, in s.iadd(1) TypeError: can only extend array with array (not "int")

s = bytearray() s.iadd(1) Traceback (most recent call last): File "<pyshell#12>", line 1, in s.iadd(1) TypeError: can't concat int to bytearray

msg230399 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-10-31 22:18

The behavior exhibited by set() is correct, and the behavior for array and bytearray are fine, as the error message is even more helpful.

Issues opened for list () and deque ().

msg230400 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2014-10-31 22:18

Note, the += operation is conceptually similar to update() methods with are usually duck typeable:

d = {} d.update(1) Traceback (most recent call last): File "<pyshell#16>", line 1, in d.update(1) TypeError: 'int' object is not iterable

s = set() s.update(1) Traceback (most recent call last): File "<pyshell#18>", line 1, in s.update(1) TypeError: 'int' object is not iterable

dict(1) Traceback (most recent call last): File "<pyshell#19>", line 1, in dict(1) TypeError: 'int' object is not iterable

msg230402 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2014-10-31 22:23

I think I've changed my mind on this. The important distinction is not between "ducktyping" or "cooperating". It's that this is an in-place mutating operation. A mutating operation should always be the responsibility of the receiver, and so it sounds wrong to be able to delegate it to the read-only operand.

For example, when calling list.extend(op), the list object doesn't allow op to take over the operation's semantics. So I think TypeError is the right outcome here.

msg230405 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-10-31 22:31

https://docs.python.org/2/reference/datamodel.html#object.iadd

[...] These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self). If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, to execute the statement x += y, where x is an instance of a class that has an iadd() method, x.iadd(y) is called. If x is an instance of a class that does not define a iadd() method, x.add(y) and y.radd(x) are considered, as with the evaluation of x + y.


Returning NotImplemented still allows a TypeError to be raised, is subclass friendly, and is the way Python is designed.

msg230406 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2014-10-31 22:35

You misread that paragraph:

""" For instance, to execute the statement x += y, where x is an instance of a class that has an iadd() method, x.iadd(y) is called."""

This is the present case, and the case of most mutable containers.

"""If x is an instance of a class that does not define a iadd() method, x.add(y) and y.radd(x) are considered, as with the evaluation of x + y."""

This is not the present case.

msg230410 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-10-31 23:01

Nevertheless, that is the behavior if NotImplemented is returned by a method. Here's some code to demonstrate:

--8<--------------------------------------- from collections import Counter

class Spam(int): "for sake of example" def radd(self, other): other[self] = other[self] + 1 return other

s = Spam(5) c = Counter() print(c) c += s print(c) c += 9 --8<---------------------------------------

before the patch

Counter() Traceback (most recent call last): File "blah.py", line 13, in c += s File "/home/ethan/source/python/issue20284/Lib/collections/init.py", line 738, in iadd for elem, count in other.items(): AttributeError: 'Spam' object has no attribute 'items'

after the patch

Counter() Counter({5: 1}) Traceback (most recent call last): File "blah.py", line 13, in c += 9 TypeError: unsupported operand type(s) for +=: 'Counter' and 'int'

As you can see, we get better support for other objects that know how to add themselves to the container in question, and a nicer and more correct error message for objects that do not.

As I said earlier, the only decision we should have to make here is whether to check for a Counter, or just something with a .items attribute, but either way we should be returning NotImplemented.

msg230411 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2014-10-31 23:06

Yes, it's true that the message is better. If we don't want to change behaviour, we should simply catch the AttributeError and return NotImplemented from there.

msg230429 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2014-11-01 06:42

Sorry, I don't want to change the kind of exception being raised (an API change from AttributeError to TypeError) without a really good reason.

In general, in-place methods are not required to return NotImplemented (for example, (3).iadd(4.5) raises an AttributeError).

Also, I prefer the current AttributeError with its clear indication that an items() method is needed for duck-typing. These kind of error messages are very helpful when you're trying to figure-out how to duck-type on-purpose (for example, {}.update(1) and {}.update([[]]) both provide the information about what you would need to do to get update() to work).

The current duck-typeable behavior was an intended part of the design and is no different from a number of other methods that update in-place.

FWIW, I do recognize that in the specific case of "counter += 1" a TypeError is clearer than an AttributeError. However, for duck-typeable methods, an AttributeError can sometimes be useful to indicate what is actually being required of the "other" argument (for example, Python 2 old-style classes raise an AttributeError with an informative message when calling an instance without a call method or indexing an instance without a getitem method).

At any rate, I want to avoid unnecessary API churn (and avoid contributing to what Guido has called "a death by a thousand cuts" for the growing list of tiny semantic differences between Python 2 and Python 3).

msg230446 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2014-11-01 13:08

I agree with Raymond about the value of the more specific error message. And the point about the in-place operators being qualitatively different from the non-inplace operators is a good one.

msg230533 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-03 10:53

I don't want to change the kind of exception being raised (an API change from AttributeError to TypeError) without a really good reason.

Subclasses cannot work with the current implementation.

In general, in-place methods are not required to return NotImplemented (for example, (3).iadd(4.5) raises an AttributeError).

AttributeError is being raised because int does not have an iadd method, not because of a problem with the float operand.

Also, I prefer the current AttributeError with its clear indication that an items() method is needed for duck-typing. These kind of error messages are very helpful when you're trying to figure-out how to duck-type on-purpose

That's what the docs are for.

(for example, {}.update(1) and {}.update([[]]) both provide the information about what you would need to do to get update() to work).

update is not a binary operation, so has more leeway in how it handles problems.

The current duck-typeable behavior was an intended part of the design and is no different from a number of other methods that update in-place.

The only problem with the design is that it does not play well with others. For duck-typeable just do a check on 'other' to see if it has an .items() method, and return NotImplemented if it does not. Oh, and check that 'self' is Counter, and also return NotImplemented if that fails.

At any rate, I want to avoid unnecessary API churn (and avoid contributing to what Guido has called "a death by a thousand cuts" for the growing list of tiny semantic differences between Python 2 and Python 3).

Not an issue in this case, as the 2.7 Counter does not have the in-place methods.

Here's proof of subclass trouble -- the idea is to make an immutable Counter:

from collections import Counter

class FrozenCounter(Counter): """ immutable after definition """ def add(self, other): new_counter = self.class() for elem, count in self.items(): new_counter[elem] = count for elem, count in other.items(): new_counter[elem] += count return new_counter._keep_positive()
fc = FrozenCounter("abbc") sc = FrozenCounter("bubba") original = fc fc += sc assert fc == FrozenCounter("aabbbbbcu") assert fc is not original

and the results:

Traceback (most recent call last): File "blah.py", line 20, in assert fc is not original AssertionError

For subclassing to work, the fix is:

    if not hasattr(other, 'items') or type(self) is not Counter:
        return NotImplemented

msg230536 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2014-11-03 13:49

On Mon, 03 Nov 2014 10:53:09 +0000, Ethan Furman <report@bugs.python.org> wrote:

Ethan Furman added the comment:

I don't want to change the kind of exception being raised (an API change from AttributeError to TypeError) without a really good reason.

Subclasses cannot work with the current implementation.

I don't understand this statement.

Also, I prefer the current AttributeError with its clear indication that an items() method is needed for duck-typing. These kind of error messages are very helpful when you're trying to figure-out how to duck-type on-purpose

That's what the docs are for.

That's not the way it works in real life. You implement something, it doesn't work, and if the solution isn't obvious from the error message then you look at the docs or google. But if the error message says "you can't do this", then like as not you don't even look at the docs.

(for example, {}.update(1) and {}.update([[]]) both provide the information about what you would need to do to get update() to work).

update is not a binary operation, so has more leeway in how it handles problems.

As Raymond pointed out, the augmented assignment operator is a shortcut for update.

The current duck-typeable behavior was an intended part of the design and is no different from a number of other methods that update in-place.

The only problem with the design is that it does not play well with others. For duck-typeable just do a check on 'other' to see if it has an .items() method, and return NotImplemented if it does not. Oh, and check that 'self' is Counter, and also return NotImplemented if that fails.

No, that doesn't support duck typing. Duck typing is not putting arbitrary restrictions on what objects you accept, but only essential restrictions. And as noted above, if you return NotImplemented, then you get an error message that essentially says "this is impossible" instead of "this is what didn't work".

Here's proof of subclass trouble -- the idea is to make an immutable Counter:

Your subclass doesn't override iadd or any of the other mutation methods. Why would you expect it to be immutable if it doesn't?

For subclassing to work, the fix is:

    if not hasattr(other, 'items') or type(self) is not Counter:
        return NotImplemented

I've never seen a type check like that in a Python class, and I hope I never do. That breaks subclassing.

msg230544 - (view)

Author: Jim Jewett (Jim.Jewett) * (Python triager)

Date: 2014-11-03 15:56

If I know that a Counter (or any class X) can be updated in place, I will be surprised to find out that I'm using a different instance after a successful in-place operation.

I would even consider that (replacement of the original instance) a security risk, though it is mitigated by the fact that I don't see how to exploit it without already being able to create arbitrary subclasses.

For subclassing to work, the fix is:

    if not hasattr(other, 'items') or type(self) is not Counter:
        return NotImplemented

That breaks for Counter subclasses on the left hand side -- even a trivial subclass to change the repr would fail unless the _i* methods were all re-implemented, either by the otherwise-trivial Counter subclass or by the right-hand objects. If you start making it work for the obvious cases, you just make the failure conditions more obscure.

msg230547 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-03 17:38

Ethan stated:

The only problem with the design is that it does not play well with others. For duck-typeable just do a check on 'other' to see if it has an .items() method, and return NotImplemented if it does not. Oh, and check that 'self' is Counter, and also return NotImplemented if that fails.

R. David Murray replied:

No, that doesn't support duck typing. Duck typing is not putting arbitrary restrictions on what objects you accept, but only essential restrictions.

iadd will raise an AttributeError if 'other' doesn't have an 'items' method -- how is that an arbitrary restriction?

If 'other' doesn't have an 'items' but does know how to add itself to Counter, it won't matter because Counter is raising an exception instead of returning NotImplemented.

What other types return NotImplemented when 'other' is not the expected (sub)type? bytes, bytearray, int, float, string, dict, list, tuple, Enum, deque, defaultdict, OrderedDict, Decimal, Fraction, and Counter itself for every binary method /except/ the four inplace ops it supplies itself.

And as noted above, if you return NotImplemented, then you get an error message that essentially says "this is impossible" instead of "this is what didn't work".

Correct implementation should not be sacrificed for a nicer error message.

Here's proof of subclass trouble -- the idea is to make an immutable Counter: [snip example]

Your subclass doesn't override iadd or any of the other mutation methods. Why would you expect it to be immutable if it doesn't?

You're right, that was a bad example. A subclass would need to override any method whose behavior it wants to change.

What does not seem correct to me is raising an exception because the type of 'other' is not correct, instead of returning NotImplemented, and in 14 other core datatypes it does work that way, and even in Counter it works that way for any in-place method that Counter itself doesn't override, and it does work that way /in/ Counter for the normal binary ops that it /does/ override.

For subclassing to work, the fix is:

    if not hasattr(other, 'items') or type(self) is not Counter:
        return NotImplemented

I've never seen a type check like that in a Python class, and I hope I never do. That breaks subclassing.

Explanations and examples go a lot further than bare statements.

[Jim Jewitt posts an explanation.]

Ah, I see. Thank you, Jim.

Okay, I agree that the check against the base class is ill-advised (and I modified some of the above reply to match that understanding).

However, I am still unconvinced that 'other' should not be checked. Hopefully the python-dev thread will shed more light on this.

msg230559 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-03 20:21

Okay, the python-dev ruling is in, and raising an exception in the ixxx methods is allowed, and even a good idea in the cases of mutable containers.

However, doing the check on 'other' and raising a TypeError with an appropriate message would still be better for a couple reasons:

If backwards compatibility is a concern in this case, a custom "TypeError" that was a subclass of both TypeError and AttributeError could address that.

msg230665 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2014-11-05 08:17

However, doing the check on 'other' and raising a TypeError with an appropriate message would still be better

Let's be clear. These are duck-typed methods. A type check is inappropriate. Anything with o.items() is allowed regardless of type.

Also, I generally won't approve changes to existing APIs without compelling real-world use cases to motivate the design change. Otherwise, you just create unnecessary churn and consternation.

msg230692 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-05 16:04

Raymond declared:

Let's be clear. These are duck-typed methods. A type check is inappropriate. Anything with o.items() is allowed regardless of type.

Wikipedia explains (http://en.wikipedia.org/wiki/Duck_typing):

In computer programming with object-oriented programming languages, duck typing is an alternative to typing. In duck typing, an object's suitability for some purpose is determined by the presence of certain methods and properties [...]

I did use an actual 'type' check in one of my exmaples, and that was wrong.

It is possible to do a "duck-type check" with a hasattr(other, 'items').

I don't use Counter myself -- I'll try and find some real-world examples.

msg230694 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2014-11-05 16:12

There is no purpose served by changing the AttributeError into a TypeError. It's just extra unneeded code.

msg230695 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-05 16:36

I've posted to python-list and python-dev. I'll report back here the findings, if any.

msg230741 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2014-11-06 13:48

I've posted to python-list and python-dev. I'll report back here the findings, if any.

http://comments.gmane.org/gmane.comp.python.devel/150073

msg230749 - (view)

Author: Jim Jewett (Jim.Jewett) * (Python triager)

Date: 2014-11-06 15:21

I wish there were an APIMismatchError superclass to unify (AttributeError, TypeError). But the wart probably isn't enough to justify the surgery.

On Thu, Nov 6, 2014 at 8:48 AM, Serhiy Storchaka <report@bugs.python.org> wrote:

Serhiy Storchaka added the comment:

I've posted to python-list and python-dev. I'll report back here the findings, if any.

http://comments.gmane.org/gmane.comp.python.devel/150073


nosy: +serhiy.storchaka


Python tracker <report@bugs.python.org> <http://bugs.python.org/issue22766>


msg230807 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2014-11-07 14:34

No real-world use-cases have surfaced. Many thanks to everyone's explanations and examples.

History

Date

User

Action

Args

2022-04-11 14:58:09

admin

set

github: 66955

2014-11-07 14:34:32

ethan.furman

set

messages: +

2014-11-06 15:21:38

Jim.Jewett

set

messages: +

2014-11-06 13:48:44

serhiy.storchaka

set

nosy: + serhiy.storchaka
messages: +

2014-11-05 16:36:06

ethan.furman

set

messages: +

2014-11-05 16:12:11

r.david.murray

set

messages: +

2014-11-05 16:04:33

ethan.furman

set

messages: +

2014-11-05 08:17:29

rhettinger

set

messages: +

2014-11-03 20:21:46

ethan.furman

set

messages: +

2014-11-03 17:38:25

ethan.furman

set

messages: +

2014-11-03 15:56:36

Jim.Jewett

set

nosy: + Jim.Jewett
messages: +

2014-11-03 13:49:43

r.david.murray

set

messages: +

2014-11-03 10:53:09

ethan.furman

set

messages: +

2014-11-01 13:08:13

r.david.murray

set

messages: +

2014-11-01 06:42:44

rhettinger

set

status: open -> closed
resolution: not a bug
messages: +

2014-10-31 23:06:14

pitrou

set

messages: +

2014-10-31 23:01:19

ethan.furman

set

messages: +

2014-10-31 22:35:52

pitrou

set

messages: +

2014-10-31 22:31:20

ethan.furman

set

messages: +

2014-10-31 22:23:13

pitrou

set

messages: +

2014-10-31 22🔞47

rhettinger

set

messages: +

2014-10-31 22🔞07

ethan.furman

set

messages: +

2014-10-31 21:12:48

rhettinger

set

priority: normal -> low

messages: +

2014-10-30 18:04:49

pitrou

set

versions: - Python 2.7, Python 3.4
nosy: + pitrou

messages: +

type: behavior -> enhancement

2014-10-30 18:00:49

ethan.furman

set

messages: +

2014-10-30 17:50:23

r.david.murray

set

messages: +

2014-10-30 16:00:26

ethan.furman

set

messages: +

2014-10-30 15:56:30

r.david.murray

set

nosy: + r.david.murray
messages: +

2014-10-30 15:25:02

ethan.furman

set

versions: + Python 2.7, Python 3.5

2014-10-30 15:15:59

ethan.furman

set

assignee: rhettinger

nosy: + rhettinger, ethan.furman

2014-10-30 15:06:37

Joshua.Chin

create