Issue 13360: UnicodeWarning raised on sequence and set comparisons

Created on 2011-11-06 20:22 by flox, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg147181 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2011-11-06 20:22
The UnicodeWarning is raised on some dict or set operations. It is not very helpful, and sometimes annoying. And it is somewhat inconsistent.

# ** warning not raised **

$ python2.7 -c "print u'd\xe9' in {'foo', 'bar'}"
False
$ python2.7 -c "print 'd\xe9' in {u'foo', u'bar'}"
False
$ python2.7 -c "print 'd\xc3\xa9' in {u'foo', u'd\xe9'}"
False

# ** warning raised **

$ python2.7 -c "print 'd\xe9' in {u'foo', u'd\xe9'}"
-c:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
$ python2.7 -c "print u'd\xe9' in {'foo', 'd\xe9'}"
-c:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
msg147183 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2011-11-06 20:46
Similar expressions where the warning is raised or not (depending on "latin-1" comparison):

$ python2.7 -c "print u'd\xe9' in {'foo', 'd\xe9r'}"
False
$ python2.7 -c "print u'd\xe9' in {'foo', 'd\xe9'}"
-c:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
$ python2.7 -c "print 'd\xe9r' in {u'foo', u'd\xe9'}"
False
$ python2.7 -c "print 'd\xe9' in {u'foo', u'd\xe9'}"
-c:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
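
The pattern in the two messages above is consistent with how set membership works: the equality comparison only runs when the probe and a stored member have the same hash. In Python 2.7 a byte string and a unicode string whose bytes match the code points one for one (the "latin-1" case) appear to hash alike, so `==` is attempted and fails to decode the non-ASCII byte string. The following is a minimal Python 3 sketch of that mechanism only. `HashProbe` and `count_warnings` are hypothetical names introduced for illustration, and the class merely imitates the Python 2 str/unicode comparison by warning and answering "unequal".

```python
import warnings


class HashProbe:
    """Hypothetical stand-in for a Python 2 str/unicode pair.

    The hash is chosen by the caller, and __eq__ always warns and
    reports "unequal", imitating the failed comparison in the report.
    """

    def __init__(self, label, hash_value):
        self.label = label
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        warnings.warn("comparison failed; treating as unequal", UnicodeWarning)
        return False


def count_warnings(probe, members):
    """Return (membership result, number of warnings emitted by the lookup)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = probe in members
    return result, len(caught)


members = {HashProbe("stored", 1)}

# Same hash as the stored member: __eq__ runs, so the warning appears.
same_hash = count_warnings(HashProbe("same hash", 1), members)

# Different hash: the lookup never calls __eq__, so there is no warning.
other_hash = count_warnings(HashProbe("other hash", 2), members)
```

Both lookups report that the probe is absent; only the one with the matching hash triggers the warning, which mirrors the `'d\xe9'` versus `'d\xe9r'` cases above.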
msg147187 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-11-06 21:23
I fail to see the issue. What exactly is the problem with the warning? It looks all consistent and helpful to me.
msg147188 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2011-11-06 21:43
Often sequences or sets have heterogeneous keys, mixing str and unicode, and in this case there's no easy way to work with them without raising this UnicodeWarning. The "logging" module is such an example. Of course it is only a warning, not a strong annoyance.
msg147191 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-11-06 22:46
> Often sequences or sets have heterogeneous keys, mixing str and
> unicode, and in this case there's no easy way to work with them
> without raising this UnicodeWarning.

That's a bug in the application. You must not mix byte strings and unicode strings as dictionary keys. Whether or not the warning is produced: the behavior would still be fairly unpredictable, and vary with Python versions.

> Of course it is only a warning, not a strong annoyance.

This warning is deliberate. It tells the developer that something is broken about this application.
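
One way to follow the advice above is to convert every key to text before it enters the set or dict, so lookups never compare a byte string with a unicode string. This is a minimal sketch, not something proposed in the thread: `to_text` is a hypothetical helper, and UTF-8 is an assumed encoding that the application would have to confirm for its own data.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings with the given encoding.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Keys arrive as a mix of byte strings and text; normalize them on the way in.
raw_keys = [b"foo", "d\xe9"]
keys = {to_text(key) for key in raw_keys}

# Normalize the probe the same way before the membership test.
found = to_text(b"d\xc3\xa9") in keys
```

With every key and probe normalized, the UTF-8 encoded byte string and the text string for the same word compare equal instead of being treated as different values.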
msg147310 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2011-11-08 17:13
Then we'll live with it, or work around it where possible. For the purists, fortunately we have Python 3, which allows mixing bytes and strings in a set().
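
The Python 3 behavior mentioned in the last message can be checked with a short sketch. The set contents below are made up for illustration. In Python 3 a bytes object and a str never compare equal, so the lookup simply reports the bytes value as absent; note that running the interpreter with the `-b` option makes such comparisons emit a BytesWarning.

```python
# A set holding both text and bytes members (illustrative values).
mixed = {"foo", "d\xe9", b"bar"}

# bytes never equals str in Python 3, so this lookup finds nothing.
found_bytes = b"d\xe9" in mixed

# Members are still found when probed with a value of their own type.
found_text = "d\xe9" in mixed
found_bar = b"bar" in mixed
```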
History
Date User Action Args
2022-04-11 14:57:23 admin set github: 57569
2011-11-08 17:13:30 flox set status: open -> closed; resolution: rejected; messages: +
2011-11-06 23:24:45 flox unlink issue13356 dependencies
2011-11-06 22:46:37 loewis set messages: +
2011-11-06 21:43:12 flox set messages: + title: UnicodeWarning raised on dict() and set() -> UnicodeWarning raised on sequence and set comparisons
2011-11-06 21:23:50 loewis set nosy: + loewis; messages: +
2011-11-06 20:46:08 flox set messages: +
2011-11-06 20:27:12 flox link issue13356 dependencies
2011-11-06 20:22:11 flox create