[Python-Dev] unicode inconsistency?

Tim Peters tim.peters at gmail.com
Thu Sep 9 20:44:56 CEST 2004


[Neil Schemenauer]

Perhaps this is more appropriate for python-list, but it looks like a bug to me. Example code:

    class A:
        def __str__(self):
            return u'\u1234'

    '%s' % u'\u1234'  # this works
    '%s' % A()        # this doesn't work

It will work if 'A' subclasses from 'unicode' but should not be necessary, IMHO.

You know better than to say "doesn't work". I assume you mean the latter raises UnicodeEncodeError.

Any reason why this shouldn't be fixed?

Didn't we just go thru this, last week or so? PyObject_Str() never returns a unicode (it returns a str). That is, str(A()) raises UnicodeEncodeError, and that's out of interpolation's hands. As Martin said last time, a __str__ method that returns a unicode doesn't make much sense.
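To make the failure concrete, here is a minimal sketch of the conversion step described above. It assumes the default encoding is ASCII (the stock setting) and uses a hypothetical helper, encode_like_str(), that only models what happens to a unicode __str__ result; it is not CPython's actual code.

```python
# Sketch only: models how a unicode __str__ result gets turned into a
# byte string with the default codec. encode_like_str is a hypothetical
# name used for illustration, not a real CPython function.

def encode_like_str(result, default_encoding="ascii"):
    """Encode a unicode result with the default codec (assumed ASCII)."""
    return result.encode(default_encoding)

# Pure-ASCII text encodes without trouble.
ascii_ok = encode_like_str(u"abc")

# A character outside ASCII cannot be encoded, so the conversion fails
# before string interpolation ever sees a value.
try:
    encode_like_str(u"\u1234")
    failed = False
except UnicodeEncodeError:
    failed = True
```

Under these assumptions the second call raises UnicodeEncodeError, which matches the error reported for str(A()).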

I'm not sure you really mean "it will work if 'A' subclasses from 'unicode'" either:

    >>> class A(unicode):
    ...     def __str__(self):
    ...         return u'\u1234'
    ...
    >>> '%s' % A()
    u''
    >>> len(_)
    0

That is, A.__str__ is ignored if A subclasses from Unicode. So "doesn't blow up" seems more on-target than "works" -- I don't think you expected an empty Unicode string here.


