[Python-Dev] unicode and __str__
Tim Peters tim.peters at gmail.com
Mon Aug 30 22:41:10 CEST 2004
[Neil Schemenauer]
> ...
> The only thing I found in the NEWS file that seemed relevant is
> this note:
>
>   u'%s' % obj will now try obj.__unicode__() first and fallback to
>   obj.__str__() if no __unicode__ method can be found.
>
> I don't think that describes the behavior difference.  Allowing
> __str__ to return unicode strings seems like a pretty noteworthy
> change (assuming that's what actually happened).
It's confusing. A __str__ method or tp_str type slot can return unicode, but what happens after that depends on the caller. PyObject_Str() and PyObject_Repr() then try to encode it as an 8-bit string. But unicode.__mod__ says "oh, cool -- I'm done".
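The caller-dependent split described above can be modelled in a few lines. This is only a sketch, written for Python 3 so that it runs today: Python 3's str stands in for the 2.x unicode type, and bytes stands in for the 8-bit str. The names A, model_pyobject_str and model_unicode_mod are made up for illustration and are not CPython functions; the ASCII encoding is assumed from the 'ascii' codec named in the tracebacks.

```python
# Model only: Python 3 str plays the role of 2.x unicode,
# and bytes plays the role of the 2.x 8-bit str.

class A:
    """Object whose __str__ returns "unicode" text."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


def model_pyobject_str(obj):
    """Hypothetical stand-in for PyObject_Str(): insists on an 8-bit result."""
    result = obj.__str__()
    if isinstance(result, str):
        # Assumed default encoding is ASCII, so non-ASCII text
        # raises UnicodeEncodeError here.
        result = result.encode("ascii")
    return result


def model_unicode_mod(obj):
    """Hypothetical stand-in for u'%s' % obj: keeps unicode as it is."""
    return obj.__str__()
```

With this model, model_pyobject_str(A("a")) gives b'a', model_pyobject_str(A("\u1234")) raises UnicodeEncodeError, and model_unicode_mod(A("\u1234")) returns the text untouched.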
> Also, I'm a little unclear on the purpose of the __unicode__ method.
> If you can return unicode from __str__ then why would I want to
> provide a __unicode__ method?
Is the purpose clearer if you purge your mind of the belief that str() (as opposed to __str__!) can return unicode? Here w/ current CVS:
>>> class A:
...     def __str__(self): return u'a'
>>> print A()
a
>>> type(str(A()))
<type 'str'>
>>> class A:
...     def __str__(self): return u'\u1234'
>>> print A()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\u1234' in position 0: ordinal not in range(128)
>>> '%s' % A()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\u1234' in position 0: ordinal not in range(128)
>>> u'%s' % A()
u'\u1234'
So unicode.__mod__ is what's special here. But not sure that helps.
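The lookup order stated in the NEWS note quoted at the top (try __unicode__ first, fall back to __str__) can be sketched as below. This is a rough model for Python 3, where __unicode__ has no special meaning and is called explicitly; OnlyStr, Both and model_unicode_percent_s are hypothetical names, and the real logic lives in C inside the interpreter.

```python
class OnlyStr:
    """Defines __str__ only, so the fallback path is taken."""

    def __str__(self):
        return "from __str__"


class Both:
    """Defines both methods, so __unicode__ wins."""

    def __str__(self):
        return "from __str__"

    def __unicode__(self):
        # Ordinary method in Python 3; the model below calls it explicitly.
        return "from __unicode__"


def model_unicode_percent_s(obj):
    """Hypothetical stand-in for the lookup that u'%s' % obj performs."""
    unicode_method = getattr(obj, "__unicode__", None)
    if unicode_method is not None:
        return unicode_method()
    # No __unicode__ method found: fall back to __str__.
    return obj.__str__()
```

Here model_unicode_percent_s(Both()) returns 'from __unicode__', while model_unicode_percent_s(OnlyStr()) returns 'from __str__'.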