[Python-Dev] unicode and str

Neil Schemenauer nas at arctrix.com
Tue Aug 31 00:38:52 CEST 2004


On Mon, Aug 30, 2004 at 11:35:17PM +0200, "Martin v. Löwis" wrote:

> Neil Schemenauer wrote:
> >But unicode() will also return __str__, eg.
> >
> >    >>> class A:
> >    ...   def __str__(self):
> >    ...     return u'\u1234'
> >    ...
> >    >>> unicode(A())
> >    u'\u1234'
> >
> >Why would I want to use __unicode__?

> This class is incorrect: it does not support str().

Forgive me if I'm being obtuse, but I'm trying to understand the overall Python unicode design. This works:

>>> sys.getdefaultencoding()
'utf-8'
>>> str(A())
'\xe1\x88\xb4'

Can you be more specific about what is incorrect with the above class?
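
A rough sketch of why the result above depends on the interpreter's default encoding. It is written for Python 3, where the implicit conversion no longer exists, so the encode step that Python 2's str() applied to a unicode return value is spelled out by hand. The helper name implicit_str is hypothetical, and reading this as the reason the class was called incorrect is an assumption, not something stated in the thread.

```python
# Python 3 sketch: emulate the implicit encode step that Python 2's str()
# applied when __str__ returned a unicode object.
text = "\u1234"  # the character returned by A.__str__ in the example


def implicit_str(value, default_encoding):
    """Hypothetical helper: encode with the given default encoding."""
    return value.encode(default_encoding)


# With a default encoding of 'utf-8' (as in the session above) this succeeds.
utf8_result = implicit_str(text, "utf-8")
print(utf8_result)

# With 'ascii', the usual Python 2 default, the same conversion fails.
try:
    implicit_str(text, "ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    print("ascii default fails:", exc.reason)
```

So the same class may work under str() on one installation and raise UnicodeEncodeError on another, depending on sys.getdefaultencoding().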

> >Shouldn't we be heading to a
> >world where __str__ always returns unicode objects?

> No. In some cases, str() needs to compromise, where unicode() doesn't.

Sorry, I don't understand that statement. Are you saying that we will eventually get rid of __str__ and only have __unicode__?

[on having __str__ assume some character encoding]

> Perhaps. What are you proposing to do about this? Ban, from the face
> of the earth, what seems like a horrible design to you?

If only we could. :-) Seriously though, I'm trying to understand the point of __unicode__. To me it seems to make the transition to unicode strings needlessly more complicated.

Neil


