[Python-Dev] unicode inconsistency?
Tim Peters tim.peters at gmail.com
Thu Sep 9 22:59:18 CEST 2004
[Tim]
'%s' is documented as "String (converts any python object using str())". It's str(A()) that raises the exception you're seeing, not interpolation.
[Neil]
Shouldn't '%s' % u'\u1234' also raise an exception then?
Yes, but the existence of one undocumented extension isn't sufficient reason to multiply them. The "Unicode exception" here is at least easy to explain. To make your case work, we somehow have to explain that although virtually all ways of invoking __str__ produce an 8-bit encoding of a unicode return value, for some magical reason str.__mod__ does not. The existing "Unicode exception" consists solely of saying "but unicode inputs don't invoke str(), and force the interpolation to get passed to unicode.__mod__ instead".
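The two rules in that paragraph can be modelled in a few lines. This is only a sketch in Python 3, where bytes stands in for the 8-bit str of the time and str stands in for unicode. The names legacy_str, legacy_percent_s, and DEFAULT_ENCODING are made up for illustration, and class A is a stand-in whose __str__ is assumed to return a non-ASCII unicode value (the thread's actual class isn't quoted here).

```python
DEFAULT_ENCODING = "ascii"  # assumed default encoding of a stock install


def legacy_str(obj):
    """Model of the old str(obj): the result is always 8-bit (bytes here)."""
    if isinstance(obj, bytes):
        return obj
    result = obj if isinstance(obj, str) else obj.__str__()
    if isinstance(result, str):
        # A unicode return value gets encoded with the default encoding.
        result = result.encode(DEFAULT_ENCODING)
    return result


def legacy_percent_s(arg):
    """Model of '%s' % arg for a single argument."""
    if isinstance(arg, str):
        # The "Unicode exception": str() is never invoked; the interpolation
        # is handed to unicode.__mod__, so the result stays unicode.
        return arg
    # Everything else goes through str(), as documented for '%s'.
    return legacy_str(arg)


class A:
    """Hypothetical stand-in: __str__ returns non-ASCII unicode."""

    def __str__(self):
        return "\u1234"


if __name__ == "__main__":
    print(repr(legacy_percent_s("\u1234")))  # unicode in, unicode out
    print(repr(legacy_percent_s(42)))        # ordinary object, 8-bit result
    try:
        legacy_percent_s(A())
    except UnicodeEncodeError as exc:
        print("str(A()) failed:", exc.reason)
```

In this model the failure for A() comes from legacy_str, not from the interpolation step, which is the distinction being drawn above.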
[Neil]
Yes. I want something like "PyObjectUnicodeOrStr" that would return either a unicode object or a str object. That would make it easier to write code that produces 'str' results if unicode characters don't appear in any of the inputs.
I think biting the Unicode bullet whole is saner, but suit yourself.
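For concreteness, here is one guess at what a helper of that kind might look like, again as a Python 3 model with bytes playing the 8-bit str and str playing unicode. The names unicode_or_str and concat are hypothetical, no such function is claimed to exist in the C API, and the actual proposal may well have differed.

```python
def unicode_or_str(value):
    """Return 8-bit data when the value is pure ASCII, otherwise text."""
    if isinstance(value, bytes):
        return value
    text = value if isinstance(value, str) else str(value)
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII characters present: keep the unicode form.
        return text


def concat(parts):
    """Join parts, staying 8-bit unless some input forces unicode."""
    converted = [unicode_or_str(p) for p in parts]
    if all(isinstance(c, bytes) for c in converted):
        return b"".join(converted)
    # At least one part is unicode, so promote the 8-bit parts. Non-ASCII
    # bytes would raise UnicodeDecodeError here, much as mixing did then.
    return "".join(
        c.decode("ascii") if isinstance(c, bytes) else c for c in converted
    )


if __name__ == "__main__":
    print(repr(concat(["a", 1])))         # all ASCII: 8-bit result
    print(repr(concat(["a", "\u1234"])))  # one unicode input: unicode result
```

The result type depends on the data rather than on the code, which is the trade-off the "bite the bullet whole" position objects to.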
[Neil]
Having __str__ methods that can return either 'unicode' or 'str' objects is also very handy (I don't see how you can say that it doesn't make any sense).
Didn't we go thru that last week? Yes:
[Neil]
[... the same class as today's class ...]
[Martin]
> This class is incorrect: it does not support str().
[Neil]
> Can you be more specific about what is incorrect with the above
> class?
[Martin]
In the default installation, it gives a UnicodeEncodeError.
You didn't respond to that (at least not that I saw), so I assumed you accepted Martin's nag. Having a __str__ that returns a unicode object that the default encoding can't handle is clearly (IMO) begging for trouble.
[Neil]
Perhaps I am on the wrong track. However, if I understand the /F bot correctly, he favours a design that does not force everything to unicode strings.
Saying it doesn't make sense to have a __str__ method return a Unicode value that can't be encoded as a str isn't asking anyone to force anything to Unicode. str is still trying hard to retain a distinction between str and unicode. PyObject_Unicode() no longer plays along with that distinction, but I (mildly) wish it still did.