[I18n-sig] Re: [Python-Dev] Unicode debate
M.-A. Lemburg mal@lemburg.com
Tue, 02 May 2000 12:46:06 +0200
Moshe Zadka wrote:
> I'd much prefer Python to reflect a fundamental truth about Unicode, which at least makes sure binary-goop can pass through Unicode and remain unharmed, than to reflect a nasty problem with UTF-8 (not everything is legal).
Let's not make the same mistake again: Unicode objects should not be used to hold binary data. Please use buffers instead.
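To make the two positions concrete, here is a minimal sketch. It assumes a present-day Python 3 interpreter, where `bytes` plays the role of the binary container; the variable names are only illustrative and nothing here is part of the original proposal. It shows that arbitrary bytes are not all legal UTF-8, that a one-to-one 8-bit mapping such as Latin-1 does let them pass through text unharmed, and that a byte container needs no decoding at all.

```python
# Every possible byte value, i.e. typical "binary goop".
goop = bytes(range(256))

# Not every byte sequence is legal UTF-8, so decoding can fail.
try:
    goop.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# A one-to-one 8-bit mapping (Latin-1) round-trips the data unchanged.
latin1_roundtrip = goop.decode("latin-1").encode("latin-1") == goop

# Keeping the data in a byte container avoids the question entirely.
stored = bytes(goop)
```

Which of these behaviours the language should bless is exactly what is being debated; the sketch only shows that both observations hold.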
BTW, I think that this behaviour should be changed:
>>> buffer('binary') + 'data'
'binarydata'
while:
>>> 'data' + buffer('binary')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: illegal argument type for built-in operation
IMHO, buffer objects should never coerce to strings, but instead return a buffer object holding the combined contents. The same applies to slicing buffer objects:
>>> buffer('binary')[2:5]
'nar'
should preferably be buffer('nar').
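The behaviour asked for above can be sketched in a few lines. This is only an illustration, written for a present-day Python 3 interpreter: `Buf` and `_raw` are hypothetical names, and this is not how the built-in buffer object is implemented. Concatenation in either order and slicing all return a `Buf` instead of decaying to a string.

```python
def _raw(obj):
    """Return the underlying bytes of a Buf or of any bytes-like object."""
    if isinstance(obj, Buf):
        return obj._data
    # memoryview() raises TypeError for objects that are not bytes-like
    # (text strings, integers, ...), so nothing is coerced silently.
    return bytes(memoryview(obj))


class Buf:
    """Hypothetical buffer type whose operations always yield a Buf."""

    def __init__(self, data=b""):
        self._data = _raw(data)

    def __add__(self, other):
        # Buf + bytes-like -> Buf
        return Buf(self._data + _raw(other))

    def __radd__(self, other):
        # bytes-like + Buf -> Buf (the order that raised TypeError above)
        return Buf(_raw(other) + self._data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice of a buffer is again a buffer.
            return Buf(self._data[index])
        return self._data[index]

    def __eq__(self, other):
        return isinstance(other, Buf) and self._data == other._data

    def __repr__(self):
        return "Buf(%r)" % (self._data,)
```

With this sketch, `Buf(b'binary') + b'data'`, `b'data' + Buf(b'binary')` and `Buf(b'binary')[2:5]` all produce `Buf` instances.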
--
Hmm, perhaps we need something like a data string object to get this 100% right ?!
>>> d = data("...data...")
or
>>> d = d"...data..."
>>> print type(d)
<type 'data'>
>>> 'string' + d
d"string...data..."
>>> u'string' + d
d"s\000t\000r\000i\000n\000g\000...data..."
>>> d[:5]
d"...da"
etc.
Ideally, string and Unicode objects would then be subclasses of this type in Py3K.
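A rough sketch of how such a data string could behave, again assuming a present-day Python 3 interpreter. `Data`, `_coerce` and `TEXT_ENCODING` are hypothetical names. The sketch maps 8-bit strings onto `bytes` and Unicode strings onto `str`, and it assumes UTF-16 little-endian for text only because that reproduces the `s\000t\000...` output in the example above; the proposal itself does not fix an encoding.

```python
class Data(bytes):
    """Hypothetical 'data string': raw bytes that absorb whatever is added."""

    # Assumption: chosen to mirror the s\000t\000r\000... example above.
    TEXT_ENCODING = "utf-16-le"

    @classmethod
    def _coerce(cls, other):
        if isinstance(other, str):
            # Text contributes its encoded representation.
            return other.encode(cls.TEXT_ENCODING)
        if isinstance(other, (bytes, bytearray, memoryview)):
            return bytes(other)
        return NotImplemented

    def __add__(self, other):
        raw = self._coerce(other)
        if raw is NotImplemented:
            return NotImplemented
        return Data(bytes.__add__(self, raw))

    def __radd__(self, other):
        raw = self._coerce(other)
        if raw is NotImplemented:
            return NotImplemented
        return Data(raw + bytes(self))

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # Slices stay data strings; single items are plain integers.
        return Data(result) if isinstance(index, slice) else result

    def __repr__(self):
        # Render as d'...' instead of b'...'.
        return "d" + super().__repr__()[1:]
```

For example, with `d = Data(b"...data...")`, both `b"string" + d` and `"string" + d` yield `Data` objects, and `d[:5]` is the data string holding `...da`.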
--
Marc-Andre Lemburg
Business:      http://www.lemburg.com/
Python Pages:  http://www.lemburg.com/python/