[Python-Dev] PEP 263 considered faulty (for some Japanese)
Martin v. Loewis martin@v.loewis.de
12 Mar 2002 20:57:13 +0100
- Previous message: [Python-Dev] PEP 263 considered faulty (for some Japanese)
- Next message: [Python-Dev] PEP 263 considered faulty (for some Japanese)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
SUZUKI Hisao <suzuki611@oki.com> writes:
What we handle in Unicode with Python is often a document file in UTF-16. The default encoding is mainly applied to data from the document.
You should not use the default encoding for reading files. Instead, you should use codecs.open or some such to read in UTF-16 data.
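To illustrate Martin's suggestion, here is a minimal sketch of reading a UTF-16 document with codecs.open rather than relying on the interpreter's default encoding. (The example is written for modern Python so it runs as shown; the file path and contents are made up for the demonstration.)

```python
import codecs
import os
import tempfile

# Create a small UTF-16 document to read back.  Encoding with the
# plain "utf-16" codec prepends a BOM, which the decoder uses later.
path = os.path.join(tempfile.mkdtemp(), "doc.txt")
with open(path, "wb") as f:
    f.write(u"abc".encode("utf-16"))

# codecs.open decodes the bytes with the named codec as it reads,
# so the program never touches the default encoding at all.
with codecs.open(path, "r", encoding="utf-16") as f:
    text = f.read()

print(text)  # -> abc
```

The point of routing the decode through the open call is that the document's encoding is stated once, at the I/O boundary, instead of being smuggled in via a process-wide default.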
Yes, I mean such things. Please note that u'' is interpreted literally, and we cannot legally put Japanese characters in string literals for now anyway.
One of the primary rationales of the PEP is that you will be able to put arbitrary Japanese characters into u'<whatever-in-euc-jp>', and have it work correctly.
>>> unicode("\x00a\x00b\x00c")
u'abc'
You should use
unicode("\x00a\x00b\x00c", "utf-16")
instead.
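In modern Python, unicode(data, enc) is spelled data.decode(enc). One wrinkle worth noting when trying Martin's example today: the bytes above are big-endian UTF-16 with no BOM, and Python's plain "utf-16" codec assumes little-endian when no BOM is present, so the byte order has to be named explicitly for this particular byte string.

```python
# Big-endian UTF-16 bytes for "abc", with no BOM.
data = b"\x00a\x00b\x00c"

# Name the byte order explicitly; the bare "utf-16" codec would
# assume little-endian here and decode entirely different characters.
text = data.decode("utf-16-be")
print(text)  # -> abc
```

Either way, the encoding is passed explicitly at the decode site, which is exactly the discipline Martin is recommending over leaning on the default encoding.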
Regards, Martin