[Python-Dev] Adding Japanese Codecs to the distro
Atsuo Ishimoto ishimoto@axissoft.co.jp
Thu, 16 Jan 2003 20:08:21 +0900
Hello from Japan,
On 16 Jan 2003 11:05:55 +0100 martin@v.loewis.de (Martin v. Löwis) wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> > Thoughts ? I'm in favour of adding support for Japanese codecs, but I wonder whether we shouldn't incorporate the C version of the Japanese codecs package instead, despite its size.
I also vote for JapaneseCodec. As for its size, the JapaneseCodec package is much larger because it contains both a C version and a pure Python version. The C version part of JapaneseCodec is about 160 KB (compiled on Windows), and I don't think that makes a difference.
> If Suzuki's code is incorporated, I'd like to get independent confirmation that it is actually correct. I know Tamito has taken many iterations until it was correct, where "correct" is a somewhat fuzzy term, since there are some really tricky issues for which there is no single one correct solution (like whether \x5c is a backslash or a Yen sign, in these encodings).
Yes, Tamito's JapaneseCodec has been used for years by many Japanese users, while I had never heard of Suzuki's codecs before.
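To make the \x5c question concrete, here is a minimal sketch that asks a codec which code point it assigns to the byte 0x5C. It assumes a present-day Python 3 whose standard library already ships codecs named "shift_jis", "cp932" and "euc_jp"; those names are an assumption about current Python, not about either package discussed in this thread. The helper decode_0x5c is hypothetical and exists only for illustration.

```python
BACKSLASH = 0x005C  # U+005C REVERSE SOLIDUS
YEN_SIGN = 0x00A5   # U+00A5 YEN SIGN


def decode_0x5c(encoding):
    """Return the Unicode code point that `encoding` assigns to byte 0x5C."""
    return ord(b"\x5c".decode(encoding))


# Report each codec's choice instead of assuming one; the answer is a
# design decision made by whoever built the mapping table.
for name in ("shift_jis", "cp932", "euc_jp"):
    cp = decode_0x5c(name)
    label = {BACKSLASH: "backslash", YEN_SIGN: "yen sign"}.get(cp, "other")
    print(f"{name}: 0x5C -> U+{cp:04X} ({label})")
```

Whichever choice a codec makes, text containing file paths or escape sequences will only round-trip if the encoder makes the matching choice, which is why independent confirmation of the tables is worth asking for.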
mapping tables are extracted from Java, through Jython.
> I also dislike absence of the cp932 encoding in Suzuki's codecs. The suggestion to equate this to "mbcs" on Windows is not convincing, as a) "mbcs" does not mean cp932 on all Windows installations, and b) cp932 needs to be processed on other systems, too.
Agreed.
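A short sketch of why a named cp932 codec is preferable to relying on "mbcs": the caller states the encoding explicitly, so the result does not depend on the platform or its locale. This assumes a present-day Python 3 that provides a "cp932" codec in its standard library; codec_available and decode_windows_japanese are hypothetical helper names used only for this illustration.

```python
import codecs


def codec_available(name):
    """Return True if a codec with this name can be looked up here."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def decode_windows_japanese(data):
    """Decode bytes that are known to be cp932, on any platform."""
    # The encoding is named explicitly rather than taken from the
    # Windows ANSI code page, which is only cp932 on Japanese systems.
    return data.decode("cp932")


# "mbcs" is expected to exist only on Windows; "cp932" should not depend
# on the operating system.
print("mbcs available: ", codec_available("mbcs"))
print("cp932 available:", codec_available("cp932"))
print(decode_windows_japanese(b"\x93\xfa\x96\x7b"))
```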
> I think cp932 could be implemented as a delta to shift-jis, as shown in
> http://hp.vector.co.jp/authors/VA003720/lpproj/test/cp932sj.htm (although I wonder why they don't list the backslash issue as a difference between shift-jis and cp932)
http://www.ingrid.org/java/i18n/unicode-utf8.html may be a better reference. That page is written in English and encoded in UTF-8.
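The "delta" idea can be sketched by decoding every candidate double-byte sequence with both codecs and keeping only the sequences where the results differ. This is an illustration under stated assumptions, not the method used by either package in this thread: it assumes a present-day Python 3 with "shift_jis" and "cp932" codecs in the standard library, and try_decode, double_byte_sequences and cp932_delta are hypothetical names.

```python
def try_decode(data, encoding):
    """Decode `data`, returning None when the sequence is not assigned."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


def double_byte_sequences():
    """Yield every two-byte sequence in the Shift JIS lead/trail ranges."""
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0x80, 0xFD))
    for lead in lead_bytes:
        for trail in trail_bytes:
            yield bytes([lead, trail])


def cp932_delta():
    """Map each differing sequence to (shift_jis result, cp932 result)."""
    delta = {}
    for seq in double_byte_sequences():
        sjis = try_decode(seq, "shift_jis")
        cp932 = try_decode(seq, "cp932")
        if sjis != cp932:
            delta[seq] = (sjis, cp932)
    return delta


delta = cp932_delta()
print(len(delta), "double-byte sequences decode differently")
```

The resulting table contains two kinds of entries: sequences that only cp932 assigns (vendor extensions such as the circled digits) and sequences that both assign but map to different Unicode characters (such as the wave dash at 0x81 0x60). Single-byte differences like the \x5c question are outside this scan and would need separate handling.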
Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp