[Python-Dev] Adding Japanese Codecs to the distro

Hye-Shik Chang perky@fallin.lv
Thu, 16 Jan 2003 20:38:55 +0900


On Thu, Jan 16, 2003 at 11:05:55AM +0100, Martin v. Löwis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
>
> > Thoughts ?
>
> I'm in favour of adding support for Japanese codecs, but I wonder whether we shouldn't incorporate the C version of the Japanese codecs package instead, despite its size.

And the most important advantage that the C version has over the pure-Python version is that its library text segment is shared between processes. Most modern OSes can share it, and in the case of KoreanCodecs 2.1.x (in CVS) the C version is even smaller than the Python version.

Here's the process status on a FreeBSD 5.0/i386 system with Python 2.3a1 (of 2003-01-15):

USER    PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
perky 56713  0.0  1.2  3740  3056  p3  S+    8:11PM   0:00.08 python  : python without any codecs
perky 56739  6.3  5.7 15376 14728  p3  S+    8:17PM   0:04.02 python  : python with python.cp949 codec
perky 56749  0.0  1.2  3884  3196  p3  S+    8:20PM   0:00.06 python  : python with c.cp949 codec

alice(perky):/usr/pkg/lib/python2.3/site-packages/korean% size _koco.so
   text    data     bss     dec     hex filename
 122861    1844      32  124737   1e741 _koco.so

With the C codec, processes share 122861 bytes of text system-wide and each consumes only 1844 bytes of data, whereas with the pure-Python codec each process consumes roughly 12 megabytes. This should be taken very seriously for the launch time of scripts that have "# encoding: euc-jp" or some other CJK encoding.
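To make the arithmetic explicit, here is a small sketch that subtracts the codec-free baseline from the RSS column of the ps listing above. It assumes ps reports RSS in kilobytes; the dictionary keys and the helper name codec_overhead_kb are only illustrative labels, not part of KoreanCodecs.

```python
# RSS figures copied from the ps listing above (assumed to be in kilobytes).
RSS_KB = {
    "no codec": 3056,
    "python.cp949": 14728,
    "c.cp949": 3196,
}


def codec_overhead_kb(rss_kb, baseline="no codec"):
    """Return the extra resident memory per process relative to the baseline."""
    base = rss_kb[baseline]
    return {name: kb - base for name, kb in rss_kb.items() if name != baseline}


overhead = codec_overhead_kb(RSS_KB)
for name, kb in overhead.items():
    # Report each codec's extra footprint in KB and (approximately) in MB.
    print(f"{name}: +{kb} KB ({kb / 1024:.1f} MB)")
```

By this measure the pure-Python codec adds about 11.4 MB of resident memory per process, while the C codec adds about 140 KB.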

> I would also suggest that it might be more worthwhile to expose platform codecs, which would give us all CJK codecs on a number of major platforms, with a minimum increase in the size of the Python distribution, and with very good performance.

KoreanCodecs is tested on {Free,Net,Open}BSD, Linux, Solaris, HP-UX, Windows{95,98,NT,2000,XP}, and Cygwin without any platform #ifdef's. I am sure that any CJK codec can be ported to any platform that Python has been ported to.
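For anyone who wants to check a given installation, a minimal sketch using the standard codecs.lookup() call shows whether a codec is registered under a given name. The helper name codec_available and the list of encoding names are only examples; which ones are found depends on what is installed for that interpreter.

```python
import codecs


def codec_available(name):
    """Return True if this interpreter can find a codec registered under name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# Example names only; results vary with the Python build and installed packages.
for name in ("cp949", "euc_kr", "euc_jp", "shift_jis"):
    print(name, codec_available(name))
```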

Regards,

Hye-Shik =)