[Python-Dev] Unicode charmap decoders slow
Hye-Shik Chang hyeshik at gmail.com
Wed Oct 5 17:06:06 CEST 2005
On 10/5/05, M.-A. Lemburg <mal at egenix.com> wrote:
> Of course, a C version could use the same approach as the unicodedatabase module: that of compressed lookup tables...
>
> http://aggregate.org/TechPub/lcpc2002.pdf
>
> genccodec.py anyone ?
I once wrote a test codec for single-byte character sets to evaluate algorithms for use in CJKCodecs (it's not a direct implementation of what you've mentioned, though). I just ported it to unicodeobject (as attached). It shows noticeably better results than the charmap codecs:
% python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('iso8859-1')"
10 loops, best of 3: 96.7 msec per loop
% ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('iso8859_10_fc')"
10 loops, best of 3: 22.7 msec per loop
% ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('utf-8')"
100 loops, best of 3: 18.9 msec per loop
(Note that it doesn't contain any documentation nor good error handling yet. :-)
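For readers without the attachment, here is a rough sketch of the general idea behind a table-driven single-byte decoder: precompute a 256-entry table once, then decode with one direct index per byte instead of a dictionary lookup. This is only an illustration written for current Python, not the code in fastmapcodec.diff; the names build_decode_table and fast_decode are made up for this example, and mapping undefined bytes to U+FFFD is an assumption.

```python
# Illustrative sketch only; this is NOT the attached fastmapcodec patch.
# build_decode_table and fast_decode are hypothetical names.

def build_decode_table(encoding):
    """Return a 256-character string: index = byte value, item = decoded character."""
    chars = []
    for byte in range(256):
        try:
            chars.append(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            # Assumption: undefined bytes map to the replacement character.
            chars.append("\ufffd")
    return "".join(chars)


def fast_decode(data, table):
    """Decode a bytes object using one direct table index per input byte."""
    return "".join([table[b] for b in data])


if __name__ == "__main__":
    table = build_decode_table("iso8859_10")
    print(fast_decode(b"abc", table))
```

A C version would do the same indexing into a static array, which avoids the per-character dictionary lookup that a dict-based charmap decoder performs.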
Hye-Shik

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastmapcodec.diff
Type: application/octet-stream
Size: 18814 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20051006/2106c236/fastmapcodec-0001.obj