Issue 713820: iconv_codec NG - Python tracker

This new implementation of iconv_codec resolves the following problems of the current implementations:

simplified chinese

"euc_cn": "zh_CN.euc", "iso_2022_zh": "zh_CN.iso2022-CN", "gbk": "zh_CN.gbk", "cp935": "zh_CN-cp935",

traditional chinese

"euc_tw": "zh_TW.euc", "iso_2022_tw": "zh_TW.iso2022-7", "big5": "zh_TW.big5", "cp937": "zh_TW.cp937",

japanese

"iso_2022_jp": "ISO-2022-JP", "euc_jp": "eucJP", "shift_jis": "PCK",

korean

"euc_kr": "ko_KR.euc", "iso_2022_kr": "ISO-2022-KR", "johab": "ko_KR.johap", "cp932": "ko_KR.cp932", "cp949": "ko_KR.cp949",

Also, multibyte codecs such as the CJK and iconv codecs would otherwise duplicate the code that processes error callbacks and handles streams, so I split that code out into a separate source file, multibytecodec.c. The CJK and iconv codecs can share it at the source level by putting multibytecodec.c in Modules/ and linking the file into each codec. Alternatively, if multibytecodec.c goes in Python/ and is linked into the main Python library, the codecs can be compiled and loaded on their own.

multibytecodec.c, the common multibyte codec framework, can be used by any ordinary multibyte encoding. With it, a codec writer can create a codec for a multibyte encoding without having to handle error callbacks or implement the StreamReader structure. I wrote the CJK codecs using it and will submit them as a patch in a separate patch report.
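To illustrate the kind of split described above, here is a rough sketch in Python (not the C code of multibytecodec.c, whose actual interface is not shown in this report). The class names MultibyteDecoder and ToyDecoder, the decode_one hook, and the two-byte "toy" encoding are all made up for the example. The only real APIs used are codecs.lookup_error and UnicodeDecodeError. The shared base class buffers incomplete characters between chunks and dispatches to the registered error callback, so the codec itself only supplies the per-character decoding step.

```python
import codecs


class MultibyteDecoder:
    """Hypothetical shared layer: buffering plus error-callback dispatch."""

    name = "toy"  # encoding name reported in UnicodeDecodeError

    def __init__(self, errors="strict"):
        self.errors = errors
        self.pending = b""  # bytes of a character that is not complete yet

    def decode_one(self, data, pos):
        """Codec-specific hook (assumed interface).

        Return (text, consumed). Return (None, 0) if the sequence at pos
        is incomplete. Raise ValueError if the sequence is illegal.
        """
        raise NotImplementedError

    def _recover(self, data, start, end, reason):
        # Delegate to whatever callback is registered under self.errors.
        # "strict" raises; "replace" and "ignore" return (text, new_pos).
        exc = UnicodeDecodeError(self.name, data, start, end, reason)
        return codecs.lookup_error(self.errors)(exc)

    def feed(self, chunk, final=False):
        data = self.pending + chunk
        out, pos = [], 0
        while pos < len(data):
            try:
                text, used = self.decode_one(data, pos)
            except ValueError:
                text, pos = self._recover(data, pos, pos + 1, "illegal sequence")
                out.append(text)
                continue
            if text is None:
                if not final:
                    break  # keep the tail and wait for more input
                text, pos = self._recover(data, pos, len(data), "truncated data")
                out.append(text)
                continue
            out.append(text)
            pos += used
        self.pending = data[pos:]
        return "".join(out)


class ToyDecoder(MultibyteDecoder):
    """Made-up two-byte encoding, used only for illustration."""

    TABLE = {b"\xa1\xa1": "\u3042", b"\xa1\xa2": "\u3044"}

    def decode_one(self, data, pos):
        lead = data[pos]
        if lead < 0x80:
            return chr(lead), 1  # single-byte ASCII
        if pos + 1 >= len(data):
            return None, 0  # trail byte has not arrived yet
        try:
            return self.TABLE[data[pos:pos + 2]], 2
        except KeyError:
            raise ValueError("unmapped byte pair")
```

With this layout, feeding b"a\xa1" and then b"\xa1b" to a ToyDecoder yields "a" followed by "\u3042b", since the dangling lead byte is held back until its trail byte arrives. A codec written this way contains no stream or error-handling logic of its own.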