Issue 22929: cp874 encoding almost empty


Created on 2014-11-24 10:39 by era, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg231596 - Author: (era) Date: 2014-11-24 10:39
I created a simple script to map character codes in the 8-bit range to Unicode for simple lookup: https://github.com/tripleee/8bit. In the generated output, on Python 2.6.6 (and corroborated on Python 2.7.6), almost all character codes come up as "undefined" in CP874. According to http://en.wikipedia.org/wiki/ISO/IEC_8859-11, CP874 should be a superset of ISO-8859-11, with a few character codes *added* in the ISO control range.
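The superset claim above can be checked directly against the codecs that ship with Python. The following is a minimal sketch for Python 3, not the reporter's script; the helper names `decode_byte` and `differing_bytes` are hypothetical and introduced only for illustration. It lists the byte values that the `cp874` and `iso8859_11` codecs decode differently.

```python
def decode_byte(value, encoding):
    """Return the decoded character for one byte, or None if undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def differing_bytes(first, second):
    """Return the byte values that the two codecs decode differently."""
    return [
        value
        for value in range(256)
        if decode_byte(value, first) != decode_byte(value, second)
    ]


differences = differing_bytes("cp874", "iso8859_11")
for value in differences:
    # Show each differing byte with what either codec makes of it.
    print(
        hex(value),
        repr(decode_byte(value, "cp874")),
        repr(decode_byte(value, "iso8859_11")),
    )
```

With Python's bundled tables, the differences should be confined to the 0x80-0x9F range, where `iso8859_11` yields C1 control characters and `cp874` either defines a printable character or leaves the byte undefined.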
msg231598 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2014-11-24 11:02
I'm not sure I understand the bug report. What's the problem? :-) The codec is a charmap codec generated from the file MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT (http://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT). This mapping does have quite a few undefined characters.
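To see which positions the mapping leaves undefined, one can try to decode every single byte and collect the failures. This is a minimal sketch for Python 3; the function name `undefined_bytes` is hypothetical, and it relies only on the standard behaviour that decoding an unmapped byte in strict mode raises `UnicodeDecodeError`.

```python
def undefined_bytes(encoding):
    """Return the byte values the codec cannot decode in strict mode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            # The charmap has no entry for this byte.
            missing.append(value)
    return missing


missing = undefined_bytes("cp874")
print(len(missing), "undefined byte values:")
print(" ".join(format(value, "02X") for value in missing))
```

Every byte value not printed here decodes to some character, so the output gives a quick picture of how sparse the table really is.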
msg231599 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2014-11-24 11:09
BTW: The table on the wiki page shows the same undefined chars.
msg231600 - Author: (era) Date: 2014-11-24 11:47
My apologies -- I already attempted to close this as a mistake on my part, but apparently, that failed too. )-: Sorry.
History
Date User Action Args
2022-04-11 14:58:10 admin set github: 67118
2014-11-24 14:39:36 r.david.murray set stage: resolved
2014-11-24 11:47:09 era set status: open -> closed; resolution: not a bug; messages: +
2014-11-24 11:09:08 lemburg set messages: +
2014-11-24 11:02:43 lemburg set messages: +
2014-11-24 10:43:24 vstinner set nosy: + lemburg
2014-11-24 10:39:37 era create