[Python-Dev] RE: Ill-defined encoding for CP875?
M.-A. Lemburg mal@lemburg.com
Mon, 14 May 2001 11:02:19 +0200
Tim Peters wrote:
> [M.-A. Lemburg]
> > ...
> > The "right" thing to do here, is to simply remove cp875
> > from the test for round-tripping.
>
> I'm relieved you think so, since that's what I already did.
>
> > It is not the only encoding which fails this test, but it's not
> > our fault: the codecs were all generated from the original codec
> > maps at the Unicode.org site.
> >
> > If their mappings are broken, we can't do much about it... other
> > than to ignore the error or remove the codec altogether.
>
> On general principle I don't like either of those -- "in the face of
> ambiguity, refuse the temptation to guess". It's at least surprising to see
>
> >>> unicode("?", "cp875").encode("cp875")
> '\xfd'
> >>>
>
> now, yes?
>
> Would it be better if an ambiguous encoding raised an exception in "strict"
> mode? That is, a third choice is to alert users when they're relying on a
> broken part of a mapping.
The problem is: which part would raise the exception -- the encoder or the decoder?
Here are some more options:
- sort the items before creating the encoding table from the decoding one (makes the mapping stable)
- map keys which have multiple mappings in the encoding table to None -- this causes their usage to raise an exception (undefined mapping)
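
As a rough illustration of the two options above, here is a minimal sketch in current Python syntax. It is not the code used to generate the codecs: the helper name build_encoding_map, its "ambiguous" parameter, and the three-entry toy table are all hypothetical, chosen only to mimic a decoding table in which two bytes decode to the same character.

```python
def build_encoding_map(decoding_map, ambiguous="first"):
    """Invert a byte -> code point table into a code point -> byte table.

    Hypothetical helper for illustration only.
    ambiguous="first": keep the lowest byte value (stable mapping).
    ambiguous="none":  map the code point to None (undefined mapping).
    """
    encoding_map = {}
    # Walking the items in sorted byte order makes the result
    # independent of the dictionary's internal ordering.
    for byte, codepoint in sorted(decoding_map.items()):
        if codepoint is None:
            continue  # byte has no decoding, so nothing to invert
        if codepoint not in encoding_map:
            encoding_map[codepoint] = byte
        elif ambiguous == "none":
            # A second byte decodes to the same character: mark it undefined.
            encoding_map[codepoint] = None
    return encoding_map


# Toy table (an assumption, not the real cp875 data):
# bytes 0x3F and 0xFD both decode to U+001A.
toy_decoding_map = {0x3F: 0x001A, 0x41: 0x0041, 0xFD: 0x001A}

stable = build_encoding_map(toy_decoding_map, ambiguous="first")
strict = build_encoding_map(toy_decoding_map, ambiguous="none")
```

With the first option the ambiguous character always encodes to the same byte (here 0x3F); with the second its entry is None, so an encoder that treats None as "undefined" could raise in strict mode. Unambiguous entries such as 0x0041 are the same in both tables.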
--
Marc-Andre Lemburg
Company & Consulting: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/