[Python-Dev] RE: Ill-defined encoding for CP875?

Tim Peters tim.one@home.com
Tue, 15 May 2001 03:47:16 -0400


[M.-A. Lemburg]

The problem is: which part would raise the exception -- the encoder or the decoder?

Since I don't yet use any of this stuff for real, I have no idea: seems mostly a question of pragmatics, and I don't have any feel for how cp875 users would view it.

Here are some more options:

* sort the items before creating the encoding table from the decoding one (makes the mapping stable)

If users don't care that round-trip can fail silently, fine.
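To make the "sort first" option concrete, here is a minimal sketch. The table values are illustrative only, not the real cp875 table, and `invert_sorted` is a hypothetical helper name. It shows that sorting makes the inverted table deterministic, while a round trip through a duplicated code point can still fail without any error.

```python
# Hypothetical decoding table (illustrative values, not the real cp875 table):
# bytes 0x3F and 0xFD both decode to U+001A.
decoding_map = {0x41: 0x0391, 0x42: 0x0392, 0xFD: 0x001A, 0x3F: 0x001A}

def invert_sorted(decoding_map):
    """Build an encoding table, visiting items in sorted order so the result is stable."""
    encoding_map = {}
    for byte, codepoint in sorted(decoding_map.items()):
        # For a shared code point, the largest byte value is visited last and wins.
        encoding_map[codepoint] = byte
    return encoding_map

encoding_map = invert_sorted(decoding_map)

# Round trip for byte 0x3F: decode, then encode again.
round_tripped = encoding_map[decoding_map[0x3F]]
```

Here `round_tripped` comes back as a different byte than the one that went in, and nothing raises: the mapping is stable, but the round trip fails silently.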

* map keys which have multiple mappings in the encoding table to None -- this causes their usage to raise an exception (undefined mapping)

If users don't care that they'll get an exception when they try something that can't be round-tripped, fine. Or would this depend on the value of the "errors" argument too? Then it's easier to impose.
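A rough sketch of the "map to None" option, again with illustrative table values and hypothetical helper names (`invert_strict`, `encode`). The `errors` handling is only one guess at how the behavior could depend on that argument, not a description of how the codec machinery actually does it.

```python
# Hypothetical decoding table (illustrative values, not the real cp875 table):
# bytes 0x3F and 0xFD both decode to U+001A.
decoding_map = {0x41: 0x0391, 0x42: 0x0392, 0x3F: 0x001A, 0xFD: 0x001A}

def invert_strict(decoding_map):
    """Invert the table, marking code points with several source bytes as None."""
    encoding_map = {}
    for byte, codepoint in decoding_map.items():
        if codepoint in encoding_map:
            encoding_map[codepoint] = None   # ambiguous: treat as undefined
        else:
            encoding_map[codepoint] = byte
    return encoding_map

def encode(text, encoding_map, errors="strict"):
    """Toy encoder: 'strict' raises on undefined mappings, anything else drops them."""
    out = bytearray()
    for ch in text:
        byte = encoding_map.get(ord(ch))
        if byte is None:
            if errors == "strict":
                raise UnicodeError("character maps to <undefined>: %r" % ch)
            continue  # assumed 'ignore'-style behavior: skip the character
        out.append(byte)
    return bytes(out)

encoding_map = invert_strict(decoding_map)
```

With this table, `encode("\u0391\u0392", encoding_map)` succeeds, while encoding any text containing U+001A raises under `errors="strict"` and silently drops the character otherwise.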

There's a theme here: I have no idea how important roundtrip is in Unicode Practice, or even that it's a constant across apps and encodings. If I write a codec to map all ASCII consonants to u"k" and vowels to u"a", I wouldn't care that I can't get "love" back from u"kaka".
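The consonant/vowel codec can be sketched in a few lines. `toy_encode` is a hypothetical name, and passing non-letters through unchanged is an assumption, since the message doesn't say what happens to them.

```python
import string

VOWELS = set("aeiouAEIOU")

def toy_encode(text):
    """Map ASCII vowels to 'a' and other ASCII letters to 'k'; pass the rest through."""
    out = []
    for ch in text:
        if ch in string.ascii_letters:
            out.append("a" if ch in VOWELS else "k")
        else:
            out.append(ch)  # assumption: non-letters are left alone
    return "".join(out)

encoded_love = toy_encode("love")
encoded_time = toy_encode("time")
```

Both inputs encode to the same string, so no decoder could recover "love" from it; whether that matters depends entirely on what the codec is for.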