[Python-Dev] iso-2022 and issue 7472: question for the experts

Stephen J. Turnbull turnbull at sk.tsukuba.ac.jp
Wed Apr 7 21:22:02 CEST 2010


R. David Murray writes:

A long time ago (in a galaxy far far...no, wrong show)

Er, as I was saying, a long time ago Barry applied a patch to email that went more or less like this:

Index: email/Encoders.py

--- email/Encoders.py (revision 35918)
+++ email/Encoders.py (revision 35919)
@@ -84,7 +83,13 @@
     try:
         orig.encode('ascii')
     except UnicodeError:

This comment may be inaccurate. The ISO 2022 family includes what are normally "8bit" encodings such as the EUC family and ISO 8859. I don't know whether there are any IANA-registered 8bit charsets with names that start with 'iso-2022-', and AFAIK there are none in Python. (There is an 'iso-2022-8' encoding in Emacs, though.) Still, I'd be more comfortable with an explicit list than with the .startswith('iso-2022-') idiom.
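
For illustration, the explicit-list alternative could look roughly like the Python sketch below. It is not the code in email/Encoders.py: the names SEVEN_BIT_ISO2022 and cte_for_non_ascii are hypothetical, the list is an assumption (the iso-2022-* charsets for which Python ships codecs, all believed to be 7-bit), and it assumes the patch's intent was to label such payloads 7bit.

```python
# Hypothetical helper, not part of the email package.
# Assumed list: iso-2022-* charsets with Python codecs, all 7-bit encodings.
SEVEN_BIT_ISO2022 = frozenset([
    'iso-2022-jp',
    'iso-2022-jp-1',
    'iso-2022-jp-2',
    'iso-2022-jp-2004',
    'iso-2022-jp-3',
    'iso-2022-jp-ext',
    'iso-2022-kr',
])


def cte_for_non_ascii(output_charset):
    """Pick a Content-Transfer-Encoding for a payload that failed the ASCII check.

    Only charsets named in the explicit list are labelled '7bit'; anything
    else, including an unknown 'iso-2022-*' name, falls back to '8bit'.
    """
    name = (output_charset or '').lower()
    if name in SEVEN_BIT_ISO2022:
        return '7bit'
    return '8bit'
```

With a list like this, a name such as 'iso-2022-8' is treated as 8bit, whereas a .startswith('iso-2022-') test would label it 7bit.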

Reading the standards, it looks to me like either the ISO-2022 input will be 7-bit, and the except will not trigger, or it will be invalid, because 8bit, and so should be set to 8bit just like all the other cases where there's invalid 8bit data. So I think this patch should just be reverted.
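
The 7-bit claim can be checked with a short Python 3 sketch (the patch itself targets Python 2 and works differently). The helper is_7bit is a hypothetical name, and the sample text is an arbitrary choice.

```python
def is_7bit(data):
    """Return True if every byte in data is below 0x80."""
    return all(byte < 0x80 for byte in data)


# Three kanji, written with escapes so the source file itself stays ASCII.
text = '\u65e5\u672c\u8a9e'

# ISO-2022-JP switches character sets with escape sequences, so the
# encoded bytes stay in the 7-bit range and the ASCII check passes.
iso2022_bytes = text.encode('iso-2022-jp')
print(is_7bit(iso2022_bytes))

# EUC-JP encodes the same text with the high bit set on each byte.
euc_bytes = text.encode('euc-jp')
print(is_7bit(euc_bytes))
```

Bytes that are valid ISO-2022-JP therefore decode as ASCII without error, which is the case where the except clause is never reached.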

I have nothing to add to what Martin said about the basic analysis.

It would be possible to just unconditionally set the Content-Transfer-Encoding to 8bit, although that may violate a SHOULD in the MIME standard.


