msg29229 - (view) |
Author: Thomas Arendsen Hein (ThomasAH) |
Date: 2006-07-20 14:22 |
    from email.Message import Message
    from email.Charset import Charset, QP

    text = "="
    msg = Message()
    charset = Charset("utf-8")
    charset.header_encoding = QP
    charset.body_encoding = QP
    msg.set_charset(charset)
    msg.set_payload(text)
    print msg.as_string()

Gives

    MIME-Version: 1.0
    Content-Type: text/plain; charset="utf-8"
    Content-Transfer-Encoding: quoted-printable

    =3D

With the email package from python2.4.3 and 2.4.4c0 the last '=3D' becomes just '=', so an extra charset.body_encode(text) call is needed before set_payload(). |
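For readers on Python 3, the report can be sketched as follows. This is a port under assumptions, not the original script: it uses the lowercase Python 3 module names, and `build` and `payload_first` are hypothetical names introduced only for illustration. It suggests that the order of the two calls decides whether the body is encoded.

```python
from email.message import Message
from email.charset import Charset, QP


def build(text, payload_first):
    """Build a quoted-printable text/plain message (hypothetical helper)."""
    charset = Charset("utf-8")
    charset.header_encoding = QP
    charset.body_encoding = QP
    msg = Message()
    if payload_first:
        # The payload already exists when set_charset() adds the
        # Content-Transfer-Encoding header, so it is encoded there.
        msg.set_payload(text)
        msg.set_charset(charset)
    else:
        # Order used in the report: the header is added first, and the
        # later set_payload() stores the text unchanged.
        msg.set_charset(charset)
        msg.set_payload(text)
    return msg


raw = build("=", payload_first=False)
encoded = build("=", payload_first=True)
print(repr(raw.get_payload()))      # '='
print(repr(encoded.get_payload()))  # '=3D'
```

Whether the Python 2.4.3 change had exactly the same cause is not established here; the sketch only shows the current behaviour.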
|
|
msg29230 - (view) |
Author: Thomas Arendsen Hein (ThomasAH) |
Date: 2006-07-20 16:01 |
One program which got hit by this is MoinMoin, see http://moinmoin.wikiwikiweb.de/MoinMoinBugs/ResetPasswordEmailImproperlyEncoded |
|
|
msg58248 - (view) |
Author: Roger Demetrescu (rdemetrescu) |
Date: 2007-12-06 16:53 |
I am not sure if it is related, but anyway... MIMEText behaviour has changed from python 2.4 to 2.5.

    # Python 2.4
    >>> from email.MIMEText import MIMEText
    >>> m = MIMEText(None, 'html', 'iso-8859-1')
    >>> m.set_payload('abc ' * 50)
    >>> print m
    From nobody Thu Dec 6 12:52:40 2007
    Content-Type: text/html; charset="iso-8859-1"
    MIME-Version: 1.0
    Content-Transfer-Encoding: quoted-printable

    abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc=
    abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc ab=
    c abc abc abc abc abc abc abc abc abc abc abc abc=20

    # Python 2.5
    >>> from email.MIMEText import MIMEText
    >>> m = MIMEText(None, 'html', 'iso-8859-1')
    >>> m.set_payload('abc ' * 50)
    >>> print m
    From nobody Thu Dec 6 14:46:07 2007
    Content-Type: text/html; charset="iso-8859-1"
    MIME-Version: 1.0
    Content-Transfer-Encoding: quoted-printable

    abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc

However, if we initialize MIMEText with the text, we get the correct output:

    # Python 2.5
    >>> from email.MIMEText import MIMEText
    >>> m = MIMEText('abc ' * 50, 'html', 'iso-8859-1')
    >>> print m
    From nobody Thu Dec 6 13:01:17 2007
    Content-Type: text/html; charset="iso-8859-1"
    MIME-Version: 1.0
    Content-Transfer-Encoding: quoted-printable

    abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc=
    abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc ab=
    c abc abc abc abc abc abc abc abc abc abc abc abc=20

If I want to set the payload after the MIMEText object is already created, I need to use this workaround:

    # Python 2.5
    from email.MIMEText import MIMEText
    m = MIMEText(None, 'html', 'iso-8859-1')
    m.set_payload(m._charset.body_encode('abc ' * 50))

PS: The issue's versions field is filled with "Python 2.4". Shouldn't it be "Python 2.5"? |
|
|
msg73949 - (view) |
Author: Asheesh Laroia (paulproteus) * |
Date: 2008-09-27 23:59 |
Another way to see this issue is that the email module double-encodes when one attempts to use quoted-printable encoding. This has to be worked around by e.g. MoinMoin.

It's easy to get proper base64-encoded output of email.mime.text:

    >>> mt = email.mime.text.MIMEText('Ta mère', 'plain', 'utf-8')
    >>> 'Content-Transfer-Encoding: base64' in mt.as_string()
    True
    >>> mt.as_string().split('\n')[-2]
    'VGEgbcOocmU='

There we go, all nice and base64'd. I can *not* figure out how to get quoted-printable-encoding. I found http://docs.python.org/lib/module-email.encoders.html , so I thought great - I'll just encode my MIMEText object:

    >>> email.encoders.encode_quopri(mt)
    >>> 'Content-Transfer-Encoding: quoted-printable' in mt.as_string()
    True

Great! Except it's actually double-encoded, and the headers admit to as much. You see here that, in addition to the quoted-printable header just discovered, there is also a base64-related header, and the result is not strictly QP encoding but QP(base64(payload)).

    >>> 'Content-Transfer-Encoding: base64' in mt.as_string()
    True
    >>> mt.as_string().split('\n')[-2]
    'VGEgbcOocmU=3D'

It should look like:

    >>> quopri.encodestring('Ta mère')
    'Ta m=C3=A8re'

I raised this issue on the Baypiggies list <http://mail.python.org/pipermail/baypiggies/2008-September/003983.html>, but luckily I found this bug here. This is with Python 2.5.2-0ubuntu1 from Ubuntu 8.04.

    paulproteus@alchemy:~ $ python --version
    Python 2.5.2

If we can come to a decision as to how this *should* work, I could contribute a patch and/or tests to fix it. I could even perhaps write a new section of the Python documentation of the email module explaining this. |
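One way to get quoted-printable output without calling email.encoders at all is to pass a Charset object whose body encoding is set to QP. The sketch below assumes Python 3.5 or later, where MIMEText accepts a Charset instance for its charset argument; `qp_utf8` is a name chosen here for illustration.

```python
from email.charset import Charset, QP
from email.mime.text import MIMEText

# utf-8 defaults to base64 for the body; override that on a private
# Charset instance instead of changing the global defaults.
qp_utf8 = Charset("utf-8")
qp_utf8.body_encoding = QP

mt = MIMEText("Ta mère", "plain", qp_utf8)

print(mt["Content-Transfer-Encoding"])          # quoted-printable
print(mt.get_payload())                         # encoded body as stored
print(mt.get_payload(decode=True).decode("utf-8"))  # Ta mère
```

Because the encoding is chosen when the object is created, only one Content-Transfer-Encoding header is ever added.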
|
|
msg105045 - (view) |
Author: Thomas Arendsen Hein (ThomasAH) |
Date: 2010-05-05 14:59 |
Roger Demetrescu, I filed the issue with "Python 2.4", because the behavior changed somewhere between 2.4.2 and 2.4.3.

The updated link to the MoinMoin bug entry is: http://moinmo.in/MoinMoinBugs/ResetPasswordEmailImproperlyEncoded

The workaround I use to be compatible with <= 2.4.2 and >= 2.4.3 is:

    msg.set_payload('=')
    if msg.as_string().endswith('='):
        text = charset.body_encode(text)
    msg.set_payload(text) |
|
|
msg184663 - (view) |
Author: Colin Su (littleq0903) * |
Date: 2013-03-19 19:04 |
Confirmed with David; we worked on this together at the sprints. This is not a bug: if you call set_payload() directly, you need to encode the payload yourself, because set_payload() does not encode the payload when a 'Content-Transfer-Encoding' header already exists. |
|
|
msg184685 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2013-03-19 21:43 |
Reviewing this again, it seems to me that there are two separate issues reported here:

(1) set_payload on an existing MIMEText object no longer encodes (but it has now been a long time since it changed).
(2) the functions in the email.encoders module, given an already encoded message, double encode.

(1) is now set in stone. That is, it is documented as working this way implicitly if you read the set_payload and set_charset docs, and it has been working that way for a while now. (An explicit note should be added to the MIMEText docs, with a workaround.)

(2) could be fixed, I think, since it is unlikely that anyone would be depending on such behavior. |
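As a rough illustration of issue (1) on Python 3: the sketch below shows set_payload storing text unchanged on an existing MIMEText object, and then one possible workaround, which is to delete the Content-Transfer-Encoding header so that the next set_payload call with a charset encodes again. The sample text 'a=b' is chosen here only because '=' must be escaped in quoted-printable.

```python
from email.mime.text import MIMEText

# iso-8859-1 uses quoted-printable for the body, so the constructor
# already adds a Content-Transfer-Encoding header.
m = MIMEText("", "plain", "iso-8859-1")

# Issue (1): with the header in place, the new payload is stored as-is.
m.set_payload("a=b")
unencoded = m.get_payload()
print(repr(unencoded))  # 'a=b'

# Workaround sketch: drop the header, then pass the charset again so
# that set_payload() re-adds the header and encodes the body.
del m["Content-Transfer-Encoding"]
m.set_payload("a=b", "iso-8859-1")
reencoded = m.get_payload()
print(repr(reencoded))  # 'a=3Db'
```

Encoding the text by hand with the charset's body_encode before calling set_payload, as in the earlier workarounds in this issue, is the other option.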
|
|
msg184692 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-03-19 22:22 |
New changeset ba500b179c3a by R David Murray in branch '3.2': #1525919: Document MIMEText+set_payload encoding behavior. http://hg.python.org/cpython/rev/ba500b179c3a

New changeset fcbc28ef96a3 by R David Murray in branch '3.3': Merge: #1525919: Document MIMEText+set_payload encoding behavior. http://hg.python.org/cpython/rev/fcbc28ef96a3

New changeset b9e07f20832e by R David Murray in branch 'default': Merge: #1525919: Document MIMEText+set_payload encoding behavior. http://hg.python.org/cpython/rev/b9e07f20832e |
|
|
msg184698 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2013-03-19 22:47 |
I've committed the doc change. I'm going to be lazy and leave this issue open to deal with the email.encoders module fix. |
|
|
msg408387 - (view) |
Author: Irit Katriel (iritkatriel) *  |
Date: 2021-12-12 14:51 |
The encoding functions are now doing

    orig = msg.get_payload(decode=True)

Does this fix the double-encoding issue? This change was made in https://github.com/python/cpython/commit/00ae435deef434f471e39bea3f3ab3a3e3cd90fe |
|
|
msg408433 - (view) |
Author: Thomas Arendsen Hein (ThomasAH) |
Date: 2021-12-13 09:53 |
Default python3 on Debian buster:

    $ python3
    Python 3.7.3 (default, Jan 22 2021, 20:04:44)
    [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import email.mime.text
    >>> mt = email.mime.text.MIMEText('Ta mère', 'plain', 'utf-8')
    >>> print(mt.as_string())
    Content-Type: text/plain; charset="utf-8"
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64

    VGEgbcOocmU=

    >>> email.encoders.encode_quopri(mt)
    >>> print(mt.as_string())
    Content-Type: text/plain; charset="utf-8"
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Transfer-Encoding: quoted-printable

    Ta=20m=C3=A8re

So the encoded text looks good now, but there are still duplicate headers.

Old output (python2.7) is identical to what Asheesh Laroia (paulproteus) reported for python2.5:

    ---
    Content-Type: text/plain; charset="utf-8"
    MIME-Version: 1.0
    Content-Transfer-Encoding: base64
    Content-Transfer-Encoding: quoted-printable

    VGEgbcOocmU=3D
    --- |
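Until the duplicate header is addressed in the library, one possible workaround is to replace the Content-Transfer-Encoding headers after re-encoding. This is only a sketch for Python 3: `reencode_quopri` is a hypothetical helper name, not part of the email package.

```python
import email.encoders
from email.mime.text import MIMEText


def reencode_quopri(msg):
    """Re-encode msg as quoted-printable, leaving a single CTE header.

    Hypothetical helper, not part of the email package.
    """
    # encode_quopri() decodes the current payload using the existing
    # header, re-encodes it, and then appends a second header.
    email.encoders.encode_quopri(msg)
    # Deleting by name removes every occurrence of the header.
    del msg["Content-Transfer-Encoding"]
    msg["Content-Transfer-Encoding"] = "quoted-printable"


mt = MIMEText("Ta mère", "plain", "utf-8")  # base64 by default
reencode_quopri(mt)
print(mt.as_string())
```

The header must stay in place until encode_quopri has run, because the decoding step relies on it to undo the base64 encoding first.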
|
|