msg75784 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-12 13:16 |
I have never used the email package, so my issue may not be a bug. I'm trying to send an email with diacritics in the subject and the body. I'm French, so it's natural for me to use characters outside the ASCII range. I wrote this small program:

    def main():
        # coding: utf8
        ADDRESS = 'victor.stinner@haypocalc.com'
        from email.mime.text import MIMEText
        msg = MIMEText('accent éôŁ', 'plain', 'utf-8')
        msg['Subject'] = 'sujet éôł'
        msg['From'] = ADDRESS
        msg['To'] = ADDRESS
        text = msg.as_string()
        print("--- FLATTEN ---")
        print(text)
        return
        import smtplib
        client=smtplib.SMTP('smtp.free.fr')
        client.sendmail(ADDRESS, ADDRESS, text)
        client.quit()

    main()

(remove the "return" to really send the email)

The problem:

    (...)
      File "/home/haypo/prog/py3k/Lib/email/generator.py", line 141, in _write_headers
        header_name=h, continuation_ws='\t')
      File "/home/haypo/prog/py3k/Lib/email/header.py", line 189, in __init__
        self.append(s, charset, errors)
      File "/home/haypo/prog/py3k/Lib/email/header.py", line 262, in append
        input_bytes = s.encode(input_charset, errors)
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-8: ordinal not in range(128)

I don't understand why it uses ASCII when I specified that I would like to use the UTF-8 charset. My attached patch reuses the message charset to encode the headers, but uses ASCII if the header can be encoded as ASCII. The patch includes a unit test. |
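
The fallback described above (use ASCII when the header fits, otherwise reuse the message charset) could look roughly like the following. This is only a sketch of the idea, not the attached patch: `encode_header_value` and `set_header` are hypothetical helper names, and it works at the call site instead of changing the email package itself.

```python
from email.header import Header
from email.mime.text import MIMEText


def encode_header_value(value, charset):
    """Return `value` unchanged if it is pure ASCII, else wrap it in a Header.

    Hypothetical helper for illustration; not part of the email package.
    """
    try:
        value.encode('ascii')
    except UnicodeEncodeError:
        # Non-ASCII text: encode the header with the given charset.
        return Header(value, charset)
    return value


def set_header(msg, name, value):
    """Set a header, reusing the charset declared for the message body."""
    # get_content_charset() reads the charset parameter of Content-Type.
    charset = msg.get_content_charset() or 'us-ascii'
    msg[name] = encode_header_value(value, charset)
```

With a `MIMEText(..., 'plain', 'utf-8')` message, `set_header(msg, 'Subject', 'sujet éôł')` stores a `Header` object, while an ASCII-only address is stored as a plain string.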
|
|
msg75785 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-12 13:30 |
The first email example (the one using a file in the library documentation) opens a text file in binary mode and uses the ASCII charset. That is quite strange, because I expect a text to contain only characters, something like:

    charset = 'ASCII'
    # Create a text/plain message
    with open(textfile, 'r', encoding=charset) as fp:
        msg = MIMEText(fp.read(), 'plain', charset)

... and the example doesn't work:

    Traceback (most recent call last):
      File "y.py", line 11, in <module>
        msg = MIMEText(fp.read())
      File "/home/haypo/prog/py3k/Lib/email/mime/text.py", line 30, in __init__
        self.set_payload(_text, _charset)
      File "/home/haypo/prog/py3k/Lib/email/message.py", line 234, in set_payload
        self.set_charset(charset)
      File "/home/haypo/prog/py3k/Lib/email/message.py", line 269, in set_charset
        cte(self)
      File "/home/haypo/prog/py3k/Lib/email/encoders.py", line 60, in encode_7or8bit
        orig.encode('ascii')
    AttributeError: 'bytes' object has no attribute 'encode'

Solutions:
 - Message.set_payload() has to reject types other than str => or would it be possible to use bytes as payload???
 - Fix the example to use characters

The new attached patch fixes the example and checks the type in Message.set_payload(). |
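
Both proposed solutions can be illustrated at the call site. The sketch below is not the attached patch: `make_text_message` and `text_message_from_file` are hypothetical helpers, and the type check lives in the helper instead of in `Message.set_payload()`.

```python
from email.mime.text import MIMEText


def make_text_message(text, charset='utf-8'):
    """Build a text/plain message, rejecting non-str payloads up front.

    Hypothetical helper for illustration only.
    """
    if not isinstance(text, str):
        raise TypeError('payload must be str, not %s' % type(text).__name__)
    return MIMEText(text, 'plain', charset)


def text_message_from_file(path, charset='utf-8'):
    """Read the file in text mode so the payload is characters, not bytes."""
    with open(path, 'r', encoding=charset) as fp:
        return make_text_message(fp.read(), charset)
```

Passing `bytes` to `make_text_message()` fails immediately with a `TypeError` that names the offending type, instead of failing later inside the encoder.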
|
|
msg75786 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-12 14:24 |
"Please make this a release blocker and I will look at it this weekend. -Barry" |
|
|
msg76114 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2008-11-20 16:15 |
This example works though, and it also works in earlier Pythons.

    from email.header import Header

    def main():
        # coding: utf8
        ADDRESS = 'victor.stinner@haypocalc.com'
        from email.mime.text import MIMEText
        msg = MIMEText('accent \xe9\xf4\u0142', 'plain', 'utf-8')
        msg['Subject'] = Header('sujet \xe9\xf4\u0142'.encode('utf-8'), 'utf-8')
        msg['From'] = ADDRESS
        msg['To'] = ADDRESS
        text = msg.as_string()
        print("--- FLATTEN ---")
        print(text)
        return

    main()
|
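
To check that an explicit `Header` survives flattening, the subject can be decoded back from the flattened text. The sketch below makes some assumptions: `build_message` and `decoded_subject` are hypothetical helpers, the address is a placeholder, and the subject is passed to `Header` as `str` (which Python 3 accepts) instead of pre-encoded bytes.

```python
from email import message_from_string
from email.header import Header, decode_header
from email.mime.text import MIMEText

ADDRESS = 'user@example.invalid'  # placeholder address (assumption)


def build_message(subject, body, charset='utf-8'):
    """Build a message whose Subject is explicitly encoded with `charset`."""
    msg = MIMEText(body, 'plain', charset)
    msg['Subject'] = Header(subject, charset)
    msg['From'] = ADDRESS
    msg['To'] = ADDRESS
    return msg


def decoded_subject(flat_text):
    """Parse flattened message text and decode its Subject header to str."""
    parsed = message_from_string(flat_text)
    parts = decode_header(parsed['Subject'])
    # decode_header() yields (chunk, charset) pairs; chunks may be bytes or str.
    return ''.join(
        chunk.decode(charset or 'ascii') if isinstance(chunk, bytes) else chunk
        for chunk, charset in parts
    )
```

Calling `decoded_subject(build_message(subject, body).as_string())` should return the original subject, while the flattened text itself contains only ASCII characters.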
|
|
msg76115 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2008-11-20 16:21 |
I'm rejecting the patch because the old way of making this work still works in Python 3.0. Any larger changes to the API need to be made in the context of redesigning the email package to be byte/str aware. |
|
|
msg76143 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-20 22:07 |
> I'm rejecting the patch because the old way of making
> this work still works in Python 3.0.

I checked the documentation and there is a section about "email: Internationalized headers". I didn't read this section. I just expected Python to use the right encoding because it was already specified in the MIMEText() constructor...

> Any larger changes to the API need to be made in
> the context of redesigning the email package to be byte/str aware.

Right. |
|
|
msg76145 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2008-11-20 22:48 |
On Nov 20, 2008, at 5:07 PM, STINNER Victor wrote:

> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>
>> I'm rejecting the patch because the old way of making
>> this work still works in Python 3.0.
>
> I checked the documentation and there is a section about "email:
> Internationalized headers". I didn't read this section. I just
> expected that Python uses the right encoding because it was already
> specified in the MIMEText() constructor...

Yes. This is a stupid API (tm). :)

-Barry |
|
|