email: invalid RFC 2047 address header after refolding with email.policy.default · Issue #121284 · python/cpython

Bug report

Bug description:

If an email message (modern or legacy) is assigned an address header that is pre-encoded with RFC 2047, calling as_bytes(policy=default) can generate an invalid address header. The resulting header may include unquoted RFC 5322 special characters in a way that can alter its meaning.

Here is a minimal example to demonstrate the problem, isolated from much larger code (including Django's django.core.mail.message). Although this example starts with a legacy Message, an example using only the modern email API is in a later comment:

import email.message
import email.policy

message = email.message.Message()
message["To"] = '=?utf-8?b?TmfGsOG7nWkgbmjhuq1uIGEgdmVyeSB2ZXJ5IGxvbmcs?= name <to@example.com>'

message.as_bytes(policy=email.policy.default)

b'To: =?utf-8?b?TmfGsOG7nWkgbmjhuq1u?= a very very long, name <to@example.com>\n\n'

(The unquoted comma in the resulting display-name is not valid.)
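To make it easier to see where the comma comes from, here is a small sketch that base64-decodes the two encoded-word payloads copied from the example. It uses only the standard library's base64 module; the variable names are mine and are not part of the email package.

```python
import base64

# Payloads copied from the example: the encoded-word as assigned to the
# header, and the shorter encoded-word that appears in the as_bytes() output.
original_word = "TmfGsOG7nWkgbmjhuq1uIGEgdmVyeSB2ZXJ5IGxvbmcs"
refolded_word = "TmfGsOG7nWkgbmjhuq1u"

original_text = base64.b64decode(original_word).decode("utf-8")
refolded_text = base64.b64decode(refolded_word).decode("utf-8")

# Whatever the refolded encoded-word no longer covers is emitted as plain
# text in the header, and that remainder includes the comma.
leftover = original_text[len(refolded_text):]

print(repr(original_text))
print(repr(refolded_text))
print(repr(leftover))
```

The comma was safely inside the encoded-word in the assigned value. After refolding, only the non-ASCII prefix stays encoded, and the rest of the decoded text, comma included, lands in the header unquoted.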

For a real-world case where this can occur, see Django ticket 35378 and anymail/django-anymail#369. (Thanks to @andresmrm for noticing the problem and isolating a test case.)

Oddly, as_string(policy=default) doesn't exhibit the problem:

message.as_string(policy=email.policy.default)

'To: =?utf-8?b?TmfGsOG7nWkgbmjhuq1uIGEgdmVyeSB2ZXJ5IGxvbmcs?= name <to@example.com>\n\n'

Also, the problem does not occur when assigning the non-encoded equivalent to the header:

message2 = email.message.Message()
message2["To"] = '"Người nhận a very very long, name" <to@example.com>'
message2.as_bytes(policy=email.policy.default)

b'To:\n =?utf-8?b?TmfGsOG7nWkgbmjhuq1uIGEgdmVyeSB2ZXJ5IGxvbmcs?= name <to@example.com>\n\n'
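For comparing the outputs above, a rough checker along these lines may help. It is only a sketch: has_unquoted_specials is a hypothetical helper (not part of the email package), the regular expressions approximate RFC 2047 encoded-words and RFC 5322 quoted-strings rather than implementing the full grammar, and "." is deliberately left out of the specials because it is widely tolerated in display names.

```python
import re

# Approximation of an RFC 2047 encoded-word: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")
# Approximation of an RFC 5322 quoted-string, allowing backslash escapes.
QUOTED_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')
# RFC 5322 specials, minus "." (an assumption made for this sketch).
SPECIALS = re.compile(r'[()<>\[\]:;@\\,"]')


def has_unquoted_specials(display_name: str) -> bool:
    """Return True if a special appears outside encoded-words and quoted-strings."""
    stripped = ENCODED_WORD.sub("", display_name)
    stripped = QUOTED_STRING.sub("", stripped)
    return SPECIALS.search(stripped) is not None


# Display-name portions taken from the outputs in this report.
refolded = "=?utf-8?b?TmfGsOG7nWkgbmjhuq1u?= a very very long, name"
intact = "=?utf-8?b?TmfGsOG7nWkgbmjhuq1uIGEgdmVyeSB2ZXJ5IGxvbmcs?= name"
quoted = '"Người nhận a very very long, name"'

print(has_unquoted_specials(refolded))
print(has_unquoted_specials(intact))
print(has_unquoted_specials(quoted))
```

Only the display-name from the as_bytes(policy=default) output of the pre-encoded header is flagged; the as_string output and the quoted, non-encoded form both keep the comma protected.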

(Possibly related to #80222)

CPython versions tested on:

3.9, 3.12

Operating systems tested on:

macOS

[edits: removed ambiguous use of "default" in example comment; clarified this is not a real-world example, but a minimal test case]

Linked PRs