[Python-Dev] Multilingual programming article on the Red Hat Developer blog

Akira Li 4kir4.1i at gmail.com
Wed Sep 17 07:10:08 CEST 2014


Steven D'Aprano <steve at pearwood.info> writes:

On Wed, Sep 17, 2014 at 11:14:15AM +1000, Chris Angelico wrote:

On Wed, Sep 17, 2014 at 5:29 AM, R. David Murray <rdmurray at bitdance.com> wrote:

> Basically, we are pretending that each smuggled byte is a single character for string parsing purposes...but they don't match any of our parsing constants. They are all "any character" matches in the regexes and what have you.
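
A minimal sketch of the behaviour described above, assuming the "smuggled" bytes are the ones Python's `surrogateescape` error handler maps to lone surrogates (U+DC80 to U+DCFF). The sample header bytes and variable names are made up for illustration:

```python
import re

# Hypothetical raw header containing two non-ASCII bytes.
raw = b"Subject: caf\xe9 \xff"

# Each undecodable byte becomes exactly one lone surrogate code point.
text = raw.decode("ascii", errors="surrogateescape")
same_length = len(text) == len(raw)

# The smuggled characters match "any character" patterns...
value = re.match(r"Subject: (.*)", text).group(1)
# ...but fall outside the ASCII range that parsing constants live in.
smuggled = re.findall(r"[^\x00-\x7f]", value)

# Encoding with the same handler restores the original bytes.
restored = text.encode("ascii", errors="surrogateescape")
```

One smuggled byte is one code point here, which says nothing about how many bytes make up a character in whatever encoding the sender actually used.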

This is slightly iffy, as you can't be sure that one byte represents one character, but as long as you don't much care about that, it's not going to be an issue.

This discussion would probably be a lot easier to follow, with fewer miscommunications, if there were some examples. Here is my example, perhaps someone can tell me if I'm understanding it correctly. I want to send an email including the header line:

'Subject: “NOBODY expects the Spanish Inquisition!”'

>>> from email.header import Header
>>> h = Header('Subject: “NOBODY expects the Spanish Inquisition!”')
>>> h.encode('utf-8')
'=?utf-8?q?Subject=3A_=E2=80=9CNOBODY_expects_the_Spanish_Inquisition!?=\n =?utf-8?q?=E2=80=9D?='
>>> h.encode()
'=?utf-8?q?Subject=3A_=E2=80=9CNOBODY_expects_the_Spanish_Inquisition!?=\n =?utf-8?q?=E2=80=9D?='
>>> h.encode('ascii')
'=?utf-8?q?Subject=3A_=E2=80=9CNOBODY_expects_the_Spanish_Inquisition!?=\n =?utf-8?q?=E2=80=9D?='
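
For comparison, a sketch that keeps the field name out of the encoded value and decodes the result again. It assumes only the header value should be RFC 2047 encoded; `decode_header` and `make_header` come from the same `email.header` module, and the variable names are illustrative:

```python
from email.header import Header, decode_header, make_header

value = '“NOBODY expects the Spanish Inquisition!”'

# Encode only the value; the "Subject" field name is attached separately.
encoded = Header(value, charset='utf-8').encode()
header_line = 'Subject: ' + encoded

# Decode the encoded words back into a str.
decoded = str(make_header(decode_header(encoded)))
```

If this behaves as expected, `decoded` equals `value` again, and the colon after the field name stays a literal `:` rather than `=3A`.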

-- Akira


