[Python-Dev] Patch making the current email package (mostly) support bytes

Stephen J. Turnbull stephen at xemacs.org
Wed Oct 6 05:22:18 CEST 2010

Nick Coghlan writes:

> if you pass in bytes data and know what you are doing, then you can access that raw bytes data and do your own decoding

At what level, though?

To take an interesting example I used to see frequently:

From: taro@tokyo.jp (Taro Yamada in 8-bit Shift JIS)

So I guess you are suggesting that the email module can RFC 822 parse that, and

1. Refuse to return the unwrapped (i.e., single line) form of the whole field, except as bytes.
2. Refuse to return the content of the From field, except as bytes.
3. Return the email address parsed from the From field.
4. Refuse to return the comment, except as bytes.
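
To make the From example concrete, here is a minimal sketch of what a caller sees. The header bytes are made up for illustration, and decoding with the 'surrogateescape' error handler stands in for whatever the email package does internally; only email.utils.parseaddr is a real library call.

```python
from email.utils import parseaddr

# Hypothetical raw field body: an ASCII address plus a Shift JIS comment.
name_sjis = "山田太郎".encode("shift_jis")
raw_from = b"taro@tokyo.jp (" + name_sjis + b")"

# Bytes >= 0x80 become lone surrogates (U+DC80..U+DCFF); ASCII is untouched.
as_str = raw_from.decode("ascii", "surrogateescape")

# The address is pure ASCII, so it parses cleanly despite the 8-bit comment.
comment, address = parseaddr(as_str)

# The comment is only meaningful as bytes until somebody supplies a charset.
comment_bytes = comment.encode("ascii", "surrogateescape")
name = comment_bytes.decode("shift_jis")  # charset assumed known out of band
```

Note that the str form loses nothing: encoding it back with the same error handler reproduces the original bytes exactly.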

That's fine. But suppose I have a private or newly defined header that is structured? Now I have two choices:

1. Write a version of my private parser for both str (the normal case) and bytes (if accessing the value as str raises).
2. Always get the bytes and convert them to str (probably using the same .decode('ascii', 'surrogateescape') call that email uses but won't let me have the value of!), then use a common str parser. Note that this is more problematic than it looks, since the appropriate base codec may require information from higher-level structures (e.g., qp codec tags or a Content-Type header's charset field).
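
A rough sketch of the second choice, to show how much of email's logic the caller ends up reproducing. The header layout, the parse_params helper, and the Shift JIS charset are all assumptions for illustration; the error handler's actual spelling in Python is 'surrogateescape'.

```python
def parse_params(value):
    """Hypothetical str-only parser for a 'key=value; key=value' header."""
    params = {}
    for part in value.split(";"):
        key, sep, val = part.partition("=")
        if sep:
            params[key.strip().lower()] = val.strip()
    return params

# Hypothetical private header body carrying an undeclared 8-bit value.
raw = b"id=42; owner=" + "山田".encode("shift_jis")

# One decode step, then the same str parser serves both the clean and
# the 8-bit case.
params = parse_params(raw.decode("ascii", "surrogateescape"))

# Recover the original bytes for one field, then apply the charset that
# a higher-level structure supplies (assumed here to be Shift JIS).
owner_bytes = params["owner"].encode("ascii", "surrogateescape")
owner = owner_bytes.decode("shift_jis")
```

The last two lines are exactly the part that needs information from outside the header, which is why this route is more problematic than it looks.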

Why should I reproduce email's logic here? I don't care if the default or concise API raises on surrogates in the str value. But I'm pretty sure that I will want to use str values containing surrogates in these contexts (for the same reasons that the email module does, for example), rather than work with bytes sometimes and strs sometimes.

Please provide a way to return strs-with-surrogates if I ask for them.
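
Something along these lines would do. This is only a sketch of the shape of the request: header_text and its allow_surrogates flag are hypothetical names, not anything the email package or the patch provides.

```python
def header_text(raw, allow_surrogates=False):
    """Hypothetical accessor; not part of the email package."""
    text = raw.decode("ascii", "surrogateescape")
    # surrogateescape maps bytes 0x80..0xFF to U+DC80..U+DCFF.
    has_surrogates = any("\udc80" <= ch <= "\udcff" for ch in text)
    if has_surrogates and not allow_surrogates:
        raise ValueError("header contains undecodable 8-bit data")
    return text

raw = b"caf\xe9"  # made-up 8-bit value of unknown charset

# The default, concise call may raise; that is fine with me.
try:
    header_text(raw)
except ValueError as exc:
    message = str(exc)

# Asking explicitly returns a str with surrogates that round-trips to bytes.
text = header_text(raw, allow_surrogates=True)
```

The default stays strict, and the caller who knows what they are doing opts in once instead of maintaining a parallel bytes code path.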
