[Python-Dev] Edits to Metadata 1.2 to add extras (optional dependencies)

Stephen J. Turnbull stephen at xemacs.org
Sat Sep 1 06:55:11 CEST 2012


"Martin v. Löwis" writes:

> Unfortunately, this conflicts with the desire to use UTF-8 in attribute values - RFC 822 (and also 2822) don't support this, but require the use of MIME instead (Q or B encoding).

This can be achieved simply by extending the set of characters permitted, as MIME did for message bodies. I'd be cautious about RFC 5335, not just because it's experimental, but because there may be other requirements we don't want to mess with. (If RDM says otherwise, listen to him. I just know the RFC exists.)
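For comparison, here is a minimal sketch of the MIME route mentioned in the quoted text, i.e. what a non-ASCII attribute value looks like once it goes through RFC 2047 encoding. It uses only the standard library's email.header module; the sample value is purely illustrative and not taken from any Metadata file.

```python
from email.header import Header, decode_header, make_header

# Illustrative attribute value containing a non-ASCII character.
author = "Martin v. Löwis"

# RFC 2047 encoding: the result is pure ASCII (a Q- or B-encoded word).
encoded = Header(author, charset="utf-8").encode()

# Decoding reverses the transformation and yields the original text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

Extending the permitted character set, as suggested above, would let the value appear in the file as raw UTF-8 and skip this round trip.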

> RFC 2822 also has a continuation line semantics which traditionally conflicts with the metadata; in particular, line breaks cannot be represented (but are interpreted as continuation lines instead).

Of course line breaks can be represented, without any further change to RFC 2822. Just use Unicode LINE SEPARATOR. You could even do it within ASCII by adhering strictly to RFC 2822 syntax which interprets continuation lines by removing exactly the CRLF pair. Just use ASCII TAB as the field separator.
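A rough sketch of both ideas, under my own assumptions: the helper name unfold and the sample values are hypothetical, and treating the surviving TAB as the marker for a logical line break is one reading of the TAB suggestion rather than anything specified for Metadata.

```python
import re

def unfold(raw_value: str) -> str:
    """Strict unfolding: delete each CRLF that is immediately followed
    by a space or TAB, and keep that whitespace character itself."""
    return re.sub(r"\r\n(?=[ \t])", "", raw_value)

# A folded field body whose continuation lines each start with a TAB.
folded = "first line\r\n\tsecond line\r\n\tthird line"
unfolded = unfold(folded)

# The TABs survive unfolding, so they can mark the logical line breaks.
tab_lines = unfolded.split("\t")

# Alternative: carry U+2028 LINE SEPARATOR inside a single physical line.
LINE_SEPARATOR = "\u2028"
value = "first line" + LINE_SEPARATOR + "second line"
ls_lines = value.splitlines()  # str.splitlines() treats U+2028 as a boundary
```

Note that the TAB variant only works if the parser removes exactly the CRLF; a parser that also collapses or strips the following whitespace would lose the separator.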

There's a final dodge that occurs to me: the semantics you're talking about are lexical semantics in the RFC 2822 context (line unfolding and RFC 2047 decoding). We could possibly in the context of the email module treat Metadata as an intermediate post-lexical-decoding pre-syntactic-analysis representation. I don't know if that makes sense in the context of using email module facilities to parse Metadata.
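To make that last idea a bit more concrete, here is a sketch that lets the email package do the lexical work (unfolding and RFC 2047 decoding) and then treats the resulting strings as the input to Metadata-specific parsing. The field names and values are made up for illustration, and email.policy.default is my choice here, not something proposed in this thread.

```python
from email import message_from_string, policy

# Hypothetical Metadata-like input: one encoded-word value, one folded value.
raw = (
    "Name: example\n"
    "Author: =?utf-8?q?Martin_v=2E_L=C3=B6wis?=\n"
    "Description: first line\n"
    "\tsecond line\n"
    "\n"
)

# policy.default unfolds headers and decodes RFC 2047 encoded words on access.
msg = message_from_string(raw, policy=policy.default)

# Post-lexical, pre-syntactic values: plain str objects with no folding
# or encoded words left, ready for any Metadata-level interpretation.
author = str(msg["Author"])
description = str(msg["Description"])

print(author)
print(repr(description))
```

Whether the email module's treatment of whitespace and of structured header names is a good fit for Metadata fields would still need checking.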

Steve


