[Python-Dev] email package status in 3.X

Nick Coghlan ncoghlan at gmail.com
Tue Jun 22 00:03:58 CEST 2010

On Tue, Jun 22, 2010 at 6:16 AM, P.J. Eby <pje at telecommunity.com> wrote:

> True, but making it a separate type with a required encoding gets rid of the magical "I don't know" - the "I don't know" encoding is just a plain old bytes object.

So, to boil down the ebytes idea, it is basically a request for a second string type that holds an octet stream plus an encoding name, rather than a Unicode character stream. Calling it "ebytes" seems to emphasise the wrong parallel in that case (you have a 'str' object with a different internal structure, not any kind of bytes object). For now I'll call it an "altstr". Then the idea can be described as:

- altstr would expose the same API as str, NOT the same API as bytes
- explicit conversion via "str" would use the altstr's __str__ method
- explicit conversion via "bytes" would use the altstr's __bytes__ method
- implicit interaction with str would convert the str to an altstr object according to the altstr's rules. This may be best handled via a coercion method on altstr, rather than str actually needing to know the details (i.e. an altstr.coerce_str() method). For the 'ebytes' model, this would do something like "type(self)(other.encode(self.encoding), self.encoding)". The operation would then be handled by the corresponding method on the coerced object. A new type could then override operations such as __contains__, __mod__, format() and join().
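
To make the list above concrete, here is a rough sketch of how the 'ebytes' flavour of such a type might look. Everything in it is an assumption made for illustration: the class name AltStr, the helper _coerce, and the choice to implement only concatenation and containment are mine, and coerce_str is called explicitly by the class itself because str has no such protocol today.

```python
class AltStr:
    """Hypothetical 'altstr': encoded octets plus an encoding name."""

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding

    def __str__(self) -> str:
        # Explicit conversion via str() decodes the stored octets.
        return self.data.decode(self.encoding)

    def __bytes__(self) -> bytes:
        # Explicit conversion via bytes() returns the octets unchanged.
        return self.data

    def coerce_str(self, other: str) -> "AltStr":
        # The coercion rule described for the 'ebytes' model.
        return type(self)(other.encode(self.encoding), self.encoding)

    def _coerce(self, other):
        # Illustrative helper: bring the other operand into this type.
        if isinstance(other, str):
            return self.coerce_str(other)
        if isinstance(other, AltStr) and other.encoding == self.encoding:
            return other
        return NotImplemented

    def __add__(self, other):
        other = self._coerce(other)
        if other is NotImplemented:
            return NotImplemented
        return type(self)(self.data + other.data, self.encoding)

    def __radd__(self, other):
        other = self._coerce(other)
        if other is NotImplemented:
            return NotImplemented
        return type(self)(other.data + self.data, self.encoding)

    def __contains__(self, item) -> bool:
        item = self._coerce(item)
        if item is NotImplemented:
            raise TypeError("unsupported operand for 'in'")
        # Byte-level search; only safe for encodings where a match on
        # octets implies a match on characters (e.g. latin-1).
        return item.data in self.data


text = AltStr("caf\u00e9".encode("latin-1"), "latin-1")
joined = text + "!"      # the str operand is coerced, result stays AltStr
prefixed = ">" + text    # handled by __radd__
```

Note that the result of mixing str and AltStr is always an AltStr here, which is exactly the part that resembles the 2.x str type: the encoded form wins whenever the two meet.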

This is still smelling an awful lot like the 2.x str type to me, but supporting a coerce_str method may allow some useful experimentation in this space (as PJE suggested). There's a chance it would be abused, but it offers a greater chance of success than trying to come up with a concrete altstr type without providing a means for experimentation first.

> (In principle, you could then drop all the stringlike methods from plain-old-bytes objects. If it's really text-in-bytes you want, you should use an ebytes with the encoding specified.)

Except that a lot of those string-like methods are just plain useful, even when you know you're dealing with an octet stream rather than latin-1 encoded text.
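
As a small illustration of that point, the following sketch picks apart a header-style line using nothing but bytes methods, without ever claiming the data is text in any particular encoding. The function name parse_header and the sample line are made up for the example.

```python
def parse_header(line: bytes):
    """Split a b'Name: value' line into (lowercased name, stripped value)."""
    name, _, value = line.partition(b":")
    return name.strip().lower(), value.strip()


name, value = parse_header(b"Content-Length: 42\r\n")
length = int(value)  # int() accepts ASCII digits in a bytes object
```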

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
