[Python-Dev] email package status in 3.X
Nick Coghlan ncoghlan at gmail.com
Tue Jun 22 00:03:58 CEST 2010
On Tue, Jun 22, 2010 at 6:16 AM, P.J. Eby <pje at telecommunity.com> wrote:
> True, but making it a separate type with a required encoding gets rid of the magical "I don't know" - the "I don't know" encoding is just a plain old bytes object.
So, to boil down the ebytes idea, it is basically a request for a second string type that holds an octet stream plus an encoding name, rather than a Unicode character stream. Calling it "ebytes" seems to emphasise the wrong parallel in that case (you have a 'str' object with a different internal structure, not any kind of bytes object). For now I'll call it an "altstr". Then the idea can be described as:
- altstr would expose the same API as str, NOT the same API as bytes
- explicit conversion via "str" would use the altstr's __str__ method
- explicit conversion via "bytes" would use the altstr's __bytes__ method
- implicit interaction with str would convert the str to an altstr object according to the altstr's rules. This may be best handled via a coercion method on altstr, rather than str actually needing to know the details (i.e. an altstr.coerce_str() method). For the 'ebytes' model, this would do something like "type(self)(other.encode(self.encoding), self.encoding)". The operation would then be handled by the corresponding method on the coerced object. A new type could then override operations such as __contains__, __mod__, format() and join().
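To make the list above concrete, here is a rough sketch of what such a type might look like. Everything in it is hypothetical: the AltStr class name, the data and encoding attributes, and the _coerce helper are assumptions for illustration only, and it covers just concatenation and containment rather than the full str API.

```python
class AltStr:
    """Hypothetical 'altstr': encoded octets plus a required encoding name."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def __str__(self):
        # Explicit conversion via str() decodes with the stored encoding.
        return self.data.decode(self.encoding)

    def __bytes__(self):
        # Explicit conversion via bytes() returns the raw octets.
        return self.data

    def coerce_str(self, other):
        # The 'ebytes' rule: encode the str using this object's encoding.
        return type(self)(other.encode(self.encoding), self.encoding)

    def _coerce(self, other):
        # Assumed helper: bring the other operand into this altstr's rules.
        if isinstance(other, str):
            return self.coerce_str(other)
        if isinstance(other, AltStr) and other.encoding == self.encoding:
            return other
        return None

    def __add__(self, other):
        other = self._coerce(other)
        if other is None:
            return NotImplemented
        return type(self)(self.data + other.data, self.encoding)

    def __radd__(self, other):
        other = self._coerce(other)
        if other is None:
            return NotImplemented
        return type(self)(other.data + self.data, self.encoding)

    def __contains__(self, item):
        item = self._coerce(item)
        if item is None:
            raise TypeError("cannot coerce operand to altstr")
        return item.data in self.data


s = AltStr(b"caf", "latin-1")
t = s + "\u00e9"      # the str operand is coerced; the result stays an AltStr
u = "un " + s         # coercion also applies when the str is on the left
```

Note that the naive coerce_str shown here would misbehave for codecs that emit a byte order mark on every encode (such as "utf-16"), which is one of the details a real experiment would have to deal with.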
This is still smelling an awful lot like the 2.x str type to me, but supporting a coerce_str method may allow some useful experimentation in this space (as PJE suggested). There's a chance it would be abused, but it offers a greater chance of success than trying to come up with a concrete altstr type without providing a means for experimentation first.
> (In principle, you could then drop all the stringlike methods from plain-old-bytes objects. If it's really text-in-bytes you want, you should use an ebytes with the encoding specified.)
Except that a lot of those string-like methods are just plain useful, even when you know you're dealing with an octet stream rather than latin-1 encoded text.
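As a small illustration of that point, here is a sketch that picks apart a header-like octet stream using only bytes methods, without ever decoding it. The parse_header_line function and the sample data are made up for the example; note that one of the values is not valid text in any common encoding.

```python
def parse_header_line(line):
    """Split one b'Name: value' line into (lowercased name, stripped value)."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("not a header line")
    return name.strip().lower(), value.strip()


# Made-up sample data: the second value is arbitrary binary, not text.
raw = b"Content-Length: 42\r\nX-Blob: \xff\xfe\r\n"

headers = dict(
    parse_header_line(line) for line in raw.split(b"\r\n") if line
)
length = int(headers[b"content-length"])
```

Here split(), partition(), strip() and lower() all operate on the octets directly, and nothing requires knowing (or pretending to know) an encoding for the payload.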
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia