[Python-Dev] email package status in 3.X

P.J. Eby pje at telecommunity.com
Mon Jun 21 21:14:29 CEST 2010


At 03:08 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:

> Barry Warsaw writes:
>
> > Would it make sense to have "encoding-carrying" bytes and str types?
>
> I think the answer is "no", though, because (1) it would constitute an attractive nuisance (the default would be abused, it would work fine in Kansas, and all hell would break loose in Kagoshima, simply delaying the pain and/or passing it on to third parties),

You have the proposal exactly backwards, actually.

In Kagoshima, you'd pass in an ebytes with your encoding to a stdlib API, and get back an ebytes with the right encoding, rather than an (incorrect and useless) unicode object which has lost data you need.
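A minimal sketch of that round trip, to make the idea concrete. Everything here is an assumption for illustration: no `ebytes` type exists in Python, and the constructor signature, the `encoding` attribute, the `text()` helper, and the rule that a `str` operand gets encoded with the carried encoding are all hypothetical. `with_suffix` is a made-up stand-in for a stdlib-style API that was written without knowing about `ebytes`.

```python
class ebytes(bytes):
    """Hypothetical encoding-carrying bytes; a sketch, not a real Python type."""

    def __new__(cls, data, encoding):
        if isinstance(data, str):
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Assumed rule: a str operand is encoded with our own encoding.
        if isinstance(other, str):
            other = other.encode(self.encoding)
        elif not isinstance(other, bytes):
            return NotImplemented
        return ebytes(bytes(self) + bytes(other), self.encoding)

    def text(self):
        """Decode using the encoding this object carries."""
        return self.decode(self.encoding)


def with_suffix(name, suffix):
    """Stand-in for a string-handling API that knows nothing about ebytes."""
    return name + suffix


name = ebytes("鹿児島", "shift_jis")
result = with_suffix(name, ".txt")
print(type(result).__name__, result.encoding, result.text())
```

The caller hands in Shift-JIS data and gets Shift-JIS data back, still tagged, without `with_suffix` ever having to guess or be told an encoding.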

> Why limit that to bytes and str? Why not have all objects carry their serializer/deserializer around with them?

Because it's not a serialization or deserialization. Your conceptual framework here implies that unicode objects are the real thing, and that bytes are "just" a way of transporting unicode around.

But this is not the case at all, for use cases where "no, really, you have to work with bytes-encoded text streams". The mere release of Python 3.x will not cause all the world's applications, libraries, and protocols to suddenly work with unicode, where they did not before.

Being explicit about the encoding of the bytes you're flinging around is actually an increase in specificity, explicitness, robustness, and error-checking ability over the status quo for either 2.x or 3.x... and it improves these qualities for essentially all string-handling code, without requiring that code to be rewritten to do so.

It's like getting to use the time machine, really.
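To illustrate the error-checking point above, here is a rough comparison of plain bytes with a tagged type. As before, `ebytes` is hypothetical, and the rule that mixing two different encodings raises `TypeError` is an assumption about how such a type might behave, not something specified in this thread.

```python
class ebytes(bytes):
    """Hypothetical tagged bytes, reduced to what this check needs."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        if not isinstance(other, bytes):
            return NotImplemented
        # Assumed rule: untagged bytes are trusted, mismatched tags are not.
        other_encoding = getattr(other, "encoding", self.encoding)
        if other_encoding != self.encoding:
            raise TypeError(
                f"cannot mix {self.encoding} bytes with {other_encoding} bytes"
            )
        return ebytes(bytes(self) + bytes(other), self.encoding)


latin = "café".encode("latin-1")
utf8 = "café".encode("utf-8")

# Plain bytes: the concatenation succeeds and the mismatch goes unnoticed
# until someone decodes the result and gets mojibake.
mixed = latin + utf8
print(mixed.decode("latin-1"))

# Tagged bytes: the same mistake is reported where it is made.
error_message = None
try:
    ebytes(latin, "latin-1") + ebytes(utf8, "utf-8")
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

With untagged bytes the damage shows up later and somewhere else; with the tag, the failure happens at the line that combined the two values.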

> and (2) you really want this under control of higher level objects that have access to some knowledge of the environment, rather than the lowest level.

This proposal actually has such a higher-level object: an ebytes. And it passes that information through the lowest level, in such a way as to permit the stringlike operations to be fully polymorphic, without the information being lost inside somebody else's API.


