[Python-Dev] bytes / unicode

Nick Coghlan ncoghlan at gmail.com
Sun Jun 27 07:53:59 CEST 2010


On Sun, Jun 27, 2010 at 1:49 PM, P.J. Eby <pje at telecommunity.com> wrote:

> I just hate the idea that functions taking strings should have to be rewritten to be explicitly type-agnostic.  It seems so un-Pythonic...  like if all the bitmasking functions you'd ever written using 32-bit int constants had to be rewritten just because we added longs to the language, and you had to upcast them to be compatible or something.  Sounds too much like C or Java or some other non-Python language, where dynamism and polymorphy are the special case, instead of the general rule.

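The difference is that we have three classes of algorithm here:

  1. those that work only on octet sequences
  2. those that work only on character sequences
  3. those that can work on either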

Python 2 lumped all 3 classes of algorithm together through the multi-purpose 8-bit str type. The unicode type provided some scope to separate out the second category, but the divisions were rather blurry.

Python 3 forces the first two to be separated by using either octets (bytes/bytearray) or characters (str). There are a very small number of APIs where it is appropriate to be polymorphic, but this is currently difficult due to the need to supply literals of the appropriate type for the objects being operated on.
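A minimal sketch of why this is difficult, assuming a hypothetical helper named strip_scheme (an illustrative name, not part of any library): a function written with str literals works on character sequences but rejects octet sequences outright.

```python
def strip_scheme(url):
    # Hypothetical helper: the "://" literal is a str, so this function
    # only works when url is a character sequence.
    _scheme, found, rest = url.partition("://")
    return rest if found else url


print(strip_scheme("http://python.org/dev"))

try:
    # In Python 3, bytes.partition() refuses a str separator.
    strip_scheme(b"http://python.org/dev")
except TypeError as exc:
    print("octet sequence rejected:", exc)
```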

That polymorphism isn't ever going to happen automagically, because the code has to explicitly provide two literals (one for octet sequences, one for character sequences).
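One way to picture the two-literal requirement, using a hypothetical helper (strip_scheme_any is an illustrative name only): the function has to pick the separator literal that matches the type it was handed.

```python
def strip_scheme_any(url):
    # Hypothetical helper: carry one literal per sequence type and
    # select the one that matches the argument.
    if isinstance(url, (bytes, bytearray)):
        sep = b"://"   # octet-sequence literal
    else:
        sep = "://"    # character-sequence literal
    _scheme, found, rest = url.partition(sep)
    return rest if found else url


print(strip_scheme_any("http://python.org/dev"))
print(strip_scheme_any(b"http://python.org/dev"))
```

Every polymorphic function written this way repeats the same type check for each literal it uses, which is the boilerplate a helper type would be meant to absorb.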

The virtues of a separate poly_str type are that:

  1. It can be simple and implemented in Python, dispatching to str or bytes as appropriate (probably in the string module)
  2. No chance of impacting the performance of the core interpreter (as builtins are not affected)
  3. Lower impact if it turns out to have been a bad idea
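A rough pure-Python sketch of what such a type might look like. Only the name poly_str comes from the discussion; the interface below (the like() method, the ASCII-only restriction, support for concatenation alone) is an assumption made for illustration, not a proposed design.

```python
class poly_str:
    """Hypothetical sketch: one ASCII literal usable with str or bytes."""

    _KINDS = (str, bytes, bytearray)

    def __init__(self, text):
        # Assumption: restrict to ASCII so both forms are unambiguous.
        self._text = text
        self._octets = text.encode("ascii")

    def like(self, other):
        """Return the literal in the form that matches other's type."""
        if isinstance(other, str):
            return self._text
        if isinstance(other, (bytes, bytearray)):
            return self._octets
        raise TypeError("expected str, bytes or bytearray, got %s"
                        % type(other).__name__)

    def __add__(self, other):
        # poly_str + x: dispatch on the type of x.
        if not isinstance(other, self._KINDS):
            return NotImplemented
        return self.like(other) + other

    def __radd__(self, other):
        # x + poly_str: dispatch on the type of x.
        if not isinstance(other, self._KINDS):
            return NotImplemented
        return other + self.like(other)


SEP = poly_str("/")
print("usr" + SEP + "lib")
print(b"usr" + SEP + b"lib")
```

Because the class lives entirely in Python and leaves str and bytes untouched, it matches points 1 and 2 of the list; a real version would need far more of the string API than concatenation.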

We could talk about this even longer, but the most effective way forward is going to be a patch that improves the URL parsing situation.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


