[Python-Dev] methods on the bytes object (was: Crazy idea for str.join)

Josiah Carlson jcarlson at uci.edu
Sun Apr 30 12:01:11 CEST 2006


"Guido van Rossum" <guido at python.org> wrote:

On 4/29/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> I understand the underlying implementation of str.join can be a bit
> convoluted (with the auto-promotion to unicode and all), but I don't
> suppose there is any chance to get str.join to support objects which
> implement the buffer interface as one of the items in the sequence?

In Py3k, buffers won't be compatible with strings -- buffers will be about bytes, while strings will be about characters. Given that future I don't think we should mess with the semantics in 2.x; one change in the near(ish) future is enough of a transition.

This brings up something I hadn't thought of previously. While unicode will obviously keep its .join() method when it becomes str in 3.x, will bytes objects get a .join() method? Checking the bytes PEP, very little is described about the type other than it basically being an array of 8-bit integers. That's fine and all, but it kills many of the parsing and modification use-cases that are performed on strings via the non-__xxx__ methods.

Specifically in the case of bytes.join(), the current common use-case of .join(...) would become something similar to bytes().join(...), unless bytes objects got a syntax... Or maybe I'm missing something?
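
To make the spelling concern concrete, here is a minimal sketch. It assumes the bytes type as it later shipped in Python 3, where a b"..." literal and a .join() method both exist; the chunk values are made up for illustration and are not from the thread.

```python
# Hypothetical chunks, e.g. pieces of a request assembled for a socket.
chunks = [b"GET / HTTP/1.0", b"Host: example.org", b""]

# Without a literal syntax, the separator has to come from the constructor.
sep_from_constructor = bytes([13, 10])  # CR LF given as 8-bit integers
joined_a = sep_from_constructor.join(chunks)

# With a bytes literal, the call reads like the familiar str idiom.
joined_b = b"\r\n".join(chunks)

# In Python 3, bytes.join also accepts other objects that support the
# buffer protocol, such as bytearray and memoryview.
joined_c = b"".join([b"ab", bytearray(b"cd"), memoryview(b"ef")])
```

Both spellings produce the same result; the difference is only how readable the separator is at the call site.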

Anyways, when the bytes type was first being discussed, I had hoped that it would basically become array.array("B", ...) + non-unicode str, allowing bytes to do everything that str was doing before, plus a few new tricks (almost like an mmap...), minus those operations which require immutability.
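
As a rough illustration of the "array.array("B", ...) + non-unicode str" idea, the sketch below contrasts a plain array of 8-bit integers with Python 3's bytearray, used here only as a stand-in for a mutable byte sequence that also carries string-style methods. The sample data is invented, and nothing here is meant as the design the bytes PEP describes.

```python
from array import array

# An array of unsigned 8-bit integers: mutable, but with no parsing methods.
raw = array("B", b"key=value")
raw[0] = ord("K")                      # in-place item assignment works
has_split = hasattr(raw, "split")      # string-style methods are absent

# A bytearray: the same in-place mutation, plus str-like methods.
buf = bytearray(b"key=value")
buf[0] = ord("K")
name, _, value = buf.partition(b"=")   # parse without converting to str
upper = buf.upper()                    # modification-style method
```

Operations that depend on immutability, such as hashing for use as a dict key, are not available on the mutable type, which matches the "minus those operations" caveat above.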


