[Python-Dev] bytes / unicode

M.-A. Lemburg mal at egenix.com
Tue Jun 22 20:09:24 CEST 2010


Guido van Rossum wrote:

[Just addressing one little issue here; generally I'm just happy that we're discussing this issue in such detail from so many points of view.]

On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:

[...] Would urljoin(bbase, bsubdir) => bytes and urljoin(ubase, usubdir) => unicode be acceptable though? (I think, given other options, I'd rather see two separate functions, though. It seems more discoverable and less prone to taking bad input some of the time to have two functions that clearly only take one type of data apiece.)

Hm. I'd rather see a single function (it would be "polymorphic" in my earlier terminology). After all a large number of string method calls (and some other utility function calls) already look the same regardless of whether they are handling bytes or text (as long as it's uniform). If the building blocks are all polymorphic it's easier to create additional polymorphic functions.

FWIW, there are two problems with polymorphic functions, though they can be overcome:

(1) Literals. If you write something like x.split('&') you are implicitly assuming x is text. I don't see a very clean way to overcome this; you'll have to implement some kind of type check, e.g.

    x.split('&') if isinstance(x, str) else x.split(b'&')

A handy helper function can be written:

    def literalas(constant, variable):
        if isinstance(variable, str):
            return constant
        else:
            return constant.encode('utf-8')

So now you can write x.split(literalas('&', x)).
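To make the idea concrete, here is a minimal runnable sketch of the helper in use. The function split_query is a hypothetical example and not part of any proposal in this thread; it only illustrates how one body of code can serve both text and bytes once every literal goes through the helper.

```python
def literalas(constant, variable):
    """Return `constant` as str or bytes, matching the type of `variable`."""
    if isinstance(variable, str):
        return constant
    return constant.encode('utf-8')


def split_query(query):
    """Hypothetical polymorphic helper: split 'k=v&k=v' into (key, value) pairs."""
    amp = literalas('&', query)
    eq = literalas('=', query)
    return [tuple(part.split(eq, 1)) for part in query.split(amp)]


print(split_query('a=1&b=2'))    # text in, text out
print(split_query(b'a=1&b=2'))   # bytes in, bytes out
```

The result type always follows the argument type, so callers never mix bytes and text by accident.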

This polymorphism is what we used in Python2 a lot to write code that works for both Unicode and 8-bit strings.

Unfortunately, this no longer works as easily in Python3: the literals sometimes have the wrong type, and using such a helper function slows things down a lot.
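One rough way to gauge that overhead is to time a direct call against a call routed through the helper. This is only a sketch: split_direct, split_polymorphic and compare are hypothetical names, and the timings depend on the interpreter and machine, so no figures are given here.

```python
import timeit


def literalas(constant, variable):
    """Helper from the quoted message: match the literal's type to `variable`."""
    if isinstance(variable, str):
        return constant
    return constant.encode('utf-8')


def split_direct(x):
    # Bytes-only version: the literal already has the right type.
    return x.split(b'&')


def split_polymorphic(x):
    # Polymorphic version: pays for a function call, a type check and an encode.
    return x.split(literalas('&', x))


def compare(data=b'a=1&b=2&c=3', number=100_000):
    """Return (direct_seconds, polymorphic_seconds) for `number` calls each."""
    direct = timeit.timeit(lambda: split_direct(data), number=number)
    poly = timeit.timeit(lambda: split_polymorphic(data), number=number)
    return direct, poly


if __name__ == '__main__':
    direct, poly = compare()
    print(f'direct: {direct:.3f}s  polymorphic: {poly:.3f}s')
```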

It would be great if we could have something like the above as a builtin method:

x.split('&'.as(x))
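As written, the spelling above would not parse, because `as` is a reserved word in Python. The following sketch approximates the idea with a str subclass; the class name Literal and the method name as_type_of are assumptions made purely for illustration, not an existing or proposed API.

```python
class Literal(str):
    """Hypothetical text literal that can adapt to another value's type."""

    def as_type_of(self, other, encoding='utf-8'):
        # Text stays text; anything else is treated as bytes-like.
        if isinstance(other, str):
            return str(self)
        return self.encode(encoding)


AMP = Literal('&')

for x in ('a&b', b'a&b'):
    print(x.split(AMP.as_type_of(x)))
```

A real builtin could avoid the Python-level call overhead that makes the helper function slow, which is the point of the suggestion.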

Perhaps something to discuss at the language summit at EuroPython.

Too bad we can't add such porting enhancements to Python2 anymore.

-- Marc-Andre Lemburg eGenix.com



