[Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c
M.-A. Lemburg mal at egenix.com
Mon Oct 24 20:32:22 CEST 2005
Guido van Rossum wrote:
On 10/24/05, Phil Thompson <phil at riverbankcomputing.co.uk> wrote:
I'm implementing a string-like object in an extension module and trying to make it as interoperable with the standard string object as possible. To do this I'm implementing the relevant slots and the buffer interface. For most things this is fine, but there are a small number of methods in stringobject.c that don't use the buffer interface - and I don't understand why.
Specifically:
- string_contains() doesn't, which means that MyString("foo") in "foobar" doesn't work.
- s.join(sequence) only allows sequence to contain string or unicode objects.
- s.strip([chars]) only allows chars to be a string or unicode object. Same for lstrip() and rstrip().
- s.ljust(width[, fillchar]) only allows fillchar to be a string object (not even a unicode object). Same for rjust() and center().
Other methods happily allow types that support the buffer interface as well as string and unicode objects. I'm happy to submit a patch - I just wanted to make sure that this behaviour wasn't intentional for some reason.
A concern I'd have with fixing this is that Unicode objects also support the buffer API. In any situation where either str or unicode is accepted I'd be reluctant to guess whether a buffer object was meant to be str-like or Unicode-like. I think this covers all the cases you mention here.
This situation is a little better than that: the buffer interface has a slot called getcharbuffer which is what the string methods use in case they find that a string argument is not of type str or unicode.
A few don't, but I guess we could fix this.
str.split(), .[lr]strip() all support the getcharbuffer interface. str.join() currently doesn't. The Unicode object also leaves out a few cases, among those the ones you mentioned. If it's better for inter-op, I guess we should make an effort and let all of them support the getcharbuffer interface.
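The fallback described above can be modelled roughly in Python. This is only a sketch of the dispatch order, not the C code in stringobject.c: the names MyString, get_char_buffer, as_chars, contains and join are hypothetical, and it is written for a current Python 3 interpreter, where str stands in for both string types.

```python
class MyString:
    """Hypothetical string-like type, as an extension module might provide."""

    def __init__(self, text):
        self._text = text

    def get_char_buffer(self):
        # Hypothetical stand-in for the getcharbuffer slot: hand out the
        # character data without being a str subclass.
        return self._text


def as_chars(obj):
    """Return character data for obj, following the fallback order."""
    if isinstance(obj, str):
        # Real string objects are used directly.
        return obj
    getter = getattr(obj, "get_char_buffer", None)
    if getter is not None:
        # Anything else is asked for its character buffer.
        return getter()
    raise TypeError("expected a string or character buffer object")


def contains(haystack, needle):
    """Model of a containment check that honours the fallback."""
    return as_chars(needle) in as_chars(haystack)


def join(separator, items):
    """Model of join() extended to accept character-buffer objects."""
    return as_chars(separator).join(as_chars(item) for item in items)


print(contains("foobar", MyString("foo")))
print(join("-", ["a", MyString("b")]))
```

A method that skips the as_chars() step and only checks for real string objects is the kind of gap listed in the original report.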
We need to support this better in Python 3000; but I'm not sure you can do much better in Python 2.x; subclassing from str is unlikely to work for you because then too many places are going to assume the internal representation is also the same as for str.
As a first step, I'd suggest implementing the getcharbuffer slot. That will already go a long way.
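On the Python 3000 point quoted above: for comparison only, current Python 3 releases accept any object that exports the buffer protocol in the positions discussed in this thread, at least for the bytes type. The snippet below assumes a Python 3 interpreter and does not apply to 2.x; the variable names are purely illustrative.

```python
# Containment: the needle may be any bytes-like object, not just bytes.
needle = bytearray(b"foo")
found = needle in b"foobar"

# join(): the sequence may mix bytes with other bytes-like objects.
parts = [b"a", bytearray(b"b"), memoryview(b"c")]
joined = b"-".join(parts)

# strip(): the set of characters to remove may also be bytes-like.
stripped = b"xxabcxx".strip(bytearray(b"x"))

print(found, joined, stripped)
```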
-- Marc-Andre Lemburg eGenix.com