[Python-3000] Making more effective use of slice objects in Py3k

Josiah Carlson jcarlson at uci.edu
Thu Aug 31 05:41:24 CEST 2006


Talin <talin at acm.org> wrote:

> I know this was shot down before, but I would still like to see a "characters" type - that is, a mutable sequence of wide characters, much like the Java StringBuffer class - to go along with "bytes". From my perspective, it makes perfect sense to have an "array of character" type as well as an "array of byte" type, and since the "array of byte" is simply called "bytes", then by extension the "array of character" type would be called "characters".

If the buffer API offered information about the size of each element, as the proposed 'array API' does, this would just be one of the supportable cases. Views could let you specify the size of each element at construction (8, 16, or 32 bits), but variant methods for handling each width would need to be written.
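As a rough illustration of element-size-aware views, here is a minimal sketch using memoryview from present-day Python, which reports an itemsize and can reinterpret a byte buffer with wider elements. It is not the view object discussed in this thread; the helper name describe_views is made up for the example, and the 'H' and 'I' formats are assumed to be 16 and 32 bits, which is typical but platform-dependent.

```python
def describe_views(raw):
    """Return {format: (itemsize, element_count)} for views over raw."""
    info = {}
    # 'B', 'H', 'I' are usually 8-, 16- and 32-bit unsigned elements.
    for fmt in ("B", "H", "I"):
        view = memoryview(raw).cast(fmt)
        info[fmt] = (view.itemsize, len(view))
    return info


data = bytearray(8)  # eight zero bytes shared by every view
print(describe_views(data))

# Writing through a wide view changes the underlying bytes.
wide = memoryview(data).cast("H")
wide[0] = 0x263A
print(data[:2])  # byte order depends on the platform
```

Each view covers the same memory, so the element count shrinks as the element size grows.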

> Of course, both the 'array' and 'list' types already give you that, but "characters" would have additional string-like methods. (However, since it is mutable, it would not be capable of producing views.)

The view object I have now supports mutable and resizable objects (like bytes and array).

> The 'characters' data type would be particularly optimized for character-at-a-time operations, i.e. building up a string one character at a time. An example use would be processing escape sequences in strings, where you are transforming the escaped string into its non-escaped equivalent.

That is already possible with array.array('H', ...) or array.array('L', ...), depending on the Unicode width of your platform. array uses a more conservative reallocation strategy (growing by 1/16 rather than 1/8), but it seems to work well enough. Combine array with wide character support in views, and we could very well have the functionality that you desire.
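A minimal sketch of the escape-processing case, building the result one character at a time in an array.array('L') of code points. The function name unescape and the small escape table are assumptions made for the example, not anything proposed in this thread, and the final join to str reflects present-day Python rather than the 2006 API.

```python
from array import array

# Hypothetical escape table, for illustration only.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def unescape(text):
    """Expand backslash escapes, appending one code point at a time."""
    buf = array("L")  # mutable, resizable buffer of code points
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt is None:
                # A trailing backslash is kept literally.
                buf.append(ord("\\"))
                break
            # Unknown escapes fall back to the escaped character itself.
            buf.append(ord(SIMPLE_ESCAPES.get(nxt, nxt)))
        else:
            buf.append(ord(ch))
    return "".join(map(chr, buf))


print(repr(unescape(r"a\tb\nc")))
```

The append calls rely on array's own over-allocation, so the loop does not reallocate on every character.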


