[Python-Dev] Replacement for array.array('u')?
Steven D'Aprano steve at pearwood.info
Fri Mar 22 05:24:23 EDT 2019
On Fri, Mar 22, 2019 at 08:31:33PM +1300, Greg Ewing wrote:
> A poster on comp.lang.python is asking about array.array('u'). He wants an efficient mutable collection of unicode characters that can be initialised from a string.
>
> According to the docs, the 'u' code is deprecated and will be removed in 4.0, but no alternative is suggested. Why is this being deprecated, instead of keeping it and making it always 32 bits? It seems like useful functionality that can't be easily obtained another way.
I can't answer any of those questions, but perhaps the poster can do this instead:
py> from array import array
py> a = array('L', 'ℍℰâѵÿ Ϻεταł'.encode('utf-32be'))
py> a
array('L', [220266496, 807469056, 3791650816, 1963196416, 4278190080, 536870912, 4194500608, 3036872704, 3288530944, 2969763840, 1107361792])
Getting the string out again is no harder:
py> bytes(a).decode('utf-32be')
'ℍℰâѵÿ Ϻεταł'
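One caveat with the session above: it only works where 'L' happens to be 4 bytes wide (on many 64-bit Unix builds it is 8, and the constructor then rejects a byte string whose length is not a multiple of 8), and the integers are byte-swapped code points because big-endian data was loaded on a little-endian machine. A rough sketch of a more portable variant follows. The names TYPECODE, CODEC, str_to_array and array_to_str are my own for illustration, not anything in the array module, and it assumes that either 'I' or 'L' is 4 bytes wide on the platform.

```python
import sys
from array import array

# Pick an unsigned type code whose items are exactly 4 bytes wide.
# Assumption: one of 'I' or 'L' qualifies; next() raises StopIteration otherwise.
TYPECODE = next(code for code in "IL" if array(code).itemsize == 4)

# Encoding in native byte order makes each array item equal to the code point.
CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def str_to_array(text):
    """Return a mutable array holding one code point per character."""
    return array(TYPECODE, text.encode(CODEC))


def array_to_str(arr):
    """Turn an array of code points back into a str."""
    return arr.tobytes().decode(CODEC)


a = str_to_array("ℍℰâѵÿ Ϻεταł")
a[0] = ord("H")          # mutate one character in place
print(array_to_str(a))   # Hℰâѵÿ Ϻεταł
```

With native byte order the stored values are the real code points, so a[i] == ord(s[i]) and single characters can be replaced by assigning ord(...) values.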
But having said that, it would be nice to have an array code which treated the values as single UTF-32 characters:
array('?', ['ℍ', 'ℰ', 'â', 'ѵ', 'ÿ', ' ', 'Ϻ', 'ε', 'τ', 'α', 'ł'])
if for no other reason than it looks nicer than a bunch of 32-bit ints.
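Until something like that exists, the nicer display can be approximated in pure Python. This is only a sketch: CharArray is a hypothetical class, not part of the standard library, and it again assumes that 'I' or 'L' is a 4-byte type code. It keeps 32-bit code points in an array and converts to and from characters at the boundary.

```python
from array import array
from collections.abc import MutableSequence


class CharArray(MutableSequence):
    """Hypothetical mutable sequence of characters stored as 32-bit code points."""

    def __init__(self, text=""):
        # Assumption: 'I' or 'L' is exactly 4 bytes wide on this platform.
        typecode = next(c for c in "IL" if array(c).itemsize == 4)
        self._data = array(typecode, map(ord, text))

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices come back as ordinary strings.
            return "".join(map(chr, self._data[index]))
        return chr(self._data[index])

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            self._data[index] = array(self._data.typecode, map(ord, value))
        else:
            self._data[index] = ord(value)

    def __delitem__(self, index):
        del self._data[index]

    def insert(self, index, value):
        self._data.insert(index, ord(value))

    def __str__(self):
        return "".join(map(chr, self._data))

    def __repr__(self):
        return f"CharArray({str(self)!r})"


chars = CharArray("ℍℰâѵÿ")
chars[0] = "H"        # replace a single character
chars.append("!")     # append() is supplied by MutableSequence via insert()
print(chars)          # Hℰâѵÿ!
```

The per-item ord()/chr() calls make this slower than a real type code would be, so it addresses the looks rather than the efficiency.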
-- Steven