[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Wed Jun 4 16:49:30 CEST 2014


Hello,

On Thu, 5 Jun 2014 00:26:10 +1000 Chris Angelico <rosuav at gmail.com> wrote:

On Thu, Jun 5, 2014 at 12:17 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> 04.06.14 10:03, Chris Angelico wrote:
>
>> Right, which is why I don't like the idea. But you don't need
>> non-ASCII characters to blink an LED or turn a servo, and there is
>> significant resistance to the notion that appending a non-ASCII
>> character to a long ASCII-only string requires the whole string to
>> be copied and doubled in size (lots of heap space used).
>
> But you need non-ASCII characters to display a title of MP3 track.

Yes, but to display a title, you don't need random access to codepoints - you need to either take a block of memory (length in bytes) and do something with it (pass it to a C function, transfer it over some bus, etc.), or iterate in order over the codepoints in a string. All these operations have the same complexity (in O-notation) for UTF-8 as for UTF-32.
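To illustrate the in-order iteration case, here is a minimal sketch in Python of a single forward pass over UTF-8 bytes. The name iter_codepoints is made up for this example and is not how uPy or CPython implement it; the sketch assumes well-formed UTF-8 and does no validation.

```python
def iter_codepoints(buf):
    """Yield codepoints from well-formed UTF-8 bytes in one forward pass.

    Hypothetical helper for illustration only; malformed input is not checked.
    """
    i = 0
    n = len(buf)
    while i < n:
        b = buf[i]
        if b < 0x80:          # 0xxxxxxx: 1-byte sequence (ASCII)
            cp, extra = b, 0
        elif b < 0xE0:        # 110xxxxx: lead byte of a 2-byte sequence
            cp, extra = b & 0x1F, 1
        elif b < 0xF0:        # 1110xxxx: lead byte of a 3-byte sequence
            cp, extra = b & 0x0F, 2
        else:                 # 11110xxx: lead byte of a 4-byte sequence
            cp, extra = b & 0x07, 3
        for k in range(1, extra + 1):
            # Each continuation byte (10xxxxxx) contributes 6 payload bits.
            cp = (cp << 6) | (buf[i + k] & 0x3F)
        i += extra + 1
        yield cp


# A track title with 1-, 2-, 3- and 4-byte sequences (written with escapes).
title_text = "Caf\u00e9 \u6771\u4eac \U0001F3B5"
title = title_text.encode("utf-8")
decoded = list(iter_codepoints(title))
```

Every byte is visited exactly once, so the pass is linear in the length of the buffer, the same order as walking a UTF-32 array.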

Some operations are not going to be as fast, so - oops - avoid doing them without good reason. And kindly drop the expectation that arbitrary operations on Unicode are as efficient as you imagined. (Note: Unicode in general, not the particular flavor you got used to, to the point of thinking it's the one and only "right" flavor.)
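An example of an operation that gets slower is indexing by codepoint position. The sketch below (again Python, with a hypothetical helper name codepoint_offset, assuming well-formed UTF-8) has to scan from the start of the buffer to find the byte offset of the N-th codepoint, whereas with UTF-32 the offset is simply N * 4.

```python
def codepoint_offset(buf, index):
    """Return the byte offset of the codepoint at position `index`.

    Hypothetical helper for illustration; assumes well-formed UTF-8.
    The cost grows with the offset, since every earlier byte is inspected.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    seen = 0
    for offset, b in enumerate(buf):
        if (b & 0xC0) != 0x80:   # not a continuation byte: a codepoint starts here
            if seen == index:
                return offset
            seen += 1
    raise IndexError("codepoint index out of range")


# "na\u00efve" contains one 2-byte character before the "v".
data = "na\u00efve \u6771\u4eac".encode("utf-8")
offset = codepoint_offset(data, 3)
```

This is an O(n) lookup per index, which is why it makes sense to avoid random codepoint access where in-order iteration will do.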

Agreed. IMO, any Python, no matter how micro, needs full Unicode support; but there is resistance from uPy's devs.

FUD ;-).

ChrisA

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com


