[Python-Dev] Internal representation of strings and Micropython

Jeff Allen ja.py at farowl.co.uk
Wed Jun 4 09:41:12 CEST 2014


Jython uses UTF-16 internally, probably the only sensible choice in a Python that can call Java. Indexing by code point is therefore O(N), fundamentally. By "fundamentally", I mean that this is the cost for any string that has not yet noticed that it contains no supplementary (>0xffff) characters; once a string knows it contains none, indexing it is O(1).
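To illustrate why indexing over UTF-16 is O(N) in general, here is a minimal Python sketch. It is not Jython's actual code: the helper names (utf16_units, code_point_at) and the bmp_only flag are hypothetical, invented for this example. The flag stands in for a string having "noticed" that it contains no supplementary characters.

```python
def utf16_units(s):
    """Encode a str as a list of UTF-16 code units (hypothetical helper)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def is_high_surrogate(unit):
    """True if the code unit is the first half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDBFF


def code_point_at(units, n, bmp_only=False):
    """Return the n-th code point (n >= 0) of a UTF-16 code unit list.

    bmp_only is an assumed flag meaning "this string is known to hold
    no supplementary characters", so unit index == code point index.
    """
    if bmp_only:
        return units[n]  # O(1): one code unit per character

    # General case: walk from the start, stepping over surrogate pairs.
    i = 0
    for _ in range(n):
        paired = is_high_surrogate(units[i]) and i + 1 < len(units)
        i += 2 if paired else 1

    unit = units[i]
    if is_high_surrogate(unit) and i + 1 < len(units):
        low = units[i + 1]
        # Combine the pair into one supplementary code point.
        return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
    return unit
```

For the string "a\U0001F600b" there are four code units but three characters, so the character at index 2 lives at unit index 3, and only a scan from the start can discover that.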

I've toyed with making this O(1) universally. Like Steven, I understand this to be a freedom afforded to implementers, rather than an issue of conformity.

Jeff Allen

On 04/06/2014 02:17, Steven D'Aprano wrote:

There is a discussion over at MicroPython about the internal representation of Unicode strings. ... My own feeling is that O(1) string indexing operations are a quality of implementation issue, not a deal breaker to call it a Python. I can't see any requirement in the docs that str[n] must take O(1) time, but perhaps I have missed something.


