[Python-Dev] Internal representation of strings and Micropython

Mark Lawrence breamoreboy at yahoo.co.uk
Wed Jun 4 15:29:51 CEST 2014


On 04/06/2014 11:53, Paul Sokolovsky wrote:

Hello,

On Tue, 3 Jun 2014 22:23:07 -0700 Guido van Rossum <guido at python.org> wrote:

[]

Never mind disabling assertions -- even with enabled assertions you'd have to expect most Python programs to fail with non-ASCII input.

Then again the UTF-8 option would be pretty devastating too for anything manipulating strings (especially since many Python APIs are defined using indexes, e.g. the re module).

If Unicode is slow (*), then the obvious choice is not to use Unicode when it is not needed. Too bad that's a bit hard in Python 3, as it enforces Unicode everywhere, and dealing with efficient strings requires prefixing them with funny characters like "b", etc.

(*) If Unicode is slow because it causes the heap to bloat and go to swap, the choice is still the same.
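
To make the indexing concern concrete: in a UTF-8 buffer each code point occupies one to four bytes, so finding code point number i means scanning from the start instead of jumping to a fixed offset. The following is only a minimal sketch of that cost; the helper names `utf8_offset` and `utf8_getitem` are hypothetical and do not describe how CPython or MicroPython implement strings.

```python
def utf8_offset(buf: bytes, index: int) -> int:
    """Return the byte offset of code point number `index` in UTF-8 `buf`.

    Hypothetical helper: a linear scan, so the cost grows with `index`.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    count = 0
    for offset, byte in enumerate(buf):
        # Continuation bytes have the form 0b10xxxxxx; any other byte
        # starts a new code point.
        if byte & 0xC0 != 0x80:
            if count == index:
                return offset
            count += 1
    raise IndexError("string index out of range")


def utf8_getitem(buf: bytes, index: int) -> str:
    """Return code point number `index` of UTF-8 `buf` as a str."""
    start = utf8_offset(buf, index)
    end = start + 1
    # Extend the slice over the continuation bytes of this code point.
    while end < len(buf) and buf[end] & 0xC0 == 0x80:
        end += 1
    return buf[start:end].decode("utf-8")


data = "naïve café €5".encode("utf-8")
print(utf8_getitem(data, 2))   # ï
print(utf8_getitem(data, 11))  # €
```

A fixed-width representation answers the same request with a single multiplication, which is the trade-off between storage and indexing speed under discussion here.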

Where is your evidence that (presumably) CPython unicode is slow? What is your response to this message http://bugs.python.org/issue16061#msg171413 from the bug tracker?

Why not support variable-width strings like CPython 3.4?

Because, like a good deal of the community, we hope that Python 4 will get back to reality, and strings will be efficient (both for processing and storage) by default, and the niche and marginal "Unicode string" type will be used explicitly (using funny prefixes, etc.), only when really needed.

Where is your evidence that supports the above claim?
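
For reference, the variable-width strings of CPython 3.3 and later (PEP 393) store each string with 1, 2 or 4 bytes per character, depending on the widest code point it contains. The sketch below estimates that width with `sys.getsizeof`; the helper name `bytes_per_char` is hypothetical, and the figures it prints depend on the interpreter, so treat it as an illustration for CPython rather than a guarantee for other implementations.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> float:
    """Estimate storage per character for a string made only of `ch`.

    Comparing two lengths cancels out the fixed object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e9"),      # é
    ("BMP", "\u20ac"),          # €
    ("astral", "\U0001F600"),   # emoji outside the BMP
]
for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On CPython 3.3 or later the reported widths should be 1, 1, 2 and 4 bytes respectively, so pure-ASCII text costs the same per character as a byte string.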

Ah, all these not so funny geek jokes about internals of language implementation, hope they didn't make somebody's day dull!

--
--Guido van Rossum (python.org/~guido)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence

