[Python-Dev] Internal representation of strings and Micropython

Stephen J. Turnbull stephen at xemacs.org
Wed Jun 4 11:36:20 CEST 2014


dw+python-dev at hmmz.org writes:

Given the specialized kinds of application this Python implementation is targeted at, it seems UTF-8 is ideal considering the huge memory savings resulting from the compressed representation,

I think you really need to check what the applications are in detail. UTF-8 costs about 35% more storage for Japanese, and even more for Chinese, than does UTF-16. So if you might be using a lot of Asian localized strings, it might even be worth implementing PEP 393 (the flexible string representation) to get the best of both worlds for most strings.
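A rough way to check the storage trade-off for a given application is to encode representative strings both ways and compare byte counts. The sketch below is only an illustration: the sample strings and helper names are made up for this example, and the overhead it reports depends heavily on how much ASCII is mixed into the text. The last helper mimics the per-string width choice described in PEP 393 (1, 2, or 4 bytes per code point, chosen by the largest code point in the string); it is not how any particular implementation stores strings.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a BOM."""
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))  # "-le" avoids the 2-byte BOM
    return utf8, utf16


def utf8_overhead_percent(text):
    """Extra storage of UTF-8 relative to UTF-16, in percent (negative = saving)."""
    utf8, utf16 = encoded_sizes(text)
    return 100.0 * (utf8 - utf16) / utf16


def pep393_bytes_per_char(text):
    """Width a PEP 393 style representation would pick for this string."""
    largest = max((ord(ch) for ch in text), default=0)
    if largest < 0x100:
        return 1
    if largest < 0x10000:
        return 2
    return 4


# Hypothetical sample strings; substitute text from the real application.
samples = {
    "ascii": "hello world",
    "japanese": "こんにちは世界",
    "chinese": "你好，世界",
    "mixed": "Python は楽しい",
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(name, utf8, utf16,
          round(utf8_overhead_percent(text), 1),
          pep393_bytes_per_char(text))
```

Running this over a sample of the strings an application actually ships (messages, labels, identifiers) gives a better basis for choosing a representation than either encoding's best case.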


