[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Wed Jun 4 12:38:57 CEST 2014


Hello,

On Wed, 4 Jun 2014 12:32:12 +1000 Chris Angelico <rosuav at gmail.com> wrote:

On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> * Having a build-time option to restrict all strings to ASCII-only.
>
>   (I think what they mean by that is that strings will be like
>   Python 2 strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)

What I was actually suggesting along those lines was that the str type still be notionally a Unicode string, but that any codepoints >127 would either raise an exception or blow an assertion,

That's another reason why people don't like Unicode enforced upon them

Once again, my claim is that what MicroPython implements now is more correct

And I'm saying this not to discourage adding Unicode to MicroPython, but to hint that the "force-force" approach implemented by CPython 3, which is causing rage and a split in the community, is not appreciated.

and all the code to handle multibyte representations would be compiled out. So there'd still be a difference between strings of text and streams of bytes, but all encoding and decoding to/from ASCII-compatible encodings would just point to the same bytes in RAM.
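To make the "same bytes in RAM" idea concrete, here is a minimal sketch in Python. The AsciiStr class and its methods are hypothetical names made up for illustration; this is not how MicroPython or CPython actually implements str. It assumes an ASCII-only build where any code point above 127 is rejected at construction time, so encoding to or decoding from an ASCII-compatible encoding never has to copy or transform anything.

```python
class AsciiStr:
    """Hypothetical ASCII-only text type (illustration only)."""

    __slots__ = ("_buf",)

    def __init__(self, buf: bytes):
        # Reject anything outside 7-bit ASCII with a real exception.
        if any(b > 127 for b in buf):
            raise ValueError("code point > 127 in ASCII-only build")
        self._buf = buf

    @classmethod
    def decode(cls, data: bytes) -> "AsciiStr":
        # Decoding is only a validity check; the buffer is shared, not copied.
        return cls(data)

    def encode(self) -> bytes:
        # For ASCII-only text, the encoded form in any ASCII-compatible
        # encoding is byte-for-byte the stored buffer.
        return self._buf

    def __len__(self) -> int:
        # One byte per character, so length is just the buffer length.
        return len(self._buf)


raw = b"hello"
text = AsciiStr.decode(raw)
same_object = text.encode() is raw  # the text and the bytes share storage
```

The text type and the bytes type stay distinct at the API level, which is the point of the suggestion; only the storage is shared.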

Risk: Someone would implement that with assertions, then compile with assertions disabled, test only with ASCII, and have lurking bugs.

ChrisA
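The risk described above can be reproduced in plain Python, as a rough analogy to a C build with assertions compiled out. The function names below are hypothetical. The sketch relies on the optimize argument of the built-in compile(): with optimize=1 (the equivalent of running python -O) assert statements are stripped, so the assert-based check silently lets non-ASCII bytes through, while the explicit check still raises.

```python
SRC = '''
def check_with_assert(data):
    # Disappears entirely when assertions are compiled out.
    assert all(b < 128 for b in data), "non-ASCII byte"
    return data

def check_explicit(data):
    # Survives any optimization level.
    if any(b >= 128 for b in data):
        raise ValueError("non-ASCII byte")
    return data
'''


def load(optimize):
    """Compile SRC at the given optimization level and return its namespace."""
    namespace = {}
    exec(compile(SRC, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace


debug_build = load(0)    # assertions enabled
release_build = load(1)  # assertions stripped, as with python -O

bad = b"\xc3\xa9"  # UTF-8 encoding of a non-ASCII character

# In the "release" build the assert-based check accepts bad input unnoticed.
leaked = release_build["check_with_assert"](bad)
```

A test suite that only feeds ASCII input would pass against both builds, which is exactly how the bug stays hidden.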

--
Best regards,
 Paul  mailto:pmiscml at gmail.com