[Python-Dev] lone surrogates in utf-8
Antoine Pitrou solipsis at pitrou.net
Tue Apr 28 15:13:37 CEST 2009
- Previous message: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
- Next message: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hrvoje Niksic <hrvoje.niksic avl.com> writes:
"Should be considered" or "will be considered"? Python 3.0's UTF-8 decoder happily accepts it and returns u'\udcff':

>>> b'\xed\xb3\xbf'.decode('utf-8')
'\udcff'
Yes, there is already a bug entry for it: http://bugs.python.org/issue3672
I think we could happily fix it for 3.1 (perhaps leaving 2.7 unchanged for compatibility reasons - I don't know if some people may rely on the current behaviour).
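
For illustration, here is a minimal sketch of what the stricter behaviour looks like from user code. It assumes a Python 3 interpreter whose strict UTF-8 decoder rejects lone surrogates and which provides the 'surrogatepass' error handler. The helper names are hypothetical and exist only for this example.

```python
# The byte sequence from the quoted example: a UTF-8-style encoding of U+DCFF,
# which is a lone (unpaired) low surrogate.
LONE_SURROGATE_BYTES = b'\xed\xb3\xbf'


def decode_strict(data):
    """Decode with the default 'strict' handler; return None if rejected.

    Hypothetical helper: assumes the decoder raises UnicodeDecodeError
    for lone surrogates instead of returning them.
    """
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        return None


def decode_allowing_surrogates(data):
    """Decode while explicitly opting in to surrogate code points.

    Hypothetical helper: relies on the 'surrogatepass' error handler.
    """
    return data.decode('utf-8', 'surrogatepass')


if __name__ == '__main__':
    # Strict decoding is expected to fail on the lone surrogate.
    print(decode_strict(LONE_SURROGATE_BYTES))
    # Opting in reproduces the old result; ascii() keeps the output printable.
    print(ascii(decode_allowing_surrogates(LONE_SURROGATE_BYTES)))
```

Code that relied on the old, permissive behaviour would have to opt in explicitly, as the second helper does, which is the compatibility concern mentioned above.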