[Python-Dev] PEP 414 - Unicode Literals for Python 3
Vinay Sajip vinay_sajip at yahoo.co.uk
Tue Feb 28 07:56:31 CET 2012
R. David Murray <rdmurray at bitdance.com> writes:
> The rationale claims there's no way to spell "native string" if you use unicode_literals, which is not true.
> It would be different from u('') in that I would expect that there are far fewer instances where 'native string' is required than there are places where unicode strings work (and should therefore be preferred).
A couple of people have said that 'native string' is spelt 'str', but I'm not sure that's the right answer. For example, 2.x's cStringIO.StringIO expects native strings, not Unicode:
>>> from cStringIO import StringIO
>>> s = StringIO(u'\xe9')
>>> s
<cStringIO.StringI object at 0x232de40>
>>> s.getvalue()
'\xe9\x00\x00\x00'
Of course, you can't call str() on that value to get a native string:
>>> str(u'\xe9')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
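The error above comes from the implicit ASCII encode that Python 2's str() applies to a unicode object under the default encoding. As a rough sketch that runs on both Python 2 and 3, the same failure can be reproduced with an explicit encode, alongside an explicit UTF-8 encode that succeeds (the variable names are just for illustration):

```python
text = u'\xe9'

# Encoding to ASCII fails for any code point above 127, which is what
# Python 2's str() effectively attempts on a unicode object by default.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Choosing the encoding explicitly avoids the error.
utf8_bytes = text.encode('utf-8')
```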
So I think using str will not give the desired effect in some situations. In Django, I used a function that resolves differently depending on the Python version, something like
def native(literal): return literal
on Python 3, and
def native(literal): return literal.encode('utf-8')
on Python 2.
I'm not saying this is the right thing to do for all cases - just that str() may not be, either. This should be elaborated in the PEP.
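For illustration, the two definitions could be folded into a single version-checked helper. This is only a sketch, not the actual Django code: the sys.version_info test, the PY3 name and the isinstance guard are additions for the example, and UTF-8 is an assumption that will not suit every API.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def native(literal):
        # On Python 3, str is already Unicode, so return the literal unchanged.
        return literal
else:
    def native(literal):
        # On Python 2, the native str type is a byte string, so encode
        # unicode input. UTF-8 is an assumption; some APIs may need another
        # encoding.
        if isinstance(literal, unicode):
            return literal.encode('utf-8')
        return literal
```

Either way, native(u'\xe9') returns an instance of that version's str type, which is what code asking for a "native string" expects.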
Regards,
Vinay Sajip