Tutorial section "3.1.3 Unicode Strings" contains:

    >>> u"äöü"
    u'\xe4\xf6\xfc'

It looks like latin1 is used as the encoding for this Unicode literal. This is, however, justified neither by the Python language specification nor by common sense ;-) I'd suggest that this be changed in the tutorial. Furthermore, I'd suggest that such use of Unicode literals raise errors instead. I'm assigning it to Martin, because he's one of the Unicode gurus :)
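For illustration only, here is a minimal sketch of why the result above looks Latin-1 based. It is written for Python 3 rather than the Python 2 interpreter discussed in this report, and all variable names are hypothetical. It shows that the Latin-1 bytes of the three characters map one-to-one onto the code points in the repr, while the UTF-8 bytes of the same text would give a different result if read as Latin-1.

```python
# Illustrative Python 3 sketch; variable names are hypothetical.
text = "\u00e4\u00f6\u00fc"  # the three characters ä, ö, ü

latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# In Latin-1 each character is one byte whose value equals its code point,
# so decoding the bytes as Latin-1 gives back the original three characters.
from_latin1 = latin1_bytes.decode("latin-1")

# The UTF-8 form uses two bytes per character; reading those bytes as
# Latin-1 turns every byte into its own character (six in total).
misread = utf8_bytes.decode("latin-1")

print(ascii(from_latin1))  # '\xe4\xf6\xfc'
print(ascii(misread))      # '\xc3\xa4\xc3\xb6\xc3\xbc'
```

So a result of u'\xe4\xf6\xfc' is what one would expect if the terminal sent Latin-1 bytes and they were interpreted as Latin-1.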
Logged In: YES user_id=163326 I *do* get the warnings when executing scripts, but I do *not* get the warnings at the interactive prompt. Lowering priority accordingly. Feel free to close this if this is indeed the intended behaviour.
Logged In: YES user_id=163326 Oh, I almost forgot about my original problem. Such behaviour shouldn't be encouraged in the tutorial any longer, of course. Maybe I can find a better wording and submit a patch soonish. Now that I understand this better, assigning this to myself :)
Logged In: YES user_id=21627 The intended behaviour is that Unicode in the interactive mode "works", and assumes that the actual input is encoded according to the locale's encoding (i.e. sys.stdin.encoding). That isn't implemented yet (and perhaps not even specified). However, it is most likely the case that this is a doc bug in the reference manual, not in the tutorial. For this to work, the interactive mode needs to be able to pass the encoding to the parser, perhaps giving a Unicode object as the argument instead of a byte string (at least IDLE might want to do that). Passing Unicode objects to eval/exec is for further study, though.
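The idea of decoding interactive input according to sys.stdin.encoding could be sketched roughly as follows, in Python 3. This is only an assumption about how such a step might look, not how the interpreter does it: decode_interactive_input is a hypothetical helper, and the fallback to locale.getpreferredencoding is likewise an assumption for the case where stdin reports no encoding.

```python
import locale
import sys


def decode_interactive_input(raw, encoding=None):
    """Decode raw input bytes into text (hypothetical helper).

    If no encoding is given, fall back to sys.stdin.encoding, and then
    to the locale's preferred encoding (an assumed fallback).
    """
    if encoding is None:
        encoding = getattr(sys.stdin, "encoding", None)
    if not encoding:
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# The same three characters arrive as different bytes depending on the
# terminal's encoding, but decode to the same text once it is known.
from_latin1_terminal = decode_interactive_input(b"\xe4\xf6\xfc", "latin-1")
from_utf8_terminal = decode_interactive_input(b"\xc3\xa4\xc3\xb6\xc3\xbc", "utf-8")
```

With the encoding passed along, both terminals yield the same three code points, which is the "works" behaviour described above.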
Logged In: YES user_id=163326 I just read PEP 0263, which says: """ In Python 2.1, Unicode literals can only be written using the Latin-1 based encoding "unicode-escape". """ So the tutorial is correct, and the things Martin suggested are only possible improvements to Python; there's no actual bug here. So I'll close this one.
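As a hedged illustration of what "Latin-1 based" means for the "unicode-escape" encoding quoted from the PEP, the following Python 3 sketch uses the standard unicode_escape codec. The variable names are hypothetical, and the sketch shows the codec's behaviour in Python 3, not the Python 2.1 parser itself.

```python
# Illustrative Python 3 sketch; variable names are hypothetical.

# Bytes that are not part of a backslash escape are treated as Latin-1.
raw_latin1 = b"\xe4\xf6\xfc"
decoded_raw = raw_latin1.decode("unicode_escape")

# The same characters written as explicit \uXXXX escapes (ASCII only).
escaped = rb"\u00e4\u00f6\u00fc"
decoded_escaped = escaped.decode("unicode_escape")

# UTF-8 input is not understood: each byte is read as a Latin-1 character.
decoded_utf8 = b"\xc3\xa4".decode("unicode_escape")

print(ascii(decoded_raw))      # '\xe4\xf6\xfc'
print(ascii(decoded_escaped))  # '\xe4\xf6\xfc'
print(ascii(decoded_utf8))     # '\xc3\xa4'
```

Writing the characters as \uXXXX escapes gives the same result regardless of the terminal's encoding, which is one way to sidestep the ambiguity raised in the original report.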