| msg132192 |
Author: Thomas Kluyver (takluyver) * |
Date: 2011-03-26 00:28 |
| To replicate, in Python 3.1 on Linux (UTF-8 console):
>>> print(chr(0x9000))
退
Copy and paste this character into the prompt. It appears correctly (as a Chinese character). Then:
>>> import readline
>>> readline.parse_and_bind('"\M-i":"    "')
Now try to paste the character again: it appears as "    ��" (four spaces, two unknown character symbols), and if you press return, you get a SyntaxError.
This happens with every character whose UTF-8 encoding begins with the byte \xe9, that is, U+9000 through U+9FFF. If the terminal encoding is changed to cp1252, I'm told that the same thing can be achieved with é, which is the single byte \xe9 there. |
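The byte-level claim in the report can be checked without readline at all. The following is a minimal sketch (variable names are illustrative, not from the report) showing that U+9000 starts with the byte 0xE9 in UTF-8, that this lead byte covers exactly U+9000 through U+9FFF, and that é is the single byte 0xE9 in cp1252.

```python
# U+9000 encodes to three bytes in UTF-8; the first one is 0xE9.
encoded = chr(0x9000).encode("utf-8")
print(encoded)

# Collect every code point whose UTF-8 encoding starts with the byte 0xE9.
# Surrogates (U+D800..U+DFFF) are skipped because they cannot be encoded.
lead_e9 = [
    cp
    for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF
    and chr(cp).encode("utf-8")[0] == 0xE9
]
print(hex(lead_e9[0]), hex(lead_e9[-1]), len(lead_e9))

# In cp1252 the character é (U+00E9) is the single byte 0xE9.
e_acute_cp1252 = "\u00e9".encode("cp1252")
print(e_acute_cp1252)
```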
|
|
| msg141435 |
Author: Petri Lehtinen (petri.lehtinen) *  |
Date: 2011-07-30 10:41 |
| You're binding the M-i keyboard sequence. Could it be that the \xe9 byte is translated by the terminal to M-i, and that causes the interference? In this case, it's not really a bug. |
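One way to see why M-i and \xe9 could collide: under the traditional convention where Meta is encoded by setting the eighth bit of the key's ASCII code, M-i is the byte 0xE9. The sketch below assumes that convention (whether a given terminal actually sends this byte or an ESC prefix depends on its configuration); the helper name `meta` is hypothetical.

```python
META_BIT = 0x80  # eighth bit, the traditional encoding of the Meta modifier


def meta(key: str) -> int:
    """Return the byte value of Meta+key under the eighth-bit convention."""
    return ord(key) | META_BIT


meta_i = meta("i")
print(hex(meta_i))

# Compare with the first byte of U+9000 encoded as UTF-8.
first_byte = chr(0x9000).encode("utf-8")[0]
print(meta_i == first_byte)
```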
|
|
| msg175836 |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-11-18 00:15 |
| Original bug report: https://github.com/ipython/ipython/issues/58 |
|
|
| msg175837 |
Author: STINNER Victor (vstinner) *  |
Date: 2012-11-18 00:26 |
| I confirm that the issue exists, but I don't think that it comes from Python. I bet that the readline library operates on *byte* strings, not *character* strings, and so cannot correctly handle multibyte characters like the Chinese character U+9000. |
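A toy model of byte-wise key dispatch reproduces the reported symptom. This is only an illustration of the hypothesis, not readline's actual code: the `feed` function and `keymap` dictionary are invented for the sketch, and the four-space macro follows the "four spaces" described in the report.

```python
def feed(raw: bytes, keymap: dict) -> str:
    """Process input one byte at a time, expanding any byte bound in keymap."""
    out = bytearray()
    for byte in raw:
        if byte in keymap:
            # The bound byte is consumed and replaced by the macro text.
            out += keymap[byte].encode("ascii")
        else:
            out.append(byte)
    # The leftover continuation bytes no longer form valid UTF-8.
    return bytes(out).decode("utf-8", errors="replace")


keymap = {0xE9: "    "}  # M-i bound to four spaces, as a single byte
result = feed(chr(0x9000).encode("utf-8"), keymap)
print(repr(result))
```

With the lead byte swallowed by the binding, the two remaining bytes (0x80 0x80) are orphaned continuation bytes, which is consistent with four spaces followed by two unknown-character symbols.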
|
|
| msg175854 |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-11-18 10:06 |
| Yes, this is a readline issue. Add the line '"\M-i":"    "' to ~/.inputrc, run the 'rlwrap cat' command, paste this multibyte character, and you get the same result. This is not a Python bug. |
|
|
| msg175954 |
Author: Thomas Kluyver (takluyver) * |
Date: 2012-11-19 10:45 |
| OK, thanks, and sorry for the noise. I've closed this issue. Looking at the readline manual, it looks like this is tied up with the options input-meta, output-meta and convert-meta. Fiddling around with .inputrc hasn't clarified exactly what they do, but it seems that the terminal can handle either Unicode input or shortcuts involving Meta (Alt), but not both. |
|
|