Issue 742290: unicode "support" for shlex.py

Created on 2003-05-23 12:47 by jvr, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Files
File name | Uploaded | Description
shlex.patch | jvr, 2003-05-23 12:48 | shlex.py unicode "support"
Messages (4)
msg43824 - Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-23 12:47
Due to shlex.py's use of cStringIO, it behaves badly when fed unicode strings. The attached patch fixes that by always using StringIO instead of cStringIO.
msg43825 - Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-23 12:57
Ugh, I take that back: it doesn't fix it, there's a gross snippet in shlex.py that makes it barf:

    if self.posix:
        self.wordchars += ('??·???ÂÊÁËÈÍÎÏÌÓÔ?ÒÚÛÙ??¯???¸???'
                           '¿¡¬????«»? ÀÃÕ????????÷ÿ??????')

Help. I'd love to fix this, but I'm not sure what would be correct (my intuition says to just yank the above snippet, but I'm sure that'll make _someone_ unhappy...).
msg43826 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-24 12:32
To test whether a letter is a wordchar, you should check whether it .isalnum() or equals '_'. Then you can do away with self.wordchars, and it works the same for byte strings and Unicode strings. Non-ASCII characters in byte strings then work if locale.setlocale has been invoked.
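The check suggested above can be sketched as follows. This is a minimal illustration, not code from the attached patch or from shlex.py: the names is_wordchar and split_words are hypothetical, the splitter ignores quoting and escapes entirely, and it is written for Python 3, where str.isalnum() is Unicode-aware without any locale setup.

```python
def is_wordchar(ch):
    """Hypothetical helper: True if ch is alphanumeric or an underscore."""
    return ch.isalnum() or ch == "_"


def split_words(text):
    """Hypothetical, simplified splitter: runs of word characters become
    tokens; every other character acts as a separator and is dropped."""
    words, current = [], []
    for ch in text:
        if is_wordchar(ch):
            current.append(ch)
        elif current:
            # A non-word character ends the token collected so far.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


print(split_words("héllo wörld_1 -x"))
```

With a predicate like this there is no hard-coded table of accented letters to maintain, which is the point of dropping self.wordchars.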
msg43827 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-07 22:07
I'll reject that patch for now. If you manage to complete it, feel free to reopen or submit a new one.
History
Date | User | Action | Args
2022-04-10 16:08:51 | admin | set | github: 38540
2003-05-23 12:47:58 | jvr | create |