[Python-Dev] Missing FAQ about Python3 and unicode
Victor Stinner victor.stinner at haypocalc.com
Wed Dec 31 01:49:32 CET 2008
Hi,
We keep getting the same questions about Python3 and unicode. Maybe it's time to start a FAQ? Here is a rough draft to start it ;-)
(1) Exit on undecodable command line arguments
    $ LANG=en_GB.UTF-8 python3.0 test.py $'\xff'
    Could not convert argument 2 to string
    $

Is this the expected behaviour? Yes!
Example of the question: http://bugs.python.org/issue3023
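The failure in (1) comes down to a byte sequence that is not valid in the locale encoding. A minimal sketch of that check, assuming UTF-8 as the encoding (as with LANG=en_GB.UTF-8 above); the helper name is_decodable is hypothetical and this is not how the interpreter itself is implemented:

```python
def is_decodable(raw_argument, encoding="utf-8"):
    """Return True if the raw bytes of an argument decode cleanly."""
    try:
        raw_argument.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# b"\xff" is never valid UTF-8, which is why the example above fails.
print(is_decodable(b"\xff"))
# A valid UTF-8 byte sequence decodes without trouble.
print(is_decodable(b"caf\xc3\xa9"))
```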
(2) Undecodable filenames
os.listdir(str)->str raises an exception on undecodable filenames.
Solution: use os.listdir(bytes)->bytes. To display the filename to the user, use a function like:
    import sys

    def humanFilename(filename):
        # filename is bytes, as returned by os.listdir(bytes)
        encoding = sys.getfilesystemencoding()
        return filename.decode(encoding, "replace")
See also http://bugs.python.org/issue3187
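Putting the two pieces of (2) together might look like the sketch below: list the directory with a bytes path, then decode each name with the "replace" error handler for display only. The function name list_human_filenames is hypothetical, and the decoded names are meant for showing to the user, not for reopening the files:

```python
import os
import sys


def list_human_filenames(directory):
    """List a directory given as bytes and return displayable names."""
    encoding = sys.getfilesystemencoding()
    names = []
    # A bytes argument makes os.listdir() return bytes filenames.
    for raw_name in os.listdir(directory):
        # Undecodable bytes become U+FFFD instead of raising an error.
        names.append(raw_name.decode(encoding, "replace"))
    return sorted(names)


print(list_human_filenames(b"."))
```

Keep the original bytes name around if the file has to be opened later, since the "replace" decoding is lossy.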
(3) Bytes environment variables
Python 3.0 only supports decodable variables in os.environ. Undecodable variables are skipped when os.environ is created, but the original variables still exist at the C level, so child processes still see them:
    $ A=$(echo -e "\xff") B=c ./python
    Python 3.1a0 (py3k:67973M, Dec 31 2008, 00:51:49)
    >>> import os
    >>> os.environ.get('A'), os.environ.get('B')
    (None, 'c')
    >>> retcode=os.system('echo -n $A|hexdump -C')
    00000000  ff                                                |.|
    00000001
    >>> retcode=os.system('echo -n $B|hexdump -C')
    00000000  63                                                |c|
    00000001
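The same trick as in the session above (asking a child shell for the value) can be sketched from Python to get the raw bytes back. This assumes a POSIX sh is available; the helper name raw_environment_variable and the variable FAQ_DEMO are hypothetical:

```python
import os
import re
import subprocess


def raw_environment_variable(name):
    """Return the raw bytes of a variable as seen by a child shell."""
    # Only accept plain shell variable names, since the name is
    # interpolated into the shell command below.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("not a valid variable name: %r" % (name,))
    # printf %s prints the value without a trailing newline.
    command = 'printf %s "$' + name + '"'
    return subprocess.check_output(["sh", "-c", command])


# The child process inherits the environment of this process.
os.environ["FAQ_DEMO"] = "c"
print(raw_environment_variable("FAQ_DEMO"))
```

An unset variable comes back as empty bytes, so this sketch cannot tell "unset" from "set to the empty string".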
Discussion to support bytes environment variables: http://mail.python.org/pipermail/python-dev/2008-December/083856.html
-- Victor Stinner aka haypo http://www.haypocalc.com/blog/