[Python-Dev] Unicode filenames

Guido van Rossum guido@python.org
Sun, 09 Feb 2003 08:20:30 -0500


> MacOSX fully supports unicode filenames (utf-8 is used throughout), and I'm tempted to set Py_FileSystemDefaultEncoding to "utf8" for OSX. Jack pointed me to a long thread about unicode filenames that took place on python-dev last year, but I can't deduce from it whether there are any disadvantages to setting Py_FileSystemDefaultEncoding.
>
> Setting it seems to work wonderfully. However, I'm a bit surprised that os.listdir() doesn't return unicode strings. Is that because it would break too much code?

I think that's shallow: the special-casing of unicode_file_names() only exists in the Windows branch of the code.
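
To make concrete what unicode results from os.listdir() would amount to, here is a minimal sketch. It assumes a modern Python 3 interpreter rather than the code under discussion, and listdir_decoded is a hypothetical helper, not an existing API. It lists a directory as raw bytes and decodes each name with the filesystem encoding, which is roughly what a non-Windows branch would need to do.

```python
import os
import sys


def listdir_decoded(path, encoding=None):
    """Hypothetical helper: list `path` as bytes, then decode each name.

    Assumption: names are stored in the filesystem encoding reported by
    sys.getfilesystemencoding() unless another encoding is passed in.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # Passing a bytes path makes os.listdir() return bytes names.
    raw_names = os.listdir(os.fsencode(path))
    # "replace" keeps undecodable names visible instead of raising.
    return [name.decode(encoding, "replace") for name in raw_names]
```

Decoding with "replace" is a design choice for the sketch only: it never fails on a name that is not valid in the chosen encoding, at the cost of not being able to round-trip that name back to the filesystem.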

> BTW, if I try to create a file with an 8-bit filename which is not valid utf-8, I get a strange error:
>
> >>> f = open("\xff", "w")
> Traceback (most recent call last):
>   File "", line 1, in ?
> IOError: invalid mode: w
> >>>
>
> This exception is thrown when errno is EINVAL, which apparently can also mean that the filename arg is bad. Not sure if we can fix this.

I think we should (maybe we already do) check the mode string more carefully ourselves, and not rely on undocumented correlations between error returns.
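
A rough sketch of that idea, written in Python rather than in the C code where the real check would live: validate the mode string up front, so that a later EINVAL can no longer be blamed on the mode. The names check_mode and open_checked are hypothetical, and the accepted mode grammar (one of r/w/a followed by optional b, t, +) is an assumption chosen for illustration.

```python
import errno


def check_mode(mode):
    """Hypothetical helper: reject a malformed mode string ourselves."""
    if not mode or mode[0] not in "rwa":
        raise ValueError("invalid mode: %r" % (mode,))
    rest = mode[1:]
    # Only 'b', 't' and '+' may follow, each at most once.
    if any(ch not in "bt+" for ch in rest) or len(set(rest)) != len(rest):
        raise ValueError("invalid mode: %r" % (mode,))
    if "b" in rest and "t" in rest:
        raise ValueError("invalid mode: %r" % (mode,))


def open_checked(path, mode="r"):
    """Hypothetical wrapper: open `path` after validating `mode`."""
    check_mode(mode)
    try:
        return open(path, mode)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # The mode was already validated, so EINVAL must refer to
            # something else, most likely the filename.
            raise OSError(
                errno.EINVAL,
                "invalid argument (possibly a bad filename)",
                path,
            ) from exc
        raise
```

With the mode checked first, the error message for EINVAL no longer has to guess: a bad mode is reported as a ValueError before the OS is involved, and anything the OS rejects afterwards is reported against the filename.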

--Guido van Rossum (home page: http://www.python.org/~guido/)