[Python-Dev] Bytes path support

"Martin v. Löwis" martin at v.loewis.de
Thu Aug 21 14:54:36 CEST 2014


On 19.08.14 19:43, Ben Hoyt wrote:

The official policy is that we want them [support for bytes paths in stdlib functions] to go away, but reality so far has not budged. We will continue to hold our breath though. :-)

Does that mean that new APIs should explicitly not support bytes? I'm thinking of os.scandir() (PEP 471), which I'm implementing at the moment. I was originally going to make it support bytes so it was compatible with listdir, but maybe that's a bad idea. Bytes paths are essentially broken on Windows.

Bytes paths are "essential" on Unix, though, so I don't think we should create new low-level APIs that don't support bytes.

Fair enough. I don't quite understand, though -- why is the "official policy" to kill something that's "essential" on *nix?

I think the people defending the "Unix file names are just bytes" side often miss an important detail: displaying file names to the user, and allowing the user to enter file names.

A script that just needs to traverse a directory tree and look at files by certain criteria can easily do so without worrying about a text interpretation of the file names.
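To illustrate that case, here is a minimal sketch of a bytes-only traversal. The function name find_large_files and the size criterion are made up for the example; the only thing it relies on is that os.walk(), os.path.join() and os.lstat() accept bytes paths and hand bytes back.

```python
import os

def find_large_files(root, min_size):
    """Return the bytes paths under `root` whose size is at least `min_size`.

    `root` is a bytes path; every name produced by os.walk() is then bytes
    as well, so no file name is ever decoded.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)  # bytes + bytes -> bytes
            try:
                if os.lstat(path).st_size >= min_size:
                    matches.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return matches
```

Called as find_large_files(b"/some/dir", 1024), this never needs to know which encoding, if any, the names were written in.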

When it comes to user interaction, it becomes apparent that, even on Unix, file names are not just bytes. If you do "ls -l" in your shell, the "system" (not just the kernel - but ultimately the terminal program, which might be the console driver, or an X11 application) will interpret each file name as having an encoding, and render it with a font.
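As a rough sketch of what that means in Python 3: the "surrogateescape" error handler (PEP 383) lets an undecodable name round-trip through str, while showing the name to a user forces some lossy choice. The helper names below are hypothetical, and the encoding is fixed to UTF-8 only to keep the example deterministic; os.fsdecode() and os.fsencode() do the same thing with the actual file system encoding.

```python
def to_text(raw, encoding="utf-8"):
    """Decode a bytes file name; undecodable bytes become lone surrogates."""
    return raw.decode(encoding, "surrogateescape")

def to_bytes(text, encoding="utf-8"):
    """Inverse of to_text(): recover the exact original bytes."""
    return text.encode(encoding, "surrogateescape")

def for_display(raw, encoding="utf-8"):
    """Lossy version for showing to a user: bad bytes become U+FFFD."""
    return raw.decode(encoding, "replace")

# A name written by a Latin-1 program, read under a UTF-8 assumption.
raw_name = b"caf\xe9.txt"
text_name = to_text(raw_name)          # contains the lone surrogate U+DCE9
assert to_bytes(text_name) == raw_name  # round-trips exactly
shown = for_display(raw_name)           # replacement character instead of the e-acute
```

The round-trip keeps the "process all files" case working with str, but the displayed form is only right if the guessed encoding matches the one the name was created with.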

So for Python, the question is: which of the use cases (processing all files, vs. showing them to the user) should be better supported? Python 3 took the latter as an answer, under the assumption that this is the more common case.

Regards, Martin


