> but disallowing them in higher level
> explicitly cross platform abstractions like pathlib.


I think the trick here is that posix-using folks claim that filenames are just bytes, and indeed they can be passed around with a char*, so they seem to be.


But you can't possibly do anything other than pass them around if you REALLY think they are just bytes.

So really, people treat them as "bytes-in-some-arbitrary-encoding-where-at-least the-slash-character-(and maybe a couple others)-is-ascii-compatible".

If you assume that, then you could write a pathlib that would work. And in practice, I expect a lot of code designed only for posix works that way. But of course, this gets ugly if you go to a platform where filenames are not "bytes-in-some-arbitrary-encoding-where-at-least the-slash-character-(and maybe a couple others)-is-ascii-compatible", like Windows.
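To make that assumption concrete, here is a minimal sketch of path handling that treats every byte as opaque except the slash. The helper names (split_components, join_components, parent) are hypothetical and only for illustration; this is not how pathlib is implemented.

```python
from typing import List

SEP = b"/"  # the only byte this sketch assigns any meaning to


def split_components(path: bytes) -> List[bytes]:
    """Split a bytes path on the slash byte; all other bytes stay opaque."""
    return [part for part in path.split(SEP) if part]


def join_components(parts: List[bytes], absolute: bool = False) -> bytes:
    """Join components with the slash byte, optionally as an absolute path."""
    joined = SEP.join(parts)
    return SEP + joined if absolute else joined


def parent(path: bytes) -> bytes:
    """Return the parent directory without ever decoding the path."""
    parts = split_components(path)
    return join_components(parts[:-1], absolute=path.startswith(SEP))


# Hypothetical filename in Latin-1: not valid UTF-8, but still splittable.
raw = b"/home/user/caf\xe9.txt"
components = split_components(raw)
parent_dir = parent(raw)
```

Nothing here needs to know the encoding of the name, which is the point: the code only works because the slash is a single ASCII-compatible byte.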

I'm not sure if it's worth having a pathlib, etc. that uses this assumption -- but it could help us all write code that actually works with this screwy lack of specification.

Antoine Pitrou wrote:

> To elaborate specifically about pathlib, it doesn't handle bytes paths
> but allows you to generate them if desired:
>
> https://docs.python.org/3/library/pathlib.html#operators


But that uses os.fsencode, which encodes the filename to the filesystem encoding.

As I understand it, the whole problem with some posix systems is that there is NO filesystem encoding -- i.e. you can't know for sure what encoding a filename is in. So you need to be able to pass the bytes through as they are.

(At least as I read Armin Ronacher's blog)
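For reference, a small sketch of what "passing the bytes through" can look like in Python 3. As far as I know, on POSIX os.fsdecode and os.fsencode use the "surrogateescape" error handler, so undecodable bytes survive a round trip through str. The filename below is made up, and UTF-8 is used explicitly only so the example does not depend on the local filesystem encoding.

```python
import os

raw = b"caf\xe9.txt"  # hypothetical filename: Latin-1 bytes, invalid as UTF-8

# Strict decoding fails, because the bytes are not valid UTF-8.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code point.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")

# On POSIX, os.fsdecode/os.fsencode apply the same idea with whatever
# Python considers the filesystem encoding (guarded, since Windows differs).
if os.name == "posix":
    fs_round_tripped = os.fsencode(os.fsdecode(raw))
else:
    fs_round_tripped = raw
```

The resulting str is not printable or encodable in strict mode, so it is still "just bytes" in disguise, but it can be handed back to the OS unchanged.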

-Chris


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division

NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax

Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov