> but disallowing them in higher level
> explicitly cross platform abstractions like pathlib.
I think the trick here is that POSIX-using folks claim that filenames are just bytes, and indeed they can be passed around with a char*, so they seem to be.
But you can't possibly do anything other than pass them around if you REALLY think they are just bytes.
So really, people treat them as "bytes-in-some-arbitrary-encoding-where-at-least-the-slash-character-(and-maybe-a-couple-others)-is-ascii-compatible".
If you assume that, then you could write a pathlib that would work. And in practice, I expect a lot of code designed only for POSIX works that way. But of course, this gets ugly if you go to a platform where filenames are not "bytes-in-some-arbitrary-encoding-where-at-least-the-slash-character-(and-maybe-a-couple-others)-is-ascii-compatible", like Windows.
I'm not sure if it's worth having a pathlib, etc. that uses this assumption -- but it could help us all write code that actually works with this screwy lack of specification.
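To make the assumption concrete, here is a minimal sketch of path handling that never decodes anything and relies only on the slash being the single ASCII byte 0x2F. The helper names (split_bytes_path, join_bytes_path) and the example filename are made up for illustration; this is not how pathlib works.

```python
# Hypothetical helpers: treat a path as opaque bytes, assuming only that
# the separator is the ASCII slash byte (0x2F).

def split_bytes_path(path):
    """Split a bytes path on b'/' and drop empty components."""
    return [part for part in path.split(b"/") if part]


def join_bytes_path(*parts):
    """Join bytes components with b'/' without inspecting their content."""
    return b"/".join(parts)


# A Latin-1 encoded name; 0xE9 followed by '.' is not valid UTF-8.
raw = b"/home/user/caf\xe9.txt"
parts = split_bytes_path(raw)
rejoined = join_bytes_path(*parts)
```

The last component comes through untouched even though we never learned its encoding. The same code would be wrong for something like UTF-16 names, where the byte 0x2F can turn up inside an unrelated character, which is the Windows problem described above.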
Antoine Pitrou wrote:
> To elaborate specifically about pathlib, it doesn't handle bytes paths
> but allows you to generate them if desired:
But that uses os.fsencode, whose documentation says: "Encode filename to the filesystem encoding".
As I understand it, the whole problem with some POSIX systems is that there is NO filesystem encoding -- i.e. you can't know for sure what encoding a filename is in. So you need to be able to pass the bytes through as they are.
(At least as I read Armin Ronacher's blog)
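For what it's worth, my understanding is that on POSIX os.fsencode and os.fsdecode use the "surrogateescape" error handler from PEP 383, which is meant to let undecodable bytes survive a round trip through str. A small sketch of that mechanism, using explicit codec calls instead of os.fsencode so it behaves the same on any platform; UTF-8 stands in for whatever the "filesystem encoding" is guessed to be, and the filename is made up:

```python
# A Latin-1 encoded name that is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte 0xE9 to the
# lone surrogate U+DCE9 instead of raising UnicodeDecodeError.
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into 0xE9.
round_tripped = as_text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogate, so the str is not
# "real" text as far as the rest of the program is concerned.
try:
    as_text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

So the bytes can be passed through as they are, but the intermediate str is only safe to hand back to the OS; printing it or writing it to a strictly encoded stream can still fail.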
-Chris
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov