[Python-Dev] Bytes path support

Nick Coghlan ncoghlan at gmail.com
Thu Aug 21 01:26:51 CEST 2014


On 21 Aug 2014 09:06, "Chris Barker" <chris.barker at noaa.gov> wrote:

> As I understand it, the whole problem with some POSIX systems is that there is NO filesystem encoding -- i.e. you can't know for sure what encoding a filename is in. So you need to be able to pass the bytes through as they are. (At least as I read Armin Ronacher's blog)

Armin lets his astonishment at the idea we'd expect Linux vendors to fix their broken OS get the better of him at times - he thinks the responsibility lies entirely with us to work around its quirks and limitations :)

The "surrogateescape" error handler is our main answer to the unreliability of the POSIX encoding model - os.fsdecode will squirrel away undecodable bytes as lone surrogate code points (U+DC80 to U+DCFF), and then os.fsencode will restore them again later. That works for the simple round tripping case, but we currently lack good default tools for "cleaning" strings that may contain surrogates (or even scanning a string to see if surrogates are present).
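
To make the round trip concrete, here is a minimal sketch. It decodes with an explicit UTF-8 codec rather than calling os.fsdecode, so the result does not depend on the locale; the has_surrogates helper is a hypothetical name for illustration, not an existing stdlib function.

```python
# A filename in Latin-1 bytes; 0xE9 is not valid UTF-8 in this position.
raw = b"caf\xe9.txt"

# surrogateescape smuggles the undecodable byte 0xE9 through as U+DCE9.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same error handler restores the original bytes.
round_tripped = name.encode("utf-8", errors="surrogateescape")


def has_surrogates(s):
    """Hypothetical helper: report whether s contains any surrogate code point."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


print(ascii(name), round_tripped == raw, has_surrogates(name))
```

The scan is a plain linear pass over the string, which is the kind of check that currently has to be written by hand.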

One idea I had along those lines is a surrogatereplace error handler (http://bugs.python.org/issue22016) that emitted an ASCII question mark for each smuggled byte, rather than propagating the encoding problem.
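
A rough sketch of how such a handler could behave, written with codecs.register_error. The handler name "surrogatereplace_sketch" and its exact semantics are assumptions made for illustration; this is not the implementation proposed in the issue.

```python
import codecs


def surrogatereplace_sketch(exc):
    """Hypothetical encode-only handler: one '?' per smuggled byte."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Only handle the code points surrogateescape produces (U+DC80..U+DCFF);
    # any other unencodable character still raises as usual.
    if all(0xDC80 <= ord(ch) <= 0xDCFF for ch in bad):
        return ("?" * len(bad), exc.end)
    raise exc


# Registered under a made-up name so it cannot clash with a real handler.
codecs.register_error("surrogatereplace_sketch", surrogatereplace_sketch)

name = "caf\udce9.txt"  # as produced by a surrogateescape decode
cleaned = name.encode("utf-8", errors="surrogatereplace_sketch")
print(cleaned)
```

Unlike surrogateescape, this is lossy by design: the original byte cannot be recovered from the question mark, so it only suits output meant for display or logging.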

Cheers, Nick.

> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov

