[Python-Dev] Python-3.0, unicode, and os.environ
M.-A. Lemburg mal at egenix.com
Mon Dec 8 22:44:30 CET 2008
On 2008-12-08 22:32, Adam Olsen wrote:
On Mon, Dec 8, 2008 at 2:01 PM, M.-A. Lemburg <mal at egenix.com> wrote:
On 2008-12-08 21:45, Antoine Pitrou wrote:
M.-A. Lemburg <mal egenix.com> writes:
>>>> Such application specific error handlers could then also apply whatever fancy round-trip safe encoding of non-decodable bytes to Unicode escapes, private code points, etc. as seen fit by the application.
>>> I'd argue that such fancy round-trip safe error handler should be provided by Python. It's not reasonable to expect application coders to come up with their own codec variation based on subtle details of the unicode spec.
>> Fair enough. We could add some e.g.
>> * a round-trip safe escape error handler that uses a Unicode private code point area which we officially reserve for the Python interpreter
> This would of course alter the behaviour of those private code points, preventing them from round-tripping properly. I don't think round-tripping can be done from an error handler. You need a full codec to do it. A simple option is 8859-1. Or, ya know, bytes. This has long since gotten repetitive..
The error handler would just map the problem bytes to the private area. The application would then have to decide what to do with them, i.e. the error handler only provides one half of the round-tripping.
And that's on purpose: I don't believe we can come up with some magic solution for the encodings problem. This is essentially something that applications will have to solve on a case-by-case basis.
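A minimal sketch of the two halves described above, assuming a made-up handler name ("pua-escape") and an arbitrary base code point (U+E000); neither is an existing or officially reserved Python convention. The handler covers decoding only, and the reverse step is left to a hypothetical application-side helper:

```python
import codecs

# Assumption: byte 0xNN is mapped to U+E000 + 0xNN in the private use area.
PUA_BASE = 0xE000

def pua_escape(exc):
    """Decode-side handler: map each undecodable byte to a private code point."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end

# The handler name is hypothetical, chosen for this example only.
codecs.register_error("pua-escape", pua_escape)

def pua_unescape(text, encoding):
    """Application-side half: turn the private code points back into bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)   # restore the original raw byte
        else:
            out += ch.encode(encoding)  # ordinary character
    return bytes(out)

raw = b"Andr\xe9"                        # Latin-1 bytes, not valid UTF-8
text = raw.decode("utf-8", "pua-escape")
restored = pua_unescape(text, "utf-8")
```

Note that `pua_unescape` cannot tell an escaped byte from a genuine U+E0xx character that was already present in the input, which is the objection raised in the quoted reply.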
>> * a human readable escape error handler that encodes the problem bytes to say hex escapes, e.g. gives Andr\xe9 for a Latin-1 encoded directory name instead of failing
> Similar to 'ö'.encode('ascii', 'backslashreplace')? I'm +1 on making that work.
Yes.
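For illustration, this is how the behaviour looks with the built-in 'backslashreplace' handler. The encoding direction is the one quoted above; the decoding direction is only supported by later Python releases (3.5 and up), so the second call is an assumption about the interpreter version in use:

```python
raw = b"Andr\xe9"  # Latin-1 encoded name, not decodable as UTF-8

# Encoding direction, as in the quoted example.
encoded = "ö".encode("ascii", "backslashreplace")   # b'\\xf6'

# Decoding direction (Python 3.5+): the bad byte becomes a readable escape.
readable = raw.decode("utf-8", "backslashreplace")
print(readable)  # prints Andr\xe9
```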
>> * a warning error handler that replaces the problem cases with a question mark and issues a warning through the warning framework
> I dub thee errors='warnreplace'.
Yep, something along those lines.
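A rough sketch of such a handler, built on `codecs.register_error` and the `warnings` module. The name 'warnreplace' is taken from the suggestion above and is not a built-in handler; the choice of `UnicodeWarning` and of one question mark per bad byte are assumptions made for this example:

```python
import codecs
import warnings

def warnreplace(exc):
    """Replace undecodable bytes with '?' and issue a UnicodeWarning."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    warnings.warn(
        f"replaced undecodable bytes {bad!r} at position {exc.start}",
        UnicodeWarning,
        stacklevel=2,
    )
    # One '?' per problem byte; decoding resumes after the bad span.
    return "?" * len(bad), exc.end

codecs.register_error("warnreplace", warnreplace)

# Capture the warning so the example is self-checking.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = b"Andr\xe9".decode("utf-8", "warnreplace")
```

Because the warning goes through the regular framework, an application can silence it, log it, or turn it into an exception with the usual warning filters.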
Perhaps there are more and better alternatives. These suggestions are just to show how the idea could be put to some real-life use.
-- Marc-Andre Lemburg eGenix.com