[Python-Dev] PEP 383 update: utf8b is now the error handler
M.-A. Lemburg mal at egenix.com
Thu May 7 11:21:28 CEST 2009
Antoine Pitrou wrote:
> Martin v. Löwis <martin v.loewis.de> writes:
>> py> b'\xed\xa0\x80'.decode("utf-8","surrogates")
>> '\ud800'
>
> The point is, "surrogates" does not mean anything intuitive for an /error handler/. You seem to be the only one who finds this name explicit enough, perhaps because you chose it. Most other handlers' names have verbs in them ("ignore", "replace", "xmlcharrefreplace", etc.).
Correct.
The purpose of an error handler name is to indicate to the user what it does, hence the use of verbs.
Walter started with "xmlcharrefreplace", i.e. a name without spaces or separators, so "surrogatereplace" would be the logically consistent name for the "replace with lone surrogates" scheme invented by Markus Kuhn.
The error handler for undoing this operation (i.e. when converting a Unicode string to some other encoding) should probably use the same name, both for symmetry and because the escaping scheme is meant to enable round-trip safety.
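A minimal sketch of the round trip described above. It assumes a Python 3 release in which the handler shipped under the name "surrogateescape" (the name that was eventually adopted, not one of the names proposed in this message); the sample bytes are made up for illustration.

```python
# Assumption: Python 3.1 or later, where the PEP 383 handler is
# registered as "surrogateescape". The byte string is a made-up example.
raw = b"caf\xe9.txt"  # Latin-1 encoded name; \xe9 is not valid UTF-8 here

# Decoding: each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same codec and the same handler restores the bytes.
restored = text.encode("utf-8", errors="surrogateescape")

# Without the handler, the lone surrogate cannot be encoded at all.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Using one handler name for both directions is what makes the decode and encode calls mirror each other in this sketch.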
BTW: It would also be appropriate to reference Markus Kuhn in the PEP as the inventor of the escaping scheme, if only to give the reader an idea of how that scheme works and why (the PEP on python.org currently doesn't explain this).
It should also explain that the scheme is meant to ensure round-trip safety and doesn't necessarily work when transcoding, i.e. reading using one encoding and writing using another.
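The transcoding caveat can be illustrated with a small sketch, again assuming the handler name "surrogateescape" from released Python 3 and a made-up input that mixes a valid UTF-8 character with a stray byte.

```python
# Assumption: Python 3.1 or later ("surrogateescape" handler available).
# Input: a valid UTF-8 "é" (0xC3 0xA9) followed by a stray Latin-1 "é" (0xE9).
raw = b"\xc3\xa9\xe9"

text = raw.decode("utf-8", errors="surrogateescape")

# Same codec in both directions: the original bytes come back.
same_codec = text.encode("utf-8", errors="surrogateescape")

# Different codec on output: the properly decoded "é" is re-encoded as
# Latin-1, while the escaped byte is written back unchanged, so two
# different inputs collapse into the same output byte.
transcoded = text.encode("latin-1", errors="surrogateescape")
```

In this case the transcoded output happens to look plausible, but the escaped bytes are passed through without being converted, so the result is only meaningful when the output encoding matches the one the bytes were originally in.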
Thanks,
Marc-Andre Lemburg eGenix.com