[Python-Dev] PEP 383 update: utf8b is now the error handler

"Martin v. Löwis" martin at v.loewis.de
Thu May 7 07:43:30 CEST 2009

Michael Urman wrote:

On Wed, May 6, 2009 at 15:42, "Martin v. Löwis" <martin at v.loewis.de> wrote:

Despite there being also an error handler called "surrogates".

Not that I have to be, but I'm not sold on the previous UTF-8 codec behavior becoming an error handler of the name "surrogates" for two reasons (I do respect the obvious PBP argument for the implementation, and have no better name - "lenient"?).

PBP?

First, unless there's a way to stack error handlers, there's no way to access the old behavior combined with the "replace" handler.

Well, there is a way to stack error handlers, although it's not pretty:

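    import codecs

    # Fetch the two existing handler callables by name.
    _surrogates = codecs.lookup_error("surrogates")
    _replace = codecs.lookup_error("replace")

    def surrogates_then_replace(exc):
        # Try "surrogates" first; if it rejects the error, fall back to "replace".
        try:
            return _surrogates(exc)
        except UnicodeError:
            return _replace(exc)

    codecs.register_error("surrogates_then_replace", surrogates_then_replace)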
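For anyone who wants to try the stacking pattern on a current interpreter, here is a minimal sketch. It assumes that "surrogateescape" (a handler name that exists in today's CPython) can stand in for the handler discussed in this thread; the name "escape_then_replace" is made up for illustration. Note that the lookup function is spelled codecs.lookup_error.

```python
import codecs

# Assumption: "surrogateescape" stands in for the handler discussed in the thread.
_escape = codecs.lookup_error("surrogateescape")
_replace = codecs.lookup_error("replace")

def escape_then_replace(exc):
    """Try surrogateescape first; fall back to replace if it gives up."""
    try:
        return _escape(exc)
    except UnicodeError:
        # surrogateescape re-raises the original error for anything that
        # is not an escaped byte (U+DC80..U+DCFF) when encoding.
        return _replace(exc)

# Hypothetical handler name, registered only for this example.
codecs.register_error("escape_then_replace", escape_then_replace)

# U+DC80 is an escaped 0x80 byte; U+20AC cannot be encoded in ASCII at all.
text = "a\udc80b\u20acc"
encoded = text.encode("ascii", errors="escape_then_replace")
print(encoded)  # b'a\x80b?c'
```

The escaped byte comes back as 0x80, while the euro sign, which the first handler refuses, is turned into "?" by the fallback.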

The stacking argument also applies to the new utf8b behavior on encode (only, as it handles all errors on decode). This may be a YAGNI.

Indeed - in particular, as, in the primary application of this error handler (i.e. file IO operations), there is no way of specifying an additional error handler anyway.
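The decode/encode asymmetry mentioned above (every error is handled on decode, only some on encode) can be illustrated with a short sketch. It again assumes that the "surrogateescape" handler in current CPython behaves like the utf8b handler under discussion; the variable names are only for illustration.

```python
# Assumption: "surrogateescape" stands in for the utf8b handler from the thread.
raw = b"caf\xc3\xa9 \xff\xfe"  # valid UTF-8 followed by two invalid bytes

# Decoding never fails: each undecodable byte becomes a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding turns those surrogates back into the original bytes.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# Encoding can still fail for characters the handler does not cover,
# which is where a stacked fallback handler would come into play.
try:
    "\u20ac".encode("ascii", errors="surrogateescape")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

print(repr(text), round_tripped == raw, encode_failed)
```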

Regards, Martin
