[Python-Dev] Unicode exception indexing

Antoine Pitrou solipsis at pitrou.net
Fri Nov 4 03:18:32 CET 2011


On Thu, 03 Nov 2011 22:47:00 +0100 "Martin v. Löwis" <martin at v.loewis.de> wrote:

>> On the one hand, these indices are used in formatting error messages such as
>> "codec can't encode character \u%04x in position %d", suggesting they
>> are regular indices into the string (counting code points).
>>
>> On the other hand, they are used by error handlers to look up the character,
>> and existing error handlers (including the ones we have now) use
>> PyUnicode_AsUnicode to find the character. This suggests that the indices
>> should be Py_UNICODE indices, for compatibility (and they currently do
>> work in this way).
>
> But what about error handlers written in Python?
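For reference, here is a minimal sketch of what an error handler written in Python looks like and how it uses the indices. The handler name hex_replace and its output format are made up for illustration; only codecs.register_error and the start, end and object attributes of UnicodeEncodeError are standard. It assumes a Python where str is indexed by code points (3.3 or later).

```python
import codecs


def hex_replace(exc):
    """Hypothetical handler: replace unencodable characters with <U+XXXX>."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.start and exc.end are indices into exc.object, the str being encoded.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Return the replacement text and the index at which encoding resumes.
    return replacement, exc.end


# The registered name is an arbitrary choice for this example.
codecs.register_error("hex_replace", hex_replace)

encoded = "caf\u00e9 \U0001F600".encode("ascii", errors="hex_replace")
```

The handler slices exc.object with exc.start and exc.end directly, so it only finds the right character if those attributes count in the same units as str indexing.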

I'm working on a patch where a C error handler using PyUnicodeEncodeError_GetStart gets a different value than a Python error handler accessing .start. The GetStart/GetEnd functions would take the value from the exception object and adjust it before returning it.
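To make the difference between the two values concrete, here is a rough Python sketch of the kind of adjustment described. It is not the patch itself: the helper codepoint_to_utf16_index is hypothetical, and it assumes the C side would want a count of UTF-16 code units (the Py_UNICODE layout on a narrow build), where each character above U+FFFF takes two units.

```python
def codepoint_to_utf16_index(s, index):
    """Hypothetical helper: UTF-16 code units occupied by s[:index]."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in s[:index])


# Build the exception by hand so the example does not depend on any codec.
text = "\U0001F600\u00e9"
exc = UnicodeEncodeError("ascii", text, 1, 2, "ordinal not in range(128)")

python_start = exc.start  # what a Python handler sees: a code point index
c_start = codepoint_to_utf16_index(exc.object, exc.start)  # adjusted value
```

Here python_start is 1 and c_start is 2, because the non-BMP character before the failing one occupies a surrogate pair in UTF-16.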

Is it worth the hassle? We can just port our existing error handlers, and I guess the few third-party error handlers written in C (if any) can bear the transition.

Regards

Antoine.
