[Python-Dev] bytes / unicode
Nick Coghlan ncoghlan at gmail.com
Fri Jun 25 00:01:38 CEST 2010
- Previous message: [Python-Dev] bytes / unicode
- Next message: [Python-Dev] bytes / unicode
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Fri, Jun 25, 2010 at 3:07 AM, P.J. Eby <pje at telecommunity.com> wrote:
> (Btw, in some earlier emails, Stephen, you implied that this could be fixed with codecs -- but it can't, because the problem isn't with the bytes containing invalid Unicode, it's with the Unicode containing invalid bytes -- i.e., characters that can't be encoded to the ultimate codec target.)
That's what the surrogateescape error handler is for though - it will happily accept mojibake on input (mapping each invalid byte to a lone surrogate code point in the range U+DC80 to U+DCFF), and happily generate mojibake on output (recreating the invalid bytes from those surrogates) as well.
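A minimal sketch of that round trip in Python 3, assuming UTF-8 as the codec; the sample bytes and variable names are made up for illustration. The handler represents each undecodable byte 0xNN as the lone surrogate U+DCNN on the way in and reverses that on the way out.

```python
# Bytes that are not valid UTF-8 (0xff and 0xfe never occur in UTF-8).
# The value is a made-up example, e.g. a filename in an unknown encoding.
raw = b"caf\xff\xfe.txt"

# Input side: undecodable bytes are smuggled into the str as lone surrogates.
text = raw.decode("utf-8", errors="surrogateescape")

# Output side: the same handler turns those surrogates back into the
# original bytes, so the data survives the trip unchanged.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the escaped characters, which shows they are
# not ordinary encodable text.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(ascii(text))
print(round_tripped == raw, strict_ok)
```

The round trip only works if the same codec and error handler are used on both sides; the handler restores escaped bytes and does nothing for ordinary characters that the target codec lacks.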
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia