[Python-Dev] Re: Regression in unicodestr.encode()?

Tim Peters tim.one@comcast.net
Tue, 09 Apr 2002 20:45:02 -0400


> Hm, but isn't there a way to encode a NUL that doesn't produce a NUL? In some variant?

UTF-8 has a "no \u0000 in, no NUL out" property by design (it's what makes UTF-8 uniquely well-suited to processing by crufty old 8-bit C string library routines, and that was a goal of the encoding scheme).
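For anyone who wants to poke at that property directly, here is a minimal sketch in present-day Python 3 syntax (not the 2002 CVS build under discussion). The helper name utf8_has_nul and the sample strings are made up for illustration.

```python
def utf8_has_nul(text):
    """Return True if the UTF-8 encoding of text contains a 0x00 byte."""
    return b"\x00" in text.encode("utf-8")


# Hypothetical samples: ASCII, assorted non-ASCII, and one with U+0000 embedded.
samples = ["hello", "\u00e9\u4e2d\U0001f600", "\u0100", "a\u0000b"]

for s in samples:
    # A NUL byte shows up in the output only when U+0000 was in the input.
    assert utf8_has_nul(s) == ("\u0000" in s)

# Contrast: a 16-bit encoding yields 0x00 bytes for ordinary characters.
assert b"\x00" in "\u0100".encode("utf-16-le")
```

The last line is the reason 16-bit encodings don't get along with C string routines: U+0100 has no NUL in it as a character, but its UTF-16 bytes do.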

If people are really wondering whether Barry has discovered an actual bug, don't: take his example and decode it back to Unicode. You won't get what you started with in current CVS (or at least Barry didn't when I watched him do it). That's an easier proof than indirectly wondering about UTF-8 properties.
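The round-trip check is easy to write down. What follows is only a sketch in present-day Python 3 syntax; the round_trips helper and the sample strings are hypothetical and are not Barry's actual example, which isn't reproduced in this message.

```python
def round_trips(text, encoding="utf-8"):
    """Return True if encoding and then decoding gives back the same string."""
    return text.encode(encoding).decode(encoding) == text


# Hypothetical samples, including an embedded U+0000 and a non-BMP character.
for sample in ["plain ascii", "a\u0000b", "\u20ac", "\U00010000"]:
    # A correct codec must hand back exactly what it was given.
    assert round_trips(sample), repr(sample)
```

If an assertion like that ever fails for a well-formed string, the codec is broken, and no argument about encoding properties is needed.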