[Python-Dev] Re: [Python-checkins] python/dist/src/Objects unicodeobject.c, 2.197, 2.198

M.-A. Lemburg mal at lemburg.com
Mon Sep 22 15:13:26 EDT 2003


Tim Peters wrote:

> [Tim]
>>> At the moment, it appears there's no identified reason to care about signedness of a greater-than 16-bit type,
>
> [M.-A. Lemburg]
>> Sure there is: first of all, having a single type that can be signed on some platforms and unsigned on others is a bad thing per se
>
> We inherit that from C, though -- it's fine by C if wchar_t is signed or unsigned, just as it refused to define the signedness of char.

It may be fine for C... it is not for the Unicode implementation, since that has always assumed Py_UNICODE to be unsigned. This is fixed now.

>> and second the 32-bit signed wchar_t value was what triggered this thread in the first place.
>
> What triggered the thread originally was a segfault due to the code making a branch based on the content of uninitialized memory. The code clearly didn't think it was reading up random heap bits, so that was a bug regardless of wchar_t's signedness.

True, but the test (unicode->str[0] < 256) is what revealed a second bug and that's what we've been discussing all along.
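To illustrate why a test of the form (unicode->str[0] < 256) behaves differently depending on signedness, here is a minimal sketch. The typedefs and function names are hypothetical stand-ins chosen for this example, not the actual definitions in unicodeobject.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not CPython's actual typedefs. */
typedef int32_t  signed_unit_t;    /* like a signed 32-bit wchar_t */
typedef uint32_t unsigned_unit_t;  /* like an unsigned Py_UNICODE */

/* The "< 256" test on a signed type: negative values also pass. */
static int below_256_signed(signed_unit_t ch)
{
    return ch < 256;
}

/* The same test on an unsigned type: only 0..255 pass. */
static int below_256_unsigned(unsigned_unit_t ch)
{
    return ch < 256;
}

/* Hypothetical table lookup that relies on the range check above. */
static int table_index(unsigned_unit_t ch)
{
    assert(ch < 256);   /* caller is expected to have range-checked ch */
    return (int)ch;
}
```

With a signed type, a value such as -5 passes the check and could then be used as a negative index. The same bit pattern read as unsigned is a large number that the check rejects.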

> That wchar_t happened to be a signed 32-bit type on Jeremy's box is what uncovered the read-uninitialized-memory bug.
>
> If there's no other code vulnerable to bad behavior if wchar_t is a signed 32-bit type (nobody has identified another case), objections to it being signed anyway seem technically groundless.

There are more comparisons of the above type in the code and even worse: it is documented that Py_UNICODE is unsigned, so it's very likely that code external to the Python distribution such as codec packages or applications talking to libraries use that assumption as well.

> Martin did give a technical reason (efficiency) for wanting to continue to use wchar_t on Jeremy's system.

Python won't be using wchar_t on those systems anymore, so the problem is solved and the original intent restored. If efficiency matters, programmers are always free to cast Py_UNICODE to wchar_t on these systems for fast read-only access.
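As a rough sketch of what such a read-only cast might look like: the typedef my_unicode_t and the helper unit_strlen are hypothetical names for this example, and the cast assumes that wchar_t has the same size and representation as the unsigned type, which is platform-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for a 32-bit unsigned Py_UNICODE. */
typedef uint32_t my_unicode_t;

/* Length of a zero-terminated buffer. Uses wcslen() through a read-only
   cast when wchar_t has the same size; otherwise counts manually. */
static size_t unit_strlen(const my_unicode_t *s)
{
    assert(s != NULL);
    if (sizeof(wchar_t) == sizeof(my_unicode_t)) {
        /* Assumption: same size and representation on this platform. */
        return wcslen((const wchar_t *)s);
    }
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

The buffer is only read through the cast pointer, never written, which matches the read-only access described above.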

-- Marc-Andre Lemburg eGenix.com



