[Python-Dev] Regression in unicodestr.encode()?

Tim Peters tim.one@comcast.net
Thu, 11 Apr 2002 02:06:19 -0400


[Tim]

... If you run Barry's test under a debug build, a call to pymalloc's realloc complains immediately upon entry that the passed-in address suffered overwrites ...

This deserves emphasizing because the debug pymalloc is new: we've had two(!) memory corruption problems since this has been available, and the debug malloc was a real help both times.

This particular case was a best case: the debug realloc detected the corruption almost immediately after the overwrite occurred, and called Py_FatalError() after printing some helpful clues. But note that the "serial number" it printed was insane:

    the block was made by call #1852047475 to debug malloc/realloc

That's because the overwrite was so bad it corrupted bytes beyond the end of the 4 trailing "forbidden bytes", and that's where the serial number is stored by the debug pymalloc. We'll all be much happier if you stick to modest off-by-1 fatal errors in the future <wink -- and note that it can catch off-by-1 on the nose: if you ask for 37 bytes, it can catch you writing into p[37] (alignment isn't an issue for this gimmick)>.
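To make the forbidden-byte idea concrete, here is a minimal toy sketch in C. It is not pymalloc's actual code: the names (toy_debug_malloc, toy_check, and so on), the exact block layout, and the 0xFB pad value are all assumptions chosen for illustration. It only shows why an off-by-1 write into p[37] is detectable, and why a larger overrun also wrecks the serial number stored after the trailing pad.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_FORBIDDEN 0xFB  /* pad byte value; an illustrative assumption */
#define TOY_PAD 4           /* forbidden bytes on each side of the user block */

static unsigned long toy_serial = 0;  /* bumped on every toy allocation */

/* Assumed layout: [size_t n][4 pad][n user bytes][4 pad][unsigned long serial] */
static void *toy_debug_malloc(size_t n)
{
    unsigned char *raw = malloc(sizeof n + 2 * TOY_PAD + n + sizeof toy_serial);
    unsigned char *user;
    if (raw == NULL)
        return NULL;
    ++toy_serial;
    memcpy(raw, &n, sizeof n);                       /* remember the request size */
    memset(raw + sizeof n, TOY_FORBIDDEN, TOY_PAD);  /* leading pad */
    user = raw + sizeof n + TOY_PAD;
    memset(user + n, TOY_FORBIDDEN, TOY_PAD);        /* trailing pad, no alignment gap */
    memcpy(user + n + TOY_PAD, &toy_serial, sizeof toy_serial);
    return user;
}

static size_t toy_size_of(const void *p)
{
    size_t n;
    memcpy(&n, (const unsigned char *)p - TOY_PAD - sizeof n, sizeof n);
    return n;
}

/* Returns 0 if both pads are intact, 1 if the leading pad was hit, 2 if the trailing one was. */
static int toy_check(const void *p)
{
    const unsigned char *user = p;
    size_t n = toy_size_of(p);
    int i;
    for (i = 1; i <= TOY_PAD; ++i)
        if (user[-i] != TOY_FORBIDDEN)
            return 1;
    for (i = 0; i < TOY_PAD; ++i)
        if (user[n + i] != TOY_FORBIDDEN)
            return 2;
    return 0;
}

/* Reads back the serial number stored just past the trailing pad. */
static unsigned long toy_serial_of(const void *p)
{
    unsigned long s;
    memcpy(&s, (const unsigned char *)p + toy_size_of(p) + TOY_PAD, sizeof s);
    return s;
}

static void toy_free(void *p)
{
    free((unsigned char *)p - TOY_PAD - sizeof(size_t));
}

/* Off-by-1: ask for 37 bytes, write p[37]. Returns 1 if the check catches it. */
static int toy_demo_off_by_one(void)
{
    char *p = toy_debug_malloc(37);
    int before, after;
    assert(p != NULL);
    before = toy_check(p);  /* pads untouched so far */
    p[37] = 'x';            /* one byte past the end of the request */
    after = toy_check(p);
    toy_free(p);
    return before == 0 && after == 2;
}

/* Bigger overrun: runs past the pad into the serial. Returns 1 if the serial is garbage. */
static int toy_demo_big_overrun(void)
{
    char *p = toy_debug_malloc(37);
    unsigned long expected = toy_serial, seen;
    assert(p != NULL);
    memset(p + 37, 'A', TOY_PAD + sizeof(unsigned long));
    seen = toy_serial_of(p);
    toy_free(p);
    return seen != expected;
}
```

Both demo functions return 1 when the corruption shows up as described: the first trips the trailing-pad check, the second leaves a serial number made of whatever bytes the overrun wrote.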

The other case was the gc-versus-trashcan disaster. The debug pymalloc didn't catch the corruption directly, but, when things blew up, it was dead obvious in the debugger that gc was crawling over an already-free()ed object (the object fields were entirely filled with pymalloc's "dead byte" value, 0xdb, which the debug pymalloc free() sprays into the released memory block).
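The dead-byte trick can be sketched the same way. This is a toy illustration, not the debug pymalloc's code: struct toy_obj, toy_debug_release, and toy_looks_freed are hypothetical names, and the release step deliberately does not hand the memory back before it is inspected, so that the demo stays well-defined C. Only the 0xdb fill value comes from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DEADBYTE 0xDB  /* the "dead byte" value mentioned above */

/* Hypothetical stand-in for an object header; not a real CPython struct. */
struct toy_obj {
    unsigned int refcnt;
    unsigned int kind;
};

/* Toy "free": sprays dead bytes over the block. The caller returns the
   memory to the system later, so inspecting it here remains legal. */
static void toy_debug_release(void *p, size_t n)
{
    memset(p, TOY_DEADBYTE, n);
}

/* Returns 1 if every byte of the block carries the dead-byte pattern. */
static int toy_looks_freed(const void *p, size_t n)
{
    const unsigned char *b = p;
    size_t i;
    for (i = 0; i < n; ++i)
        if (b[i] != TOY_DEADBYTE)
            return 0;
    return 1;
}

/* Returns 1 if a live object looks live and a released one looks dead. */
static int toy_demo_dead_bytes(void)
{
    struct toy_obj *o = malloc(sizeof *o);
    int live, dead;
    assert(o != NULL);
    o->refcnt = 1;
    o->kind = 7;
    live = toy_looks_freed(o, sizeof *o);  /* 0: fields hold real values */
    toy_debug_release(o, sizeof *o);
    dead = toy_looks_freed(o, sizeof *o);  /* 1: every field now reads as 0xdb bytes */
    free(o);
    return live == 0 && dead == 1;
}
```

In a debugger the same pattern is what makes a stale pointer obvious: every field of the object displays as repeated 0xdb bytes rather than plausible-looking leftovers.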

So this is a powerful low-tech tool. If you want to become a wizard at it fast, deliberately provoke some object memory management errors in your source tree, and just play with what happens then in a debug build.