[Python-Dev] segfault (double free?) when '''-string crosses line

Tim Peters tim.peters at gmail.com
Mon Apr 10 04:44:29 CEST 2006


[Guido]

On Linux, in HEAD 2.5, but only with the non-debug version, I get a segfault when I do this:

>>> '''
... '''

It seems to occur for any triple-quoted string crossing a line boundary. A bit of the stack trace:

#0 0x40030087 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#1 0x4207ad18 in free () from /lib/i686/libc.so.6
#2 0x08057990 in tok_nextc (tok=0x81c71d8) at ../Parser/tokenizer.c:809
#3 0x0805872d in tok_get (tok=0x81c71d8, p_start=0xbffff338, p_end=0xbffff33c) at ../Parser/tokenizer.c:1411
#4 0x08059042 in PyTokenizer_Get (tok=0x81c71d8, p_start=0xbffff338, p_end=0xbffff33c) at ../Parser/tokenizer.c:1514
#5 0x080568a7 in parsetok (tok=0x81c71d8, g=0x814a000, start=256, err_ret=0xbffff3a0, flags=0) at ../Parser/parsetok.c:135

Does this ring a bell? Is there already an SF bug open perhaps?

On OSX, I get an interesting error:

python2.5(12998) malloc: *** Deallocation of a pointer not malloced: 0x36b460; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug

It rings a bell here only in that the front end had lots of allocate-versus-free mismatches between the PyObject_ and PyMem_ raw-memory APIs, and this kind of failure smells a lot like that.

For example, the ../Parser/tokenizer.c:809 in the traceback is

            PyMem_FREE(new);

and one way to set new is from the earlier well-hidden

        else if (tok_stdin_decode(tok, &new) != 0)

where tok_stdin_decode() can do

    PyMem_FREE(*inp);
    *inp = converted;

where inp is its local name for new, and converted comes from

    converted = new_string(PyString_AS_STRING(utf8),
                           PyString_GET_SIZE(utf8));

and new_string() starts with

    char* result = (char *)PyObject_MALLOC(len + 1);

So that's a mismatch, although I don't know whether it's the one that's triggering.
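To make the shape of that mismatch concrete, here is a minimal standalone sketch. It is not CPython code: every demo_* name is hypothetical, and the "family tag" header is an assumption made purely so the toy can report a mismatch instead of corrupting the heap the way a real release build may.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for two allocator families; not CPython's API. */
enum demo_family { DEMO_FAMILY_MEM = 1, DEMO_FAMILY_OBJ = 2 };

/* Header stored in front of each block, recording which family made it.
   The extra union members only pad the header so the user pointer that
   follows it stays suitably aligned. */
typedef union {
    enum demo_family family;
    long long ll;
    double d;
    void *ptr;
} demo_header;

static void *demo_alloc(enum demo_family family, size_t size)
{
    demo_header *h = malloc(sizeof(demo_header) + size);
    if (h == NULL)
        return NULL;
    h->family = family;
    return h + 1;   /* user memory starts right after the header */
}

/* Returns 0 if the block was released, or -1 if it came from the other
   family (in which case the block is left untouched). */
static int demo_free(enum demo_family family, void *p)
{
    demo_header *h;
    if (p == NULL)
        return 0;
    h = (demo_header *)p - 1;
    if (h->family != family)
        return -1;
    free(h);
    return 0;
}

/* Toy analogue of new_string(): copies len bytes of s into a block
   taken from the "object" family and NUL-terminates it. */
static char *demo_new_string(const char *s, size_t len)
{
    char *result = demo_alloc(DEMO_FAMILY_OBJ, len + 1);
    if (result != NULL) {
        memcpy(result, s, len);
        result[len] = '\0';
    }
    return result;
}
```

In this toy, releasing the result of demo_new_string() with DEMO_FAMILY_MEM is refused, while DEMO_FAMILY_OBJ succeeds. The real allocators carry no such tag, so the wrong-family free just goes ahead.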

When I repaired all the mismatches that caused tests to crash on my box, I changed affected front-end string mucking to use PyObject_ uniformly (strings are usually small, the small-object allocator is usually faster than the platform malloc, and half (exactly half :-) of the crash-causing mismatched pairs were using PyObject_ anyway). Someone who understands their way through the sub-maze above is encouraged to do the same for it.

BTW, your "but only with the non-debug version" is more evidence: in a debug build, PyMem_ and PyObject_ calls are all redirected to Python's obmalloc, to take advantage of its debug-build padding gimmicks. It's only in a release build that PyMem_ resolves directly to the platform malloc/realloc/free.
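The build-dependent redirection can be sketched with a few macros. This is a toy under stated assumptions, not the real pymem.h/objimpl.h definitions: DEMO_DEBUG_BUILD and all demo_*/DEMO_* names are made up, and the "pool" here merely wraps malloc and counts calls so the routing is observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how many calls were routed through the toy "pool" allocator. */
static int demo_pool_calls = 0;

/* Hypothetical pool allocator; it only wraps malloc/free and counts. */
static void *demo_pool_malloc(size_t n) { demo_pool_calls++; return malloc(n); }
static void demo_pool_free(void *p) { demo_pool_calls++; free(p); }

/* The "object" family always goes through the pool. */
#define DEMO_OBJECT_MALLOC demo_pool_malloc
#define DEMO_OBJECT_FREE   demo_pool_free

#ifdef DEMO_DEBUG_BUILD
/* Debug-style build: the "mem" family is redirected to the pool too,
   so a mismatched pair still ends up in a single allocator. */
#define DEMO_MEM_MALLOC demo_pool_malloc
#define DEMO_MEM_FREE   demo_pool_free
#else
/* Release-style build: the "mem" family is the platform allocator. */
#define DEMO_MEM_MALLOC malloc
#define DEMO_MEM_FREE   free
#endif

/* Returns nonzero if the "mem" family is routed through the pool,
   i.e. if both families share one allocator in this build. */
static int demo_families_share_allocator(void)
{
    int before = demo_pool_calls;
    void *p = DEMO_MEM_MALLOC(1);
    DEMO_MEM_FREE(p);
    return demo_pool_calls != before;
}
```

Compiled without -DDEMO_DEBUG_BUILD, demo_families_share_allocator() returns 0; with it, nonzero. That is the sense in which a mismatched pair can pass unnoticed in a debug build and blow up only in a release build.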


