BUG: hashtable memory error causes test_factorize_nan crash by mikelkelle · Pull Request #7157 · pandas-dev/pandas

The ObjectVector class resizes its array in to_array without resetting its capacity count, so subsequent appends write to memory that is no longer valid.
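The failure mode can be illustrated with a small pure-Python model. This is only a sketch, not the Cython code in pandas/hashtable: `ToyVector`, `INIT_CAP`, the `reset_capacity` flag, and `append_after_to_array` are hypothetical names, a Python list stands in for the array buffer, and an `IndexError` stands in for the invalid write that Valgrind reports. Resetting the capacity in `to_array` is shown as one plausible remedy, assuming the next append should then trigger a regrow.

```python
class ToyVector:
    """Pure-Python stand-in for a growable vector with a separate capacity count."""

    INIT_CAP = 4  # hypothetical initial capacity

    def __init__(self, reset_capacity):
        self.reset_capacity = reset_capacity
        self.m = self.INIT_CAP          # capacity the vector believes it has
        self.n = 0                      # number of stored items
        self.buf = [None] * self.m      # stands in for the array's buffer

    def append(self, obj):
        if self.n == self.m:
            # Grow only when the recorded capacity is exhausted.
            self.m = max(self.m * 2, self.INIT_CAP)
            self.buf.extend([None] * (self.m - len(self.buf)))
        self.buf[self.n] = obj          # out of bounds if m is stale
        self.n += 1

    def to_array(self):
        del self.buf[self.n:]           # shrink the buffer to the used length
        if self.reset_capacity:
            self.m = self.n             # keep capacity in sync with the buffer
        return self.buf


def append_after_to_array(reset_capacity):
    """Append, export, then append again; report what the second append does."""
    vec = ToyVector(reset_capacity)
    vec.append("a")
    vec.to_array()
    try:
        vec.append("b")
    except IndexError:
        return "invalid write"
    return vec.to_array()
```

With `reset_capacity=False` the second append indexes past the shrunken buffer because the recorded capacity is stale. With `reset_capacity=True` the vector sees that it is full, regrows, and the append succeeds.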

Environment: Mac OS 10.9, Python 2.7.6, numpy 1.9.0.dev-ee49411.

==57654== Invalid write of size 8
==57654==    at 0x137F856: __pyx_f_6pandas_9hashtable_12ObjectVector_append (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x139B16F: __pyx_pw_6pandas_9hashtable_17PyObjectHashTable_25get_labels (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CA9E: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)

==57654==  Address 0x10095efd0 is 16 bytes inside a block of size 256 free'd
==57654==    at 0x7858: realloc (in /usr/local/Cellar/valgrind/3.9.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==57654==    by 0x13C0F55: PyDataMem_RENEW (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1488DE7: PyArray_Resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x14647A0: array_resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1396A12: __pyx_pw_6pandas_9hashtable_12ObjectVector_5to_array (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CFEE: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)