BUG: skiplist memory leak in rolling functions · Issue #43339 · pandas-dev/pandas
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Running the test_rolling_non_monotonic test with the following addition triggers the leak (discovered in #43338):
diff --git a/pandas/tests/window/test_rolling.py b/pandas/tests/window/test_rolling.py
index f829ae4be0..35063ba555 100644
--- a/pandas/tests/window/test_rolling.py
+++ b/pandas/tests/window/test_rolling.py
@@ -1251,6 +1251,19 @@ def test_rolling_decreasing_indices(method):
-0.45439658241367054,
],
),
+ (
+ "median",
+ [
+ float("nan"),
+ 6.5,
+ float("nan"),
+ 20.5,
+ 4.0,
+ 6.5,
+ 9.0,
+ 12.5
+ ],
+ ),
],
)
def test_rolling_non_monotonic(method, expected):
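For readers who want to exercise the same code path outside the test suite, here is a minimal standalone sketch. It assumes a custom BaseIndexer along the lines of the custom window rolling example in the pandas user guide; the class name MixedWindowIndexer and the clipping of the window end are illustrative choices, not taken from the test. The expected medians are the ones added in the diff above. The script only drives the rolling median through non-monotonic window bounds; observing the leak itself still requires running it under a memory checker such as valgrind on an affected build.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import BaseIndexer


class MixedWindowIndexer(BaseIndexer):
    """Hypothetical indexer mixing expanding and forward-looking windows.

    The resulting start/end bounds are not monotonic, which is the case
    this issue is about.
    """

    def get_window_bounds(
        self, num_values=0, min_periods=None, center=None, closed=None, step=None
    ):
        start = np.empty(num_values, dtype=np.int64)
        end = np.empty(num_values, dtype=np.int64)
        for i in range(num_values):
            if self.use_expanding[i]:
                # Expanding window: everything up to and including row i.
                start[i], end[i] = 0, i + 1
            else:
                # Forward-looking window, clipped so it never runs past the data.
                start[i] = i
                end[i] = min(i + self.window_size, num_values)
        return start, end


use_expanding = [True, False, True, False, True, True, True, True]
df = pd.DataFrame({"values": np.arange(len(use_expanding)) ** 2})

indexer = MixedWindowIndexer(window_size=4, use_expanding=use_expanding)
result = df.rolling(indexer).median()

# Expected values copied from the parametrization added in the diff.
expected = [np.nan, 6.5, np.nan, 20.5, 4.0, 6.5, 9.0, 12.5]
np.testing.assert_allclose(result["values"].to_numpy(), expected)
```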
Problem description
roll_median_c and roll_quantile reassign their skiplist pointers without destroying the old skiplist when the indices are non-monotonic, so each reassignment leaks the previously allocated skiplist. skiplist_destroy should be called before skiplist_init. It might also be worth adding a reset() function to the skiplist to avoid unnecessary reallocations, but since these extra allocations already happen anyway, it is probably better to just add the destroy() call first.
==184879==
==184879== 0 bytes in 1 blocks are indirectly lost in loss record 1 of 1,913
==184879== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==184879== by 0x168B8745: node_init (skiplist.h:69)
==184879== by 0x168B8745: skiplist_init(int) (skiplist.h:129)
==184879== by 0x168D3F1B: __pyx_pf_6pandas_5_libs_6window_12aggregations_10roll_median_c (aggregations.cpp:9332)
==184879== by 0x168D3F1B: __pyx_pw_6pandas_5_libs_6window_12aggregations_11roll_median_c(_object*, _object*, _object*) (aggregations.cpp:9050)
==184879== by 0x526878: ??? (in /usr/bin/python3.9)
==184879== by 0x629E37: _PyObject_Call (in /usr/bin/python3.9)
==184879== by 0x59F7D5: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879== by 0x59734D: ??? (in /usr/bin/python3.9)
==184879== by 0x62C5F3: _PyFunction_Vectorcall (in /usr/bin/python3.9)
==184879== by 0x599266: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879== by 0x59734D: ??? (in /usr/bin/python3.9)
==184879== by 0x62C5F3: _PyFunction_Vectorcall (in /usr/bin/python3.9)
==184879== by 0x599266: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879==
==184879== 0 bytes in 1 blocks are indirectly lost in loss record 2 of 1,913
==184879== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==184879== by 0x168B8750: node_init (skiplist.h:70)
==184879== by 0x168B8750: skiplist_init(int) (skiplist.h:129)
==184879== by 0x168D3F1B: __pyx_pf_6pandas_5_libs_6window_12aggregations_10roll_median_c (aggregations.cpp:9332)
==184879== by 0x168D3F1B: __pyx_pw_6pandas_5_libs_6window_12aggregations_11roll_median_c(_object*, _object*, _object*) (aggregations.cpp:9050)
==184879== by 0x526878: ??? (in /usr/bin/python3.9)
==184879== by 0x629E37: _PyObject_Call (in /usr/bin/python3.9)
==184879== by 0x59F7D5: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879== by 0x59734D: ??? (in /usr/bin/python3.9)
==184879== by 0x62C5F3: _PyFunction_Vectorcall (in /usr/bin/python3.9)
==184879== by 0x599266: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879== by 0x59734D: ??? (in /usr/bin/python3.9)
==184879== by 0x62C5F3: _PyFunction_Vectorcall (in /usr/bin/python3.9)
==184879== by 0x599266: _PyEval_EvalFrameDefault (in /usr/bin/python3.9)
==184879==
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : 4caa51b1790d3b1c03835e919fc9f753fbd817b3
python : 3.9.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-7620-generic
Version : #21~1626191760~20.04~55de9c3~dev-Ubuntu SMP Tue Jul 20 18:02:09
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0.dev0+551.g4caa51b179.dirty
numpy : 1.21.1
pytz : 2019.3
dateutil : 2.7.3
pip : 21.2.4
setuptools : 45.2.0
Cython : 0.29.21
pytest : 6.2.5
hypothesis : 6.17.4
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.26.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : 1.4.22
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None