[Python-Dev] [regex] memory leak
MRAB python at mrabarnett.plus.com
Sun Aug 2 17:54:22 CEST 2009
- Previous message: [Python-Dev] REVIEW: PyArg_ParseTuple with "s" format and NUL: Bogus TypeError detail string.
- Next message: [Python-Dev] standard library mimetypes module pathologically broken?
John Machin wrote:
> Hi Matthew,
>
> Your post in c.l.py about your re rewrite didn't mention where to
> report bugs etc so I dug this address out of Google Groups ...
>
> Environment: Python 2.6.2, Windows XP SP3, your latest (29 July)
> regex from the Python bugtracker.
>
> Problem is repeated calls of e.g. compiledpattern.search(sometext) --
> Task Manager performance panel shows increasing memory usage with
> regex but not with re. It appears to be cumulative i.e. changing to
> another pattern or text doesn't release memory.
>
> Example:
>
> 8<--- regextimer.py
> import sys
> import time
> if sys.platform == 'win32':
>     timer = time.clock
> else:
>     timer = time.time
> module = __import__(sys.argv[1])
> count = int(sys.argv[2])
> pattern = sys.argv[3]
> expected = sys.argv[4]
> text = 80 * '~' + 'qwerty'
> rx = module.compile(pattern)
> t0 = timer()
> for i in xrange(count):
>     assert rx.search(text).group(0) == expected
> t1 = timer()
> print "%d iterations in %.6f seconds" % (count, t1 - t0)
> 8<---
>
> Here are the results of running this (plus observed difference
> between peak memory usage and base memory usage):
>
> dos-prompt>\python26\python regextimer.py regex 1000000 "~" "~"
> 1000000 iterations in 3.811500 seconds [60 Mb]
>
> dos-prompt>\python26\python regextimer.py regex 2000000 "~" "~"
> 2000000 iterations in 7.581335 seconds [128 Mb]
>
> dos-prompt>\python26\python regextimer.py re 2000000 "~" "~"
> 2000000 iterations in 2.549738 seconds [3 Mb]
>
> This happens on a variety of patterns: "w", "wert", "[a-z]+",
> "[a-z]+t", ...

Thanks for that, John. I should've kept an eye on the Task Manager! :-)

Now fixed.
It's surprising how much time and effort is needed just to manage the memory!
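
For anyone who wants a number rather than a Task Manager reading, the check John describes can be approximated from inside the interpreter. What follows is only a minimal sketch and not part of the original exchange: it assumes Python 3 and the standard tracemalloc module, and memory_growth is a hypothetical helper name. One caveat: tracemalloc only sees memory requested through Python's own allocators, so a C extension that calls malloc directly would not show up here, and a process-level figure (as in Task Manager) would still be needed.

```python
import re
import tracemalloc


def memory_growth(rx, text, count):
    """Return the net bytes of traced allocations retained after
    `count` calls to rx.search(text).

    Hypothetical helper for illustration; a result that keeps growing
    with `count` suggests memory is not being released between calls.
    """
    rx.search(text)  # warm up any one-time caches before measuring
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _i in range(count):
        # Do not keep the match object, so it can be freed immediately.
        assert rx.search(text) is not None
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    # Same shape of input as regextimer.py: padding followed by 'qwerty'.
    text = 80 * "~" + "qwerty"
    rx = re.compile("[a-z]+")
    for count in (10000, 20000):
        print("%d iterations: %d bytes retained"
              % (count, memory_growth(rx, text, count)))
```

Passing a pattern compiled by another module with the same compile/search interface allows a side-by-side comparison, much as regextimer.py does with its first command-line argument.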