[Python-Dev] Revising RE docs

Guido van Rossum guido at python.org
Sun Sep 4 05:10:31 CEST 2005


On 9/2/05, Gareth McCaughan <gmccaughan at synaptics-uk.com> wrote:

On Thursday 2005-09-01 18:09, Guido van Rossum wrote:

> They are cached and there is no cost to using the functions instead
> of the methods unless you have so many regexps in your program that
> the cache is cleared (the limit is 100).

Sure there is; the cost of looking them up in the cache.

>>> import re,timeit
>>> timeit.re=re
>>> timeit.Timer("""re.search(r"(\d*).(\d)", "abc123def456")""").timeit(1000000)
7.6042091846466064
>>> timeit.r = re.compile(r"(\d*).(\d)")
>>> timeit.Timer("""r.search("abc123def456")""").timeit(1000000)
2.6358869075775146
>>> timeit.Timer().timeit(1000000)
0.091850996017456055

So in this (highly artificial toy) application it's about 7.5/2.5 = 3
times faster to use the methods instead of the functions.

Yeah, but the cost is a constant -- it is not related to the cost of compiling the re. (You should've shown how much it cost if you included the compilation in each search.)
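For concreteness, the comparison suggested here might look something like
the sketch below (not from the original thread; absolute timings will vary
by machine, and re.purge(), which clears re's internal pattern cache, is
used to force a full recompilation on every call):

    import re
    import timeit

    # Same trick as in the quoted session: make the names visible to the
    # statements that timeit executes.
    timeit.re = re
    timeit.r = re.compile(r"(\d*).(\d)")

    cached_func = timeit.Timer(r're.search(r"(\d*).(\d)", "abc123def456")')
    precompiled = timeit.Timer(r'r.search("abc123def456")')
    recompiled  = timeit.Timer(r're.purge(); re.search(r"(\d*).(\d)", "abc123def456")')

    print(cached_func.timeit(100000))   # module-level function, cache hit each time
    print(precompiled.timeit(100000))   # pre-compiled pattern object
    print(recompiled.timeit(100000))    # compilation included in every search

The third number is the one that shows why the cache matters: the gap
between the first two stays constant, while recompiling on every call
scales with the cost of compiling the pattern itself.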

I haven't looked into this, but I bet the overhead you're measuring is actually the extra Python function call, not the cache lookup itself. I also notice that _compile() is needlessly written as a varargs function -- all its uses pass it exactly two arguments.
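One way to check that guess (again just a sketch, not part of the thread):
wrap the pre-compiled pattern's search method in a trivial Python function
and time all three forms. If the wrapper lands close to the re.search()
figure, the gap is mostly the extra Python-level call rather than the
cache lookup.

    import re
    import timeit

    timeit.re = re
    timeit.r = re.compile(r"(\d*).(\d)")

    def wrapped_search(s, _search=timeit.r.search):
        # One extra Python-level call on top of the method, but no cache lookup.
        return _search(s)

    timeit.wrapped_search = wrapped_search

    print(timeit.Timer(r'r.search("abc123def456")').timeit(1000000))        # method directly
    print(timeit.Timer(r'wrapped_search("abc123def456")').timeit(1000000))  # method behind one extra call
    print(timeit.Timer(r're.search(r"(\d*).(\d)", "abc123def456")').timeit(1000000))  # function + cache lookup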

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


