[Python-Dev] [Python-checkins] cpython: Issue #13165: stringbench is now available in the Tools/stringbench folder.

Terry Reedy tjreedy at udel.edu
Mon Apr 9 20:54:03 CEST 2012


Some comments...

On 4/9/2012 11:09 AM, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/704630a9c5d5
changeset: 76179:704630a9c5d5
user: Antoine Pitrou <solipsis at pitrou.net>
date: Mon Apr 09 17:03:32 2012 +0200
summary:
  Issue #13165: stringbench is now available in the Tools/stringbench folder.
...

diff --git a/Tools/stringbench/stringbench.py b/Tools/stringbench/stringbench.py
new file mode 100755
--- /dev/null
+++ b/Tools/stringbench/stringbench.py
@@ -0,0 +1,1483 @@
+

Did you mean to start with a blank line?

+# Various microbenchmarks comparing unicode and byte string performance
+# Please keep this file both 2.x and 3.x compatible!

Which versions of 2.x? In particular:

+dups = {}

+ dups[f.__name__] = 1

Is the use of a dict as a set a holdover that could be updated, or is it intentional for backward compatibility with 2.whatever and earlier?
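
For comparison, a minimal sketch of the same duplicate-name check written with a set. The names `seen_names` and `register` are hypothetical stand-ins, not taken from stringbench.py. The built-in set type has been available since Python 2.4, so this form only loses compatibility with 2.3 and earlier.

```python
# Hypothetical stand-in for a duplicate-name check; not the stringbench code.
seen_names = set()  # the built-in set() exists in Python 2.4+ and all of 3.x


def register(f):
    """Record f's name, rejecting a second function with the same name."""
    if f.__name__ in seen_names:
        raise AssertionError("Multiple functions with same name: %r"
                             % (f.__name__,))
    seen_names.add(f.__name__)
    return f
```

If 2.3 support matters, the dict form in the patch is the portable choice.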

+# Try with regex
+@usesre
+@bench('s="ABC"*33; re.compile(s+"D").search((s+"D")*300+s+"E")',
+       "late match, 100 characters", 100)
+def retestslowmatch100characters(STR):
+    m = STR("ABC"*33)
+    d = STR("D")
+    e = STR("E")
+    s1 = (m+d)*300 + m+e
+    s2 = m+e
+    pat = re.compile(s2)
+    search = pat.search
+    for x in RANGE100:
+        search(s1)

If regex is added to the stdlib as something other than a replacement for re, we might want an option to use it instead of, or in addition to, the current re.

+#### Benchmark join
+
+def getbytesyieldingseq(STR, arg):
+    if STR is BYTES and sys.version_info >= (3,):
+        raise UnsupportedType
+    return STR(arg)

+@bench('"A".join("")',
+       "join empty string, with 1 character sep", 100)

I am puzzled by this. Does str.join(iterable) internally branch on whether the iterable is a str or not, so that these timings might differ from equivalent timings with a list of strings?
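
One way to check this empirically is a rough timeit sketch that joins the same characters once from a str and once from a list. The sample text, the repeat counts, and the helper name `time_join` are arbitrary choices for illustration. It measures the difference, if any, without saying anything about how join is implemented.

```python
import timeit

text = "ABC" * 33
as_list = list(text)

# Both calls produce the same string; only the type of the iterable differs.
assert "A".join(text) == "A".join(as_list)


def time_join(stmt, setup):
    """Return the best of 3 runs of 10000 executions of stmt, in seconds."""
    return min(timeit.repeat(stmt, setup=setup, repeat=3, number=10000))


setup = 'text = "ABC" * 33; as_list = list(text)'
t_str = time_join('"A".join(text)', setup)
t_list = time_join('"A".join(as_list)', setup)

print("str iterable:  %.6f s" % t_str)
print("list iterable: %.6f s" % t_list)
```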

What might be interesting, especially for 3.3, is timing with non-ASCII BMP and non-BMP chars, both as joiner and joined.
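
A rough sketch of such a timing, assuming Python 2.x or 3.3+ (the u"" prefix is not accepted by 3.0 through 3.2). The sample characters, list length, repeat counts, and the helper name `time_join` are all arbitrary choices for illustration, not part of stringbench.

```python
import timeit

# One sample character per class: ASCII, non-ASCII BMP, and non-BMP.
CHARS = {
    "ascii": u"A",
    "bmp": u"\u20ac",          # EURO SIGN
    "non-bmp": u"\U0001d11e",  # MUSICAL SYMBOL G CLEF
}


def time_join(sep, item, count=100, number=2000):
    """Time sep.join() over a list holding `count` copies of `item`."""
    items = [item] * count
    timer = timeit.Timer(lambda: sep.join(items))
    # Best of 3 runs of `number` calls each, in seconds.
    return min(timer.repeat(repeat=3, number=number))


results = {}
for sep_kind, sep in sorted(CHARS.items()):
    for item_kind, item in sorted(CHARS.items()):
        elapsed = time_join(sep, item)
        results[(sep_kind, item_kind)] = elapsed
        print("sep=%-8s item=%-8s %.6f s" % (sep_kind, item_kind, elapsed))
```

Covering every separator/item pairing shows whether mixing character classes costs more than joining strings of a single class.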

tjr


