[Python-Dev] Re: Are we collecting benchmark results across machines

Kurt B. Kaiser kbk at shore.net
Fri Jan 2 19:31:07 EST 2004


Guido van Rossum <guido at python.org> writes:

> Hm... My IBM T40 with 1.4 GHz P(M) reports 15.608. I bet the caches
> are more similar, and affect performance more than CPU speed...

You have a 32K L1 and a 1024K (!) L2. What a great machine!

As an example of the other end of the spectrum, I'm running current but low-end hardware: a 2.2 GHz Celeron with a 400 MHz FSB and 256 MB DDR SDRAM, verified working as expected by the Intel Processor Speed Test.

My L1 is 12K trace + 8K data. My L2 is only 128K, and when there's a miss there's no L3 to fall back on. Cost for processor and mobo, new: $120.

I find this setup pretty snappy for what I do on it: development and home server. It's definitely not my game machine :-)

Python 2.2.1 (#1, Oct 17 2003, 16:36:36)
[GCC 2.95.3 20010125 (prerelease, propolice)] on openbsd3
[best of 3]:
Pystone(1.1) time for 10000 passes = 0.93
This machine benchmarks at 10752.7 pystones/second

Python 2.3.3 (#15, Jan 2 2004, 14:39:36)
[best of 3]:
Pystone(1.1) time for 50000 passes = 3.46
This machine benchmarks at 14450.9 pystones/second

Python 2.4a0 (#40, Jan 1 2004, 22:22:45) [current cvs]
[best of 3]:
Pystone(1.1) time for 50000 passes = 2.91
This machine benchmarks at 17182.1 pystones/second

(but see the p.s. below)

Now the parrotbench, version 1.04. [I make extra passes first to get the .pyo files.]

First, Python 2.3.3, best of 3: 31.1/31.8/32.3

Next, Python 2.4a0 (current cvs), best of 3: 31.8/31.9/32.1

Since I noticed quite different ratios between the individual tests compared to what Seo Sanghyeon posted on the pypy list, here are my numbers (2.4a0):

hydra /home/kbk/PYTHON/python/nondist/sandbox/parrotbench$ make times
for i in 0 1 2 3 4 5 6; do echo b$i.py; time /home/kbk/PYSRC/python b$i.py >@out$i; cmp @out$i out$i; done
b0.py
    5.48s real     5.30s user     0.05s system
b1.py
    1.36s real     1.22s user     0.10s system
b2.py
    0.44s real     0.42s user     0.04s system
b3.py
    2.01s real     1.94s user     0.04s system
b4.py
    1.69s real     1.63s user     0.05s system
b5.py
    4.80s real     4.73s user     0.02s system
b6.py
    1.84s real     1.56s user     0.26s system
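(The same per-script timing can be done from Python itself, if you'd rather not depend on the shell's time builtin. A minimal sketch -- the helper name and the throwaway stand-in script are my own inventions, not part of parrotbench; substitute b0.py..b6.py and your interpreter path as appropriate:)

```python
import os
import subprocess
import sys
import tempfile
import time

def time_script(path, n=1):
    """Run a script with the current interpreter n times; return best wall-clock time."""
    best = float("inf")
    for _ in range(n):
        t0 = time.perf_counter()
        subprocess.run([sys.executable, path], check=True,
                       stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - t0)
    return best

# A throwaway script standing in for b0.py..b6.py:
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("print(sum(range(100000)))\n")
try:
    print("%.2fs real (best of 3)" % time_script(f.name, n=3))
finally:
    os.remove(f.name)
```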

I notice that some of these tests are a little faster on 2.3.3 while others are faster on 2.4, so the overall times come out about the same on both releases.

N.B. compiling Python w/o the stack protector doesn't make a noticeable difference ;-)

There may be some other problem with this box that I haven't yet discovered, but right now I'm blaming the tiny cache for performance being 2-3x lower than the clock rate would suggest, compared to what others are getting.

-- KBK

p.s. I saw quite a large outlier on 2.4 pystone when I first tried it. I didn't believe it, but was able to scroll back and clip it:

Python 2.4a0 (#40, Jan 1 2004, 22:22:45)
[GCC 2.95.3 20010125 (prerelease, propolice)] on openbsd3
Type "help", "copyright", "credits" or "license" for more information.
>>> from test.pystone import main
>>> main(); main(); main()
Pystone(1.1) time for 50000 passes = 4.22
This machine benchmarks at 11848.3 pystones/second
Pystone(1.1) time for 50000 passes = 4.21
This machine benchmarks at 11876.5 pystones/second
Pystone(1.1) time for 50000 passes = 4.21
This machine benchmarks at 11876.5 pystones/second

This is 30% lower than the rate quoted above. I haven't been able to duplicate it. Maybe the OS or X was doing something which tied up the cache. This is a fairly lightly loaded machine running X, Ion, and emacs.

I've also seen 20% variations in the 2.2.1 pystone benchmark.

It seems to me that this benchmark is pretty cache-sensitive and should be run on an unloaded system, preferably w/o X, with the results averaged over many trials if comparisons are desired, especially if the cache is small.
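(That averaging is easy to do with the stdlib timeit and statistics modules. A minimal sketch with a stand-in workload -- any pystone-like function could be substituted; the function name and trial counts here are arbitrary:)

```python
import statistics
import timeit

def workload():
    # Stand-in for a pystone-like benchmark body.
    return sum(i * i for i in range(10000))

# Many short trials; report mean and spread so outliers
# (e.g. cache pollution from other processes) are visible
# instead of silently skewing a single run.
trials = timeit.repeat(workload, number=10, repeat=20)
mean = statistics.mean(trials)
stdev = statistics.stdev(trials)
print("mean %.4fs  stdev %.4fs  (%.0f%% variation)"
      % (mean, stdev, 100 * stdev / mean))
```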

I don't see the same variation in the parrotbench; it's just consistently slow for this box.


