[Python-Dev] PEP 393 review

"Martin v. Löwis" martin at v.loewis.de
Sun Aug 28 21:47:05 CEST 2011


I would say no more than a 15% slowdown on each of the following benchmarks:

- stringbench.py -u (http://svn.python.org/view/sandbox/trunk/stringbench/)
- iobench.py -t (in Tools/iobench/)
- the json_dump, json_load and regex_v8 tests from http://hg.python.org/benchmarks/

I now have benchmark results for these; the numbers are for revision c10bcab2aac7, compared against 1ea72da11724 (wide Unicode build), on 64-bit Linux with gcc 4.6.1, running on a 2.8 GHz Core i7.

- stringbench gives a 10% slowdown on total time; individual tests take between 78% and 220% of the baseline time. The cost is typically not in performing the string operations themselves, but in creating the result strings. In PEP 393, a buffer must be scanned for the highest code point, which means that each byte must be inspected twice (a second time when the copying occurs).
- the iobench results range from a 2% speedup (seek operations) to a 16% slowdown for small reads (4.31 MB/s vs. 5.22 MB/s) and a 37% slowdown for large reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the "runs of ASCII" optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of the slowdown.
- the json and regex_v8 tests see a slowdown of below 1%.
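
To make the "inspected twice" cost concrete, here is a minimal Python sketch of the two-pass construction. It is only an illustration, not the actual C implementation: the function names pick_kind and build_compact are hypothetical, and the real representation also distinguishes pure-ASCII strings, which this sketch ignores.

```python
def pick_kind(code_points):
    """First pass: scan for the highest code point and pick a width in bytes."""
    highest = max(code_points, default=0)
    if highest < 0x100:
        return 1    # fits in one byte per character
    if highest < 0x10000:
        return 2    # fits in two bytes per character
    return 4        # needs four bytes per character


def build_compact(code_points):
    """Second pass: copy every code point into a buffer of the chosen width.

    Hypothetical helper; the little-endian layout is an assumption made
    only so that the example is deterministic.
    """
    kind = pick_kind(code_points)
    buf = b"".join(cp.to_bytes(kind, "little") for cp in code_points)
    return kind, buf
```

Every input element is touched once in pick_kind and once more in build_compact, which is the double inspection mentioned above. A build with a fixed character width only needs the copy.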

The slowdown is larger when compared with a narrow Unicode build.

Additionally, it would be nice if you could run at least some of the test_bigmem tests, according to your system's available RAM.

Running only StrTest with 4.5G allows me to run 2 tests (test_encode_raw_unicode_escape and test_encode_utf7); this sees a slowdown of 37% in Linux user time.

Regards, Martin
