msg96984 - (view) |
Author: A.M. Kuchling (akuchling) *  |
Date: 2009-12-29 02:11 |
Part of Unladen Swallow's roadmap is to use a threaded-interpreter technique for the regular expression engine. That sounded like an interesting idea, so I went ahead and tried to implement it. The current patch is attached. To try it: run configure --with-computed-gotos; apply the patch; and compile Python. Still to do: * Benchmarking, to determine if it's actually an improvement. * Possibly integrate construction of the Modules/re_opcodes.h file into the build process. The current file is supplied; to rebuild it, run "python Modules/makereopcodes.py > Modules/re_opcodes.h". (But it only needs rebuilding if you add new regex opcodes, which is rarely done -- maybe it's sufficient to include a reminder to update it plus the script). |
|
|
msg97038 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-12-30 16:36 |
On the principle nothing looks wrong. There are some tabs-vs-spaces issues in _sre.c. Can some out-of-bounds crash be triggered if an opcode is greater than 33? It needs some benchmarks to know whether it's efficient. Also, I think it would be nice to include regeneration of the opcode table in the build process. Since it's very light-weight it can be done unconditionally (I suppose it will be done from setup.py, right?). |
|
|
msg97097 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-12-31 14:54 |
_sre is listed in Modules/Setup, so it will be a built-in module by default. |
|
|
msg99471 - (view) |
Author: A.M. Kuchling (akuchling) *  |
Date: 2010-02-17 14:36 |
I finally got around to benchmarking this change, and unfortunately the results are not good. I used the regex tests in the Unladen Swallow test suite, regex_effbot and regex_v8. The tests are written for Python 2.x, but the fixes for 3.x are straightforward (use print() in one function; replace xrange with range in the bm_regex_effbot.py and bm_regex_v8.py files; remove a few uses of u''; I'll provide a patch later.) Hardware: MacBook, Intel Core 2 Duo, 1.83GHz, 2MB L2 cache, 667 MHz bus. Tests invoked with ./perf.py -b regex_effbot -r -v ../py3k/python.exe ../threaded-3000/python.exe regex_effbot is 1.1002 times slower with the computed-goto patch, and regex_v8 is 1.0081x slower. perf.py indicates that both results are significant. I'd like to see a few people replicate these results -- maybe the effect is very platform dependent or I ran the tests incorrectly -- but on current evidence, this patch is not worth pursuing. |
|
|
msg99473 - (view) |
Author: A.M. Kuchling (akuchling) *  |
Date: 2010-02-17 15:13 |
Actually, I really want someone to verify that measurement. As a control, I tried running the call_method benchmark (after a few more xrange fixes). The Python 3.x trunk version with my patch is measured as 1.0227x slower, even though the patch only touches the re module and call_method doesn't use the module at all. I recompiled both binaries; both builds are using the same compiler arguments; both have the same version from trunk. I'm mystified about why the patched version is slower. |
|
|
msg99478 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-02-17 18:50 |
You should disassemble the output (or produce assembler from gcc) and check that the various indirect jumps at the end of each case block don't get merged into a single shared indirect jump. Or perhaps it's simply that regular expression matching isn't really sensitive to bytecode dispatch overhead. |
|
|