msg189641 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2013-05-20 08:21 |
I noticed the convenient ``html.escape`` in Python 3.2 and ``cgi.escape`` is marked as deprecated. However, the former is an order of magnitude slower than the latter.

$ python3 --version
Python 3.3.2

With html.escape:

$ python3 -m timeit -s "from html import escape as html; from cgi import escape; s = repr(copyright)" "h = html(s)"
10000 loops, best of 3: 48.7 usec per loop
$ python3 -m timeit -s "from html import escape as html; from cgi import escape; s = repr(copyright) * 19" "h = html(s)"
1000 loops, best of 3: 898 usec per loop

With cgi.escape:

$ python3 -m timeit -s "from html import escape as html; from cgi import escape; s = repr(copyright)" "h = escape(s)"
100000 loops, best of 3: 7.42 usec per loop
$ python3 -m timeit -s "from html import escape as html; from cgi import escape; s = repr(copyright) * 19" "h = escape(s)"
10000 loops, best of 3: 21.5 usec per loop

Since this kind of function is called frequently in template engines, it makes a difference. Of course C replacements are available on PyPI: MarkupSafe or Webext.

But it would be nice to restore the performance of cgi.escape with a pragmatic `.replace(` approach. |
|
|
msg189643 - (view) |
Author: Graham Dumpleton (grahamd) |
Date: 2013-05-20 08:53 |
Importing the cgi module the first time even in Python 2.X was always very expensive. I would suggest you redo the test using timing done inside of the script after modules have been imported so as to properly separate module import time in both cases from execution time of the specific function. |
|
|
msg189644 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2013-05-20 09:06 |
> I would suggest you redo the test using timing done inside of the script after modules have been imported.

The -s switch takes care of this. |
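For readers unfamiliar with the switch, the same separation of setup from measurement can be sketched with the ``timeit`` module's Python API. This is only an illustration: the sample string and the loop count are assumptions, not the values used in the report above.

```python
import timeit

# The setup statement runs once before the timed loop and is not part of
# the measurement, so the cost of importing html is excluded.
# The sample string below is an arbitrary assumption for illustration.
setup = "from html import escape; s = '<a href=\"x\">&</a>' * 100"

number = 1000  # hypothetical loop count
elapsed = timeit.timeit("escape(s)", setup=setup, number=number)

per_call_usec = elapsed / number * 1e6
print(f"{per_call_usec:.2f} usec per call")
```

On the command line, ``-s`` plays the role of the ``setup`` argument here.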
|
|
msg189647 - (view) |
Author: Graham Dumpleton (grahamd) |
Date: 2013-05-20 10:14 |
Whoops. Missed the quoting. |
|
|
msg189711 - (view) |
Author: Matt Bryant (Teh Matt) * |
Date: 2013-05-20 23:05 |
I did a few more tests and am seeing the same speed differences Florent noticed. It seems reasonable to use .replace() instead, as it does the same thing significantly faster. I've attached a patch doing just this. |
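A rough sketch of the two approaches being compared, for illustration only. It is not the attached patch: the function and table names are hypothetical, and the table-driven variant assumes the slower implementation went through ``str.translate``.

```python
# Hypothetical translation tables, keyed by code point as str.translate expects.
_TABLE = {ord("&"): "&amp;", ord("<"): "&lt;", ord(">"): "&gt;"}
_TABLE_FULL = dict(_TABLE)
_TABLE_FULL.update({ord('"'): "&quot;", ord("'"): "&#x27;"})


def escape_translate(s, quote=True):
    """Table-driven variant (assumed shape of the slower approach)."""
    return s.translate(_TABLE_FULL if quote else _TABLE)


def escape_replace(s, quote=True):
    """Chained .replace() variant, as suggested in this thread."""
    # "&" must be replaced first, otherwise the ampersands introduced by
    # the later replacements would be escaped a second time.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
        s = s.replace("'", "&#x27;")
    return s


sample = '<a href="x">&</a>'
print(escape_replace(sample))
print(escape_translate(sample) == escape_replace(sample))
```

Both variants should produce identical output; any speed difference is best measured with ``timeit`` on the interpreter in question rather than assumed.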
|
|
msg190267 - (view) |
Author: A.M. Kuchling (akuchling) *  |
Date: 2013-05-29 02:30 |
Matt's patch looks good to me. It removes two module-level dicts, but they're marked as internal, so that's OK. There's already a test case that exercises html.escape(), so I don't think any additional tests are needed. |
|
|
msg192527 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 09:11 |
New changeset db5f2b74e369 by Ezio Melotti in branch 'default':
#18020: improve html.escape speed by an order of magnitude. Patch by Matt Bryant.
http://hg.python.org/cpython/rev/db5f2b74e369 |
|
|
msg192528 - (view) |
Author: Ezio Melotti (ezio.melotti) *  |
Date: 2013-07-07 09:12 |
Fixed, thanks for the report and the patch! |
|
|