msg51182 - (view) |
Author: Lars Skovlund (lskovlund) |
Date: 2006-10-02 13:30 |
Use iterative doubling when extending the old array. This results in O(log n) calls to memcpy() instead of O(n). |
|
|
msg51183 - (view) |
Author: Josiah Carlson (josiahcarlson) *  |
Date: 2006-10-07 16:39 |
Logged In: YES user_id=341410 Have you benchmarked this for repeats found "in the wild" to establish *observable* speedup for code that already exists? |
|
|
msg51184 - (view) |
Author: Lars Skovlund (lskovlund) |
Date: 2006-10-07 22:12 |
Logged In: YES user_id=263372 I wrote this code for a university project I'm doing myself, which involves initializing a *very* large array (it's a remote memory system). It does help there, certainly; for medium-sized arrays, the improvement would be negligable, and for small ones it might even be worse. If you mean, have I benchmarked it with other people's code, no. I just thought I'd offer it to the community, since it has certainly helped me. |
|
|
msg51185 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2006-10-08 00:59 |
Logged In: YES user_id=80475 This patch is basically fine. Will neaten it up to match our source coding conventions and apply it when I get a chance. Py2.6 won't be out for a good while, so there is no hurry. |
|
|
msg51186 - (view) |
Author: Larry Hastings (larry) *  |
Date: 2006-10-13 08:17 |
Logged In: YES user_id=364875 I'd assumed Python *didn't* double the size when resizing an array because of how much memory that wastes. May I suggest trying it with a multiplier of 1.5x, and comparing both CPU time and memory consumption? I find for these things that 1.5x is nearly as fast and wastes far less memory. |
|
|
msg51187 - (view) |
Author: Josiah Carlson (josiahcarlson) *  |
Date: 2006-10-13 15:26 |
Logged In: YES user_id=341410 This patch has nothing to do with array resizing. It has to do with array initialization. |
|
|
msg51188 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2007-04-03 02:50 |
This proposal is basically fine. The patch should be re-worked to model as closely as possible the code for string_repeat in Objects/stringobject.c (revisions 30616 and 30368 provide the details). That code is known to work and to have provided a meaningful speed-up. |
|
|
msg64405 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2008-03-24 09:02 |
This one is easy if someone wants to take a crack at it. |
|
|
msg64429 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2008-03-24 18:42 |
Looking at stringobject.c:1034: i = 0; if (i < size) { Py_MEMCPY(op->ob_sval, a->ob_sval, Py_SIZE(a)); i = Py_SIZE(a); } while (i < size) { j = (i <= size-i) ? i : size-i; Py_MEMCPY(op->ob_sval+i, op->ob_sval, j); i += j; } return (PyObject *) op; Do I miss something or the first condition "if (i < size)" is a fancy way to check for size > 0? Wouldn't it be clearer to write if (size == 0) return (PyObject *) op; Py_MEMCPY(op->ob_sval, a->ob_sval, Py_SIZE(a)); i = Py_SIZE(a); .. |
|
|
msg64433 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2008-03-24 19:33 |
Attached patch (.diff) reimplements the optimization by following Objects/stringobject.c logic as Raymond suggested. I see two remaining issues which in my view should be addressed separately: 1. There is no check for overflow after the multiplication computing the size of the resulting array. Note that while newarrayobject checks that size*itemsize does not overflow internally, there is no such check for Py_SIZE(a) * n. A decision needs to be made whether Overflow or NoMemory error should be raised because currently string_repeat and newarrayobject treat overflow differently. 2. See above. I think both string and (patched) array codes can be simplified. |
|
|
msg64435 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2008-03-24 19:35 |
I forgot to mention that I added a unit test to cover the special case of repeating a single-byte array. |
|
|
msg68096 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2008-06-12 21:42 |
Georg, do you want to take it from here. |
|
|
msg110500 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2010-07-16 21:20 |
There are lots of positive comments on this issue, please can core developers take a bit of time to have a look and say yes or no. |
|
|
msg110501 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2010-07-16 21:21 |
Uh, this slipped under my radar. Let's see if I get the chance to look at it at the core sprint next week. |
|
|
msg123334 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2010-12-04 11:02 |
I changed the patch to look more like unicode_repeat (which addresses Alex' point #2) and committed in r87022. |
|
|