msg187527 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-04-21 21:34 |
b32encode accumulates encoded data in a bytes object and this operation has quadratic complexity. Here is a patch, which fixes this issue by accumulating in a list. |
|
|
msg187529 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-04-21 21:42 |
And here are other patch, which not only fixes an issue with quadratic complexity, but optimize b32encode and b32decode about 2.5 times. Microbenchmarks: ./python -m timeit -r 1 -n 10 -s "from base64 import b32encode as encode; data = open('python', 'rb').read(1000001)" "encode(data)" ./python -m timeit -r 1 -n 1 -s "from base64 import b32encode as encode, b32decode as decode; data = encode(open('python', 'rb').read(1000001))" "decode(data)" Results: First patch Second patch b32encode 1.25 sec 486 msec b32decode 2.08 sec 835 msec |
|
|
msg189538 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-05-18 18:45 |
Note that without patches both tests run many minutes. base32_fix.patch causes little (5%) performance degradation for small data (< KB), this is perhaps the cause for the use of string concatenation. However base32_optimize.patch speeds up about 3x for such data. |
|
|
msg189539 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-05-18 18:46 |
Note that without patches both tests run many minutes. base32_fix.patch causes little (5%) performance degradation for small data (< KB), this is perhaps the cause for the use of string concatenation. However base32_optimize.patch speeds up about 3x for such data. |
|
|
msg189544 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2013-05-18 19:53 |
Using a bytearray, rather than accumulating into a list, may reduce memory consumption. |
|
|
msg189551 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-05-18 21:12 |
Thank you Antoine. Here are updated patches. |
|
|
msg189554 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-05-18 22:14 |
I think 3.3 needs a fix. The quadratic complexity is a bug. |
|
|
msg189555 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2013-05-18 22:15 |
Then I would recommend committing the simple patch in 3.3 and the more optimized one in 3.4. Both look fine to me, althgugh I haven't really examined the padding stuff. |
|
|
msg189573 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-05-19 08:50 |
New changeset 4b5467d997f1 by Serhiy Storchaka in branch '3.3': Issue #17812: Fixed quadratic complexity of base64.b32encode(). http://hg.python.org/cpython/rev/4b5467d997f1 New changeset 1b5ef05d6ced by Serhiy Storchaka in branch 'default': Issue #17812: Fixed quadratic complexity of base64.b32encode(). http://hg.python.org/cpython/rev/1b5ef05d6ced |
|
|
msg189574 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-05-19 08:51 |
That is why I proposed two patches. |
|
|