msg307609
Author: Brian Forst (bforst)
Date: 2017-12-05 00:40
We're moving some code from Python 2.7 to 3.6 and found a weird performance issue using SQLite in-memory and on-disk DBs with the built-in sqlite3 library. In Python 2.7, the two update statements below (excerpted from the attached file) run in the same amount of time. In Python 3.6, the update statement with the table name on a separate line runs 400x-600x slower with the example data provided in the file.

"""
UPDATE tbl
SET col2 = NULL
WHERE col1 = ?
"""

"""
UPDATE
    tbl
SET col2 = NULL
WHERE col1 = ?
"""

We have verified this using Python installs from python.org on macOS Sierra and Windows 7 for Python 2.7 and 3.6. We have tried formatting the SQL strings in different ways, and it appears that the speed change only occurs when the table name is on a different line than the "UPDATE". This also appears to be hitting some type of quadratic behaviour, as with 10x fewer records it only takes 10-15x as long. With the demo in the file we are seeing it take 1.6s on the fast string and ~1000s on the slow string.
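The attached file is not reproduced in this thread, so the following is only a minimal sketch of the comparison described above. The schema, the row count, and the helper name `time_updates` are assumptions chosen for illustration; only the table and column names come from the excerpted statements.

```python
import sqlite3
import time

# Statement with the table name on the same line as UPDATE.
SAME_LINE_SQL = """
UPDATE tbl
SET col2 = NULL
WHERE col1 = ?
"""

# Statement with the table name on its own line.
SEPARATE_LINE_SQL = """
UPDATE
    tbl
SET col2 = NULL
WHERE col1 = ?
"""


def time_updates(sql, n_rows=2000):
    """Run one UPDATE per row and return (elapsed_seconds, rows_left_non_null).

    The schema below is a guess; the real attachment may differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (col1 INTEGER PRIMARY KEY, col2 TEXT)")
    conn.executemany(
        "INSERT INTO tbl VALUES (?, ?)", ((i, "x") for i in range(n_rows))
    )
    conn.commit()

    start = time.perf_counter()
    for i in range(n_rows):
        conn.execute(sql, (i,))
    conn.commit()
    elapsed = time.perf_counter() - start

    remaining = conn.execute(
        "SELECT COUNT(*) FROM tbl WHERE col2 IS NOT NULL"
    ).fetchone()[0]
    conn.close()
    return elapsed, remaining


if __name__ == "__main__":
    for label, sql in (("same line", SAME_LINE_SQL),
                       ("separate line", SEPARATE_LINE_SQL)):
        elapsed, remaining = time_updates(sql)
        print(f"{label}: {elapsed:.3f}s, rows still non-NULL: {remaining}")
```

Both statements update the same rows, so any gap between the two timings comes from how the interpreter handles the statement text rather than from the SQL itself. The absolute numbers will depend on the Python and SQLite versions in use.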

msg307611
Author: R. David Murray (r.david.murray)
Date: 2017-12-05 01:11
I can confirm that there is a difference on Linux as well, using the sqlite version for both 2.7 and 3.7:

rdmurray@pydev:~/python/p27[2.7]>./python sqlite3_27_36_performance_bug.py
First step: 3.22849011421
Second step: 3.2167429924
rdmurray@pydev:~/python/p37[master]>./python ../p27/sqlite3_27_36_performance_bug.py
First step: 3.2722721099853516
Second step: 4.094221353530884

(I changed time.clock() to time.time()).

msg307612
Author: R. David Murray (r.david.murray)
Date: 2017-12-05 01:12
...using the *same* sqlite version...

msg307683
Author: Antoine Pitrou (pitrou)
Date: 2017-12-05 20:57
Brian, does the speed difference disappear when you add a space character just after "UPDATE"? We may be hitting this path: https://github.com/python/cpython/blob/master/Modules/_sqlite/statement.c#L76-L93
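The linked C code is not quoted in this thread. As an illustrative model only, assume the statement type is detected by skipping leading whitespace and comparing the start of the SQL against keywords that include a trailing space. Under that assumption, a newline right after "UPDATE" would not match. The function names below are hypothetical and do not exist in sqlite3.

```python
# Whitespace characters skipped before the keyword comparison (assumed).
_LEADING_WS = " \r\n\t"


def looks_like_dml_trailing_space(sql):
    """Model of a check that requires a literal space after the keyword."""
    head = sql.lstrip(_LEADING_WS).lower()
    return head.startswith(("insert ", "update ", "delete ", "replace "))


def looks_like_dml_keyword_only(sql):
    """Looser model that matches the keyword alone, whatever follows it."""
    head = sql.lstrip(_LEADING_WS).lower()
    return head.startswith(("insert", "update", "delete", "replace"))


same_line = "\nUPDATE tbl\nSET col2 = NULL\nWHERE col1 = ?\n"
separate_line = "\nUPDATE\n    tbl\nSET col2 = NULL\nWHERE col1 = ?\n"

print(looks_like_dml_trailing_space(same_line))      # True
print(looks_like_dml_trailing_space(separate_line))  # False
print(looks_like_dml_keyword_only(separate_line))    # True
```

If the real check behaves like the first function, the two statements from the report would be classified differently even though SQLite treats them identically, which would be consistent with the question about adding a space.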

msg307689
Author: R. David Murray (r.david.murray)
Date: 2017-12-05 21:29
It disappears for me running it on Linux with the blank added.

msg307696
Author: Brian Forst (bforst)
Date: 2017-12-05 22:35
Hi Antoine, yup, adding a space after the UPDATE makes the speed difference disappear on macOS Sierra and Windows 7.
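For code that has to run on an affected interpreter, the confirmed workaround (a space right after the keyword) could be applied mechanically. The helper below is a hypothetical sketch, not part of sqlite3; it assumes the statement begins with a plain keyword and does not handle leading SQL comments.

```python
import re

# Leading whitespace, the first word, then whatever whitespace follows it.
_FIRST_KEYWORD = re.compile(r"^(\s*[A-Za-z]+)\s+")


def space_after_first_keyword(sql):
    """Replace the whitespace run after the first keyword with one space."""
    return _FIRST_KEYWORD.sub(r"\1 ", sql, count=1)


sql = "UPDATE\n    tbl\nSET col2 = NULL\nWHERE col1 = ?"
print(space_after_first_keyword(sql))
```

Only the whitespace immediately after the first word is touched, so the rest of the statement, including string literals, is left as written.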

msg322530
Author: Berker Peksag (berker.peksag)
Date: 2018-07-28 09:14
https://github.com/python/cpython/commit/ab994ed8b97e1b0dac151ec827c857f5e7277565 wasn't merged in the 2.7 branch, so this should only be reproducible in Python 3.6+.

msg325854
Author: Berker Peksag (berker.peksag)
Date: 2018-09-20 11:10
New changeset 8d1e190fc507a9e304f6817e761e9f628a23cbd8 by Berker Peksag in branch 'master':
bpo-32215: Fix performance regression in sqlite3 (GH-8511)
https://github.com/python/cpython/commit/8d1e190fc507a9e304f6817e761e9f628a23cbd8

msg325892
Author: Berker Peksag (berker.peksag)
Date: 2018-09-20 15:57
New changeset 015cd0f5cb17b1b208a92e549cd665dc38f2f699 by Berker Peksag (Miss Islington (bot)) in branch '3.7':
bpo-32215: Fix performance regression in sqlite3 (GH-8511)
https://github.com/python/cpython/commit/015cd0f5cb17b1b208a92e549cd665dc38f2f699

msg325912
Author: Berker Peksag (berker.peksag)
Date: 2018-09-20 17:19
New changeset 4fb672ff96ecbb87aaf2ecc4f04aed76aafe63b1 by Berker Peksag (Miss Islington (bot)) in branch '3.6':
bpo-32215: Fix performance regression in sqlite3 (GH-8511)
https://github.com/python/cpython/commit/4fb672ff96ecbb87aaf2ecc4f04aed76aafe63b1

msg378756
Author: Irit Katriel (iritkatriel)
Date: 2020-10-16 21:50
This seems complete; can it be closed?