Issue 32215: sqlite3 400x-600x slower depending on formatting of an UPDATE statement in a string

Created on 2017-12-05 00:40 by bforst, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
sqlite3_27_36_performance_bug.py bforst,2017-12-05 00:40 Demo of the bug
Pull Requests
URL Status Linked Edit
PR 8511 merged berker.peksag,2018-07-28 09:10
PR 9441 closed miss-islington,2018-09-20 11:11
PR 9442 closed miss-islington,2018-09-20 11:11
PR 9449 merged miss-islington,2018-09-20 15:25
PR 9452 merged miss-islington,2018-09-20 15:57
Messages (11)
msg307609 - (view) Author: Brian Forst (bforst) * Date: 2017-12-05 00:40
We're moving some code from Python 2.7 to 3.6 and found a weird performance issue using SQLite in-memory and on-disk DBs with the built-in sqlite3 library. In Python 2.7, the two update statements below (excerpted from the attached file) run in the same amount of time. In Python 3.6, the update statement with the table name on a separate line runs 400x-600x slower with the example data provided in the file.

"""
UPDATE tbl
SET col2 = NULL
WHERE col1 = ?
"""

"""
UPDATE
  tbl
SET col2 = NULL
WHERE col1 = ?
"""

We have verified this using Python installs from python.org on macOS Sierra and Windows 7 for Python 2.7 and 3.6. We have tried formatting the SQL strings in different ways, and the speed change appears to occur only when the table name is on a different line than the "UPDATE". This also appears to be hitting some type of quadratic behaviour: with 10x fewer records, it only takes 10-15x as long. With the demo in the file, we are seeing it take 1.6s on the fast string and ~1000s on the slow string.
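The comparison described in this report can be approximated with a short timing script. This is only a sketch, not the attached sqlite3_27_36_performance_bug.py: the table schema, the row count, the helper name `time_updates`, and the exact line breaks in the two SQL strings are assumptions. The only property taken from the report is that the slow variant puts the table name on a different line than "UPDATE".

```python
import sqlite3
import time

# Table name on the same line as UPDATE (reported as fast).
FAST_SQL = """
UPDATE tbl
SET col2 = NULL
WHERE col1 = ?
"""

# Table name on its own line (reported as slow on affected 3.6/3.7 builds).
SLOW_SQL = """
UPDATE
  tbl
SET col2 = NULL
WHERE col1 = ?
"""


def time_updates(sql, n_rows=1000):
    """Run one UPDATE per row and return (elapsed_seconds, rows_not_updated).

    The schema and data here are hypothetical stand-ins for the demo file.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 TEXT)")
    conn.executemany(
        "INSERT INTO tbl (col1, col2) VALUES (?, ?)",
        ((i, "x") for i in range(n_rows)),
    )
    conn.commit()

    start = time.perf_counter()
    for i in range(n_rows):
        conn.execute(sql, (i,))
    conn.commit()
    elapsed = time.perf_counter() - start

    # Sanity check: every row should now have col2 set to NULL.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM tbl WHERE col2 IS NOT NULL"
    ).fetchone()[0]
    conn.close()
    return elapsed, remaining


if __name__ == "__main__":
    for label, sql in (("First step", FAST_SQL), ("Second step", SLOW_SQL)):
        elapsed, _ = time_updates(sql)
        print(f"{label}: {elapsed}")
```

On an interpreter that includes the fix from this issue, the two timings should be close; the size of any gap on an affected build will depend on the data and on whether the database is in memory or on disk.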
msg307611 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-05 01:11
I can confirm that there is a difference on linux as well, using the sqlite version for both 2.7 and 3.7:

rdmurray@pydev:~/python/p27[2.7]>./python sqlite3_27_36_performance_bug.py
First step: 3.22849011421
Second step: 3.2167429924

rdmurray@pydev:~/python/p37[master]>./python ../p27/sqlite3_27_36_performance_bug.py
First step: 3.2722721099853516
Second step: 4.094221353530884

(I changed time.clock() to time.time()).
msg307612 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-05 01:12
...using the *same* sqlite version...
msg307683 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-05 20:57
Brian, does the speed difference disappear when you add a space character just after "UPDATE"? We may be hitting this path: https://github.com/python/cpython/blob/master/Modules/_sqlite/statement.c#L76-L93
msg307689 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-05 21:29
It disappears for me running it on linux with the blank added.
msg307696 - (view) Author: Brian Forst (bforst) * Date: 2017-12-05 22:35
Hi Antoine, yup, adding a space after the UPDATE makes the speed difference disappear on macOS Sierra and Windows 7.
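The behaviour confirmed in the last few messages is consistent with a statement classifier that only recognises a DML keyword when it is followed by a literal space character. The following is a Python model of that idea, not the C code in Modules/_sqlite/statement.c: the function names and the keyword list are illustrative assumptions, and the relaxed variant is just one possible way to avoid the problem, not necessarily the change made in PR 8511.

```python
# Hypothetical keyword list for the model; not copied from CPython.
DML_KEYWORDS = ("insert", "update", "delete", "replace")

# Characters skipped before the first keyword in this model.
LEADING_WHITESPACE = " \r\n\t"


def is_dml_strict(sql):
    """Model of a check that requires 'keyword + space' at the start."""
    stripped = sql.lstrip(LEADING_WHITESPACE).lower()
    return stripped.startswith(tuple(kw + " " for kw in DML_KEYWORDS))


def is_dml_relaxed(sql):
    """Model of a check that accepts the bare keyword prefix.

    Note that this also matches identifiers such as 'updates_log', which
    is acceptable for a sketch but may not be for a real classifier.
    """
    stripped = sql.lstrip(LEADING_WHITESPACE).lower()
    return stripped.startswith(DML_KEYWORDS)


same_line = "\nUPDATE tbl\nSET col2 = NULL\nWHERE col1 = ?\n"
separate_line = "\nUPDATE\n  tbl\nSET col2 = NULL\nWHERE col1 = ?\n"

print(is_dml_strict(same_line))       # keyword followed by a space
print(is_dml_strict(separate_line))   # keyword followed by a newline
print(is_dml_relaxed(separate_line))  # bare keyword prefix
```

Under this model, the statement with the table name on its own line is not classified as DML by the strict check, which would explain why adding a space after "UPDATE" makes the two statements behave the same.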
msg322530 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-07-28 09:14
https://github.com/python/cpython/commit/ab994ed8b97e1b0dac151ec827c857f5e7277565 wasn't merged in the 2.7 branch, so this should only be reproduced in Python 3.6+.
msg325854 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-20 11:10
New changeset 8d1e190fc507a9e304f6817e761e9f628a23cbd8 by Berker Peksag in branch 'master': bpo-32215: Fix performance regression in sqlite3 (GH-8511) https://github.com/python/cpython/commit/8d1e190fc507a9e304f6817e761e9f628a23cbd8
msg325892 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-20 15:57
New changeset 015cd0f5cb17b1b208a92e549cd665dc38f2f699 by Berker Peksag (Miss Islington (bot)) in branch '3.7': bpo-32215: Fix performance regression in sqlite3 (GH-8511) https://github.com/python/cpython/commit/015cd0f5cb17b1b208a92e549cd665dc38f2f699
msg325912 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-20 17:19
New changeset 4fb672ff96ecbb87aaf2ecc4f04aed76aafe63b1 by Berker Peksag (Miss Islington (bot)) in branch '3.6': bpo-32215: Fix performance regression in sqlite3 (GH-8511) https://github.com/python/cpython/commit/4fb672ff96ecbb87aaf2ecc4f04aed76aafe63b1
msg378756 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-10-16 21:50
This seems complete, can it be closed?
History
Date User Action Args
2022-04-11 14:58:55 admin set github: 76396
2021-06-25 22:41:09 iritkatriel set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-10-16 21:50:09 iritkatriel set nosy: + iritkatriel; messages: +
2018-09-20 17:19:53 berker.peksag set messages: +
2018-09-20 15:57:46 miss-islington set pull_requests: + <pull_request8867>
2018-09-20 15:57:01 berker.peksag set messages: +
2018-09-20 15:25:53 miss-islington set pull_requests: + <pull_request8864>
2018-09-20 11:11:17 miss-islington set pull_requests: + <pull_request8857>
2018-09-20 11:11:10 miss-islington set pull_requests: + <pull_request8856>
2018-09-20 11:10:54 berker.peksag set messages: +
2018-07-28 09:14:36 berker.peksag set messages: +; components: - Interpreter Core; versions: + Python 3.8, - Python 2.7
2018-07-28 09:10:28 berker.peksag set keywords: + patch; stage: patch review; pull_requests: + <pull_request8028>
2017-12-06 05:00:46 berker.peksag set nosy: + berker.peksag
2017-12-05 22:35:54 bforst set messages: +
2017-12-05 21:29:13 r.david.murray set messages: +
2017-12-05 20:57:02 pitrou set nosy: + pitrou; messages: +
2017-12-05 01:12:58 r.david.murray set versions: + Python 3.7
2017-12-05 01:12:42 r.david.murray set messages: +
2017-12-05 01:11:56 r.david.murray set nosy: + r.david.murray; messages: +
2017-12-05 00:40:52 bforst create