Issue 13676: sqlite3: Zero byte truncates string contents

Created on 2011-12-29 12:27 by petri.lehtinen, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
sqlite3_zero_byte.patch petri.lehtinen, 2011-12-29 12:27
sqlite3_zero_byte_v2.patch petri.lehtinen, 2011-12-30 09:38 Now with a test case!
sqlite3_zero_byte_v3.patch petri.lehtinen, 2012-01-01 20:21
Messages (8)
msg150329 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2011-12-29 12:27
Inserting a string with embedded zero byte only inserts the string up to the first zero byte:

    import sqlite3

    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()

    cursor.execute('CREATE TABLE test (value TEXT)')
    cursor.execute('INSERT INTO test (value) VALUES (?)', ('foo\x00bar',))

    cursor.execute('SELECT value FROM test')
    print(cursor.fetchone())
    # expected output: (u'foo\x00bar',)
    # actual output: (u'foo',)

Also, if there's already data inserted to a table like above with embedded zero bytes, the sqlite-API-to-Python-string conversion truncates the strings to just before the first zero byte.

Attaching a patch against 3.3 that fixes the problem. Basically, it uses PyUnicode_AsStringAndSize and PyUnicode_FromStringAndSize instead of the non-size variants.

Please review, as I'm not sure it covers each possible case.
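The difference between the size-aware and the non-size conversions can be illustrated from Python itself. The following is only a sketch and not the patch: it uses ctypes (an assumption chosen purely for illustration) to read the same buffer once as a NUL-terminated C string and once with an explicit length.

```python
import ctypes

data = "foo\x00bar".encode("utf-8")

# A C buffer holding the 7 payload bytes plus a trailing NUL terminator.
buf = ctypes.create_string_buffer(data)

# Reading the buffer as a NUL-terminated C string stops at the first
# zero byte, which mirrors what the non-size conversion functions do.
as_c_string = buf.value

# Reading it with an explicit length keeps every byte, which mirrors
# what the size-aware variants do.
with_size = ctypes.string_at(ctypes.addressof(buf), len(data))

print(as_c_string)
print(with_size)
```

The first value is cut at the zero byte, while the second contains the whole payload, which is the behaviour the report expects from the sqlite3 module.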
msg150354 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-29 22:39
Where are the tests? :)
msg150367 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2011-12-30 09:38
What? Don't you SEE that it works correctly? :) Attached an updated patch with a test case. FTR, I also tried to make it possible to have the SQL statement include a zero byte, but it seems that sqlite3_prepare() (and also the newer sqlite3_prepare_v2()) always stops reading at the zero byte. See: http://www.sqlite.org/c3ref/prepare.html
msg150389 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-30 20:23
It would be nice to also have tests for the bytes and bytearray cases. It also seems the generic case hasn't been fixed ("""PyObject_CallFunction(self->connection->text_factory, "y", val_str)""").
msg150441 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-01-01 20:18
Attached an updated patch. The custom text_factory case is now fixed, and bytes, bytearray and custom factory are all tested. I also added back the pysqlite_unicode_from_string() function, as this makes the patch a bit smaller.

It also seems to me (only by looking at the code) that the sqlite3.OptimizedUnicode factory isn't currently working as documented. Antoine: Do you happen to know what's the status of the OptimizedUnicode thingie? Has it been changed for a reason or is it just an error that happened during the py3k transition?
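The text_factory cases listed above can be checked with a small round-trip script. This is a sketch rather than the test case from the patch: the helper name `roundtrip` and the decoding lambda are hypothetical, and it assumes an interpreter that already contains the fix.

```python
import sqlite3

VALUE = "foo\x00bar"


def roundtrip(text_factory):
    """Insert VALUE, then read it back through the given text_factory."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.text_factory = text_factory
        cursor = connection.cursor()
        cursor.execute("CREATE TABLE test (value TEXT)")
        cursor.execute("INSERT INTO test (value) VALUES (?)", (VALUE,))
        cursor.execute("SELECT value FROM test")
        return cursor.fetchone()[0]
    finally:
        connection.close()


# One entry per case from the message: the default str factory, bytes,
# bytearray, and a custom callable that receives the raw bytes.
results = {
    "str": roundtrip(str),
    "bytes": roundtrip(bytes),
    "bytearray": roundtrip(bytearray),
    "custom": roundtrip(lambda raw: raw.decode("utf-8")),
}

for name, value in results.items():
    print(name, repr(value))
```

If any factory returns only the part before the zero byte, that code path is still truncating.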
msg150442 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-01-01 20:21
(Whoops, I didn't mean to change the magic source coding comment. Updating the patch once again.)
msg150444 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-01 22:13
> Attached an updated patch. The custom text_factory case is now fixed,
> and bytes, bytearray and custom factory are all tested.

Thanks, looks good to me.

> Antoine: Do you happen to know what's the status of the
> OptimizedUnicode thingie? Has it been changed for a reason or is it
> just an error that happened during the py3k transition?

It looks obsolete in 3.x to me. If you look at the 2.7 source code, it had a real meaning there. Probably we could simplify the 3.x source code by removing that option (but better to do it in a separate patch).
msg152440 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-01 20:45
New changeset 2e13011b3719 by Petri Lehtinen in branch '3.2':
sqlite3: Handle strings with embedded zeros correctly
http://hg.python.org/cpython/rev/2e13011b3719

New changeset 93ac4b12a750 by Petri Lehtinen in branch '2.7':
sqlite3: Handle strings with embedded zeros correctly
http://hg.python.org/cpython/rev/93ac4b12a750

New changeset 6f4044afa600 by Petri Lehtinen in branch 'default':
Merge branch 3.2
http://hg.python.org/cpython/rev/6f4044afa600
History
Date User Action Args
2022-04-11 14:57:25 admin set github: 57885
2012-02-01 20:45:08 python-dev set status: open -> closed; nosy: + python-dev; messages: +; resolution: fixed; stage: resolved
2012-01-01 22:13:36 pitrou set messages: +
2012-01-01 20:21:37 petri.lehtinen set files: + sqlite3_zero_byte_v3.patch; messages: +
2012-01-01 20:21:17 petri.lehtinen set files: - sqlite3_zero_byte_v3.patch
2012-01-01 20:18:22 petri.lehtinen set files: + sqlite3_zero_byte_v3.patch; messages: +
2011-12-31 18:27:15 jcea set nosy: + jcea
2011-12-30 20:23:11 pitrou set messages: +
2011-12-30 09:38:04 petri.lehtinen set files: + sqlite3_zero_byte_v2.patch; messages: +
2011-12-29 22:39:50 pitrou set nosy: + pitrou; messages: +
2011-12-29 12:27:34 petri.lehtinen create