msg150329 - (view) |
Author: Petri Lehtinen (petri.lehtinen) *  |
Date: 2011-12-29 12:27 |
Inserting a string with embedded zero byte only inserts the string up to the first zero byte:

    import sqlite3

    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()
    cursor.execute('CREATE TABLE test (value TEXT)')
    cursor.execute('INSERT INTO test (value) VALUES (?)', ('foo\x00bar',))
    cursor.execute('SELECT value FROM test')
    print(cursor.fetchone())
    # expected output: (u'foo\x00bar',)
    # actual output: (u'foo',)

Also, if there's already data inserted to a table like above with embedded zero bytes, the sqlite-API-to-Python-string conversion truncates the strings to just before the first zero byte.

Attaching a patch against 3.3 that fixes the problem. Basically, it uses PyUnicode_AsStringAndSize and PyUnicode_FromStringAndSize instead of the non-size variants.

Please review, as I'm not sure it covers each possible case. |
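A minimal sketch of how the round trip described above could be checked for more than one position of the zero byte. The helper name `round_trip` and the extra sample strings are assumptions made for illustration; they are not taken from the attached patch or its test case.

```python
import sqlite3


def round_trip(value):
    """Store one value in an in-memory table and read it back (hypothetical helper)."""
    connection = sqlite3.connect(":memory:")
    try:
        cursor = connection.cursor()
        cursor.execute("CREATE TABLE test (value TEXT)")
        cursor.execute("INSERT INTO test (value) VALUES (?)", (value,))
        cursor.execute("SELECT value FROM test")
        return cursor.fetchone()[0]
    finally:
        connection.close()


# Sample strings are illustrative: zero byte in the middle, at the start, at the end.
samples = ["foo\x00bar", "\x00leading", "trailing\x00"]
for sample in samples:
    result = round_trip(sample)
    status = "ok" if result == sample else "TRUNCATED"
    print(repr(sample), "->", repr(result), status)
```

On an interpreter that still has the bug, the lines would be marked `TRUNCATED`; on a fixed one, each value should come back unchanged.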
|
|
msg150354 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-12-29 22:39 |
Where are the tests? :) |
|
|
msg150367 - (view) |
Author: Petri Lehtinen (petri.lehtinen) *  |
Date: 2011-12-30 09:38 |
What? Don't you SEE that it works correctly? :)

Attached an updated patch with a test case.

FTR, I also tried to make it possible to have the SQL statement include a zero byte, but it seems that sqlite3_prepare() (and also the newer sqlite3_prepare_v2()) always stops reading at the zero byte. See: http://www.sqlite.org/c3ref/prepare.html |
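A small probe, offered only as a sketch, for seeing what a given Python build does with a zero byte inside the SQL text itself. It assumes nothing about the outcome: depending on the version, the sqlite3 module may reject the statement before it reaches SQLite, or pass it on, in which case SQLite reads only up to the zero byte. The function name `probe_nul_in_sql` and the sample statement are hypothetical.

```python
import sqlite3


def probe_nul_in_sql(sql="SELECT 1\x00 + 1"):
    """Report whether a statement containing a zero byte is rejected or executed."""
    connection = sqlite3.connect(":memory:")
    try:
        try:
            row = connection.execute(sql).fetchone()
        except (ValueError, sqlite3.Error, sqlite3.Warning) as exc:
            # The module refused the statement; report which exception type it used.
            return ("rejected", type(exc).__name__)
        # The statement ran; the row shows how much of the SQL text was actually read.
        return ("executed", row)
    finally:
        connection.close()


print(probe_nul_in_sql())
```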
|
|
msg150389 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2011-12-30 20:23 |
It would be nice to also have tests for the bytes and bytearray cases. It also seems the generic case hasn't been fixed ("""PyObject_CallFunction(self->connection->text_factory, "y", val_str)"""). |
|
|
msg150441 - (view) |
Author: Petri Lehtinen (petri.lehtinen) *  |
Date: 2012-01-01 20:18 |
Attached an updated patch. The custom text_factory case is now fixed, and bytes, bytearray and custom factory are all tested. I also added back the pysqlite_unicode_from_string() function, as this makes the patch a bit smaller.

It also seems to me (only by looking at the code) that the sqlite3.OptimizedUnicode factory isn't currently working as documented.

Antoine: Do you happen to know what's the status of the OptimizeUnicode thingie? Has it been changed for a reason or is it just an error that happened during the py3k transition? |
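The cases listed above (bytes, bytearray, and a custom text_factory) could be exercised from Python roughly as follows. This is a sketch for Python 3, not the test code from the patch; `fetch_back` and `upper_factory` are hypothetical names introduced for the example.

```python
import sqlite3


def fetch_back(value, text_factory=None):
    """Insert one value, optionally install a text_factory, and read the value back."""
    connection = sqlite3.connect(":memory:")
    try:
        if text_factory is not None:
            connection.text_factory = text_factory
        cursor = connection.cursor()
        cursor.execute("CREATE TABLE test (value TEXT)")
        cursor.execute("INSERT INTO test (value) VALUES (?)", (value,))
        cursor.execute("SELECT value FROM test")
        return cursor.fetchone()[0]
    finally:
        connection.close()


def upper_factory(raw):
    """Custom factory: receives the raw UTF-8 bytes of a TEXT value."""
    return raw.decode("utf-8").upper()


# bytes and bytearray parameters are bound as BLOBs and come back as bytes.
print(repr(fetch_back(b"foo\x00bar")))
print(repr(fetch_back(bytearray(b"foo\x00bar"))))

# TEXT values pass through text_factory; the zero byte should survive there too.
print(repr(fetch_back("foo\x00bar", text_factory=bytes)))
print(repr(fetch_back("foo\x00bar", text_factory=upper_factory)))
```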
|
|
msg150442 - (view) |
Author: Petri Lehtinen (petri.lehtinen) *  |
Date: 2012-01-01 20:21 |
(Whoops, I didn't mean to change the magic source coding comment. Updating the patch once again.) |
|
|
msg150444 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-01-01 22:13 |
> Attached an updated patch. The custom text_factory case is now fixed,
> and bytes, bytearray and custom factory are all tested.

Thanks, looks good to me.

> Antoine: Do you happen to know what's the status of the
> OptimizeUnicode thingie? Has it been changed for a reason or is it
> just an error that happened during the py3k transition?

It looks obsolete in 3.x to me. If you look at the 2.7 source code, it had a real meaning there. Probably we could simplify the 3.x source code by removing that option (but better to do it in a separate patch). |
|
|
msg152440 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-02-01 20:45 |
New changeset 2e13011b3719 by Petri Lehtinen in branch '3.2':
sqlite3: Handle strings with embedded zeros correctly
http://hg.python.org/cpython/rev/2e13011b3719

New changeset 93ac4b12a750 by Petri Lehtinen in branch '2.7':
sqlite3: Handle strings with embedded zeros correctly
http://hg.python.org/cpython/rev/93ac4b12a750

New changeset 6f4044afa600 by Petri Lehtinen in branch 'default':
Merge branch 3.2
http://hg.python.org/cpython/rev/6f4044afa600 |
|
|