Issue 640230: Discovered typo in zlib test. (original) (raw)

Created on 2002-11-18 18:36 by scott_daniels, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (9)
msg13357 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-11-18 18:36
In test_zlib.py (while chasing what appears to be a documentation problem), I found a flaw in the code that tests max_length limitted output from a decompression object test. Starting at line 100, the existing code is: ... bufs.append(deco.flush()) decomp2 = ''.join(buf) if decomp2 != buf: print "max_length decompressobj failed" else: print "max_length decompressobj succeeded" ... This test will always succeed (''.join(str) == str). What seems obviously meant is: ... bufs.append(deco.flush()) decomp2 = ''.join(bufs) if decomp2 != buf: print "max_length decompressobj failed" else: print "max_length decompressobj succeeded" ...
msg13358 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-11-18 20:56
Logged In: YES user_id=31435 Good eye! Unfortunately, when I repair that, the test fails here -- decomp2 is a proper prefix of buf then. It's "too short". Raised priority but left unassigned (someone who understands zlib better will have to jump in).
msg13359 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 09:50
Logged In: YES user_id=6656 Assuming Tim meant what he said, not what he did: raising priority.
msg13360 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-11-19 21:40
Logged In: YES user_id=493818 OK, The reason the code fails after the fix is as follows: deco.flush() does not return any output. The loop control says, "while we have more unexamined source." However, the decompressor can consume all of the input before it has provided all of its output. So, there are two possible fixes: 1) Minimal change: Change the loop control to say, in effect, "While we have more input or are producing output.": Around line 92: Change: bufs = [] while cb: max_length = 1 + len(cb)/10 To: bufs = [] while cb or chunk: max_length = 1 + len(cb)/10 2) More reasonable(?) change: After the insertion loop, (just before the flush) extract all remaining output: Around line 99: cb = deco.unconsumed_tail bufs.append(deco.flush()) decomp2 = ''.join(bufs) Becomes: cb = deco.unconsumed_tail while bufs[-1]: bufs.append(deco.decompress('', max_length)) bufs.append(deco.flush()) decomp2 = ''.join(bufs) --- Note, in any case, the bufs.append(deco.flush()) originally on line 100 _always_ appends a zero-length string. I would suggest changing it to simply: deco.flush()
msg13361 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-11-19 22:08
Logged In: YES user_id=31435 I'd say it's a bug in flush(), or in the docs, then. They say "all pending input is processed, and a string containing the remaining uncompressed output is returned", and that's not happening here. That's pretty clearly what the author of this test *expected* to happen, too (the original bug you discovered looked much more like a typo than a failure to understand the module interface).
msg13362 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-12-12 18:32
Logged In: YES user_id=493818 The following replacement for PyZlib_unflush addresses some of the problem, but leaves unaddressed the issues of what unused_data and unconsumed_tail should be at the end of the routine. static PyObject * PyZlib_unflush(compobject *self, PyObject *args) { int err, length = DEFAULTALLOC; PyObject * retval = NULL; unsigned long start_total_out; if (!PyArg_ParseTuple(args, "|i:flush", &length)) return NULL; if (!(retval = PyString_FromStringAndSize(NULL, length))) return NULL; ENTER_ZLIB start_total_out = self->zst.total_out; self->zst.avail_out = length; self->zst.next_out = (Byte *)PyString_AS_STRING(retval); Py_BEGIN_ALLOW_THREADS err = inflate(&(self->zst), Z_FINISH); Py_END_ALLOW_THREADS /* while Z_OK and the output buffer is full, there might be more output, so extend the output buffer and try again */ while ((err == Z_OK err == Z_BUF_ERROR) && self->zst.avail_out == 0) { if (_PyString_Resize(&retval, length << 1) < 0) goto error; self->zst.next_out = (Byte *)PyString_AS_STRING(retval) + length; self->zst.avail_out = length; length = length << 1; Py_BEGIN_ALLOW_THREADS err = inflate(&(self->zst), Z_FINISH); Py_END_ALLOW_THREADS } /* Maybe _always_ call inflateEnd if clearing is_initialized */ if (err == Z_STREAM_END) { err = inflateEnd(&(self->zst)); if (err != Z_OK) { zlib_error(self->zst, err, "from inflateEnd()"); Py_DECREF(retval); retval = NULL; goto error; } } self->is_initialised = 0; _PyString_Resize(&retval, self->zst.total_out - start_total_out); /* ??? Here fix unused_data and unconsumed_tail */ error: LEAVE_ZLIB return retval; }
msg13363 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2003-02-01 03:33
Logged In: YES user_id=493818 I've attached a PyUnit-style replacement for test_zlib.py that tests this condition to Patch #678531, which provides a context diff updating zlibmodule.c to fix this problem. Hope this is enough to get the fix reviewed.
msg13364 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-02-03 03:22
Logged In: YES user_id=31435 It looks like the problem with getting this reviewed is that no current developer understands the zlib code or its test. I bumped the priority in a sneaky attempt to attract more interest . *If* nobody else bites soon, and I can make time, I'll assign it to myself. Your efforts are appreciated! Even if it doesn't seem like it.
msg13365 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-02-03 20:52
Logged In: YES user_id=6380 Fixed in CVS.
History
Date User Action Args
2022-04-10 16:05:54 admin set github: 37496
2002-11-18 18:36:27 scott_daniels create