Issue 810023: Fix for off-by-one bug in urllib.URLopener.retrieve (original) (raw)

This patch fixes an off-by-one bug in the reporthook part of urllib.URLopener.retrieve and adds test cases to verify the fix.

The retrieve method reports reading one more block than it actually does. Here's an example. The file is empty. I expect the initial reporthook call on establishment of the network connection but not an additional call since the file is empty (no block was read):

import urllib def reporter(count, blockSize, fileSize): ... print "c: %i, bs: %i, fs: %i" % (count, blockSize, fileSize) ... srcFile = file("/tmp/empty.txt", "wb") srcFile.close() urllib.urlretrieve("file:///tmp/empty.txt", "/tmp/new- empty.txt", reporter) c: 0, bs: 8192, fs: 0 c: 1, bs: 8192, fs: 0 ('/tmp/new-empty.txt', <mimetools.Message instance at 0x5a4f58>)

As a second example, if the file contains 1 byte, the retrieve method claims it read 2 blocks when it really only read 1:

srcFile = file("/tmp/empty.txt", "wb") srcFile.write("x") srcFile.close() urllib.urlretrieve("file:///tmp/empty.txt", "/tmp/new- empty.txt", reporter) c: 0, bs: 8192, fs: 1 c: 1, bs: 8192, fs: 1 c: 2, bs: 8192, fs: 1 ('/tmp/new-empty.txt', <mimetools.Message instance at 0x5a50d0>)

This patch also includes some changes to test_urllib.urlretrieve_FileTests (where I added the test cases) to support creating more than one temporary file and making sure any created temporary files get deleted when the test case fixture is torn down.