It seems that the 'reporthook', when passed to the 'retrieve' function of urllib, is called with (blocknumber, blocksize, totalsize). However, the call to read() might return *less* than 'blocksize' bytes (I believe), so imho the hook should be called with len(block) as the second argument instead. Alternatively, a fourth argument 'readbytes' could be added.
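To make the overshoot concrete, here is a minimal sketch. It assumes Python 3's urllib.request.urlretrieve, which as far as I know keeps the same (blocknumber, blocksize, totalsize) hook signature. The names demo_reporthook_overshoot and estimated_bytes are made up for illustration and are not part of urllib.

```python
import tempfile
import urllib.request
from pathlib import Path


def demo_reporthook_overshoot(size=10001):
    """Copy a local file through urlretrieve and record every reporthook call."""
    calls = []

    def reporthook(blocknum, blocksize, totalsize):
        calls.append((blocknum, blocksize, totalsize))

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src.bin"
        dst = Path(tmp) / "dst.bin"
        src.write_bytes(b"x" * size)
        # A file:// URL keeps the sketch offline; passing a target
        # filename forces a real block-by-block copy.
        urllib.request.urlretrieve(src.as_uri(), str(dst), reporthook)
    return calls


def estimated_bytes(blocknum, blocksize, totalsize):
    """Hypothetical helper: the naive estimate, clamped when the total is known."""
    done = blocknum * blocksize
    return min(done, totalsize) if totalsize >= 0 else done
```

On the last call, blocknum * blocksize is larger than totalsize whenever the file size is not a multiple of the block size, so a hook that wants a byte count has to clamp it the way estimated_bytes does. Clamping only hides the overshoot at the end; it does not help if read() returns a short block mid-transfer.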
The attached patch adds a new argument, progresshook, which is passed two arguments: the count of bytes downloaded so far and the total file size (or -1 if it is not available). Introducing a new argument instead of modifying reporthook maintains backwards compatibility and also allows reporthook to be removed at some point in the future. This patch is against the r30 SVN tag.
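A rough sketch of the progresshook contract described above, for anyone who does not want to read the diff. This is not the patch itself: copy_with_progress, its parameters, and the default block size of 8192 are assumptions for illustration. In particular, whether the hook is also called once before the first read is a guess; this sketch does not do that.

```python
import io


def copy_with_progress(src, dst, progresshook=None, total=-1, blocksize=8192):
    """Hypothetical helper: copy src to dst in blocks, reporting real byte counts."""
    read = 0
    while True:
        block = src.read(blocksize)
        if not block:
            break
        dst.write(block)
        read += len(block)  # count what read() actually returned
        if progresshook:
            progresshook(read, total)  # (bytes so far, total or -1)
    return read


# Usage: collect the (downloaded, total) pairs the hook receives.
events = []
source = io.BytesIO(b"x" * 10001)
copied = copy_with_progress(
    source, io.BytesIO(), lambda done, total: events.append((done, total)), total=10001
)
```

Because the count is accumulated from len(block), the last value passed to the hook equals the number of bytes actually copied, even when the final block is short.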
A simple (lazy) test case has been added. It just replicates one of the reporthook test cases so that it works with progresshook. The test cases assume the hard-coded value of blocksize in urllib; maybe it should become a public property. I also commented on the diff: http://bugs.python.org/review/1490929/show
> so was this fixed?
Irit, not really. This adds another hook called "progress hook" in addition to the "report hook". We need to evaluate whether this is really required.