Issue 28719: zipfile increase in size

Created on 2016-11-16 21:16 by X-Istence, last changed 2022-04-11 14:58 by admin.

Messages (6)
msg280992 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-11-16 21:16
I am the current maintainer of WebOb, and noticed that a test started failing on Python 3.6 and 3.7. Granted, the test checks the size of the file created, which is not the brightest idea in a test, but it's been stable since Python 2.5...

https://travis-ci.org/Pylons/webob/jobs/176505096#L224 shows the failure.

_________________________ test_response_file_body_tell _________________________

    def test_response_file_body_tell():
        import zipfile
        from webob.response import ResponseBodyFile
        rbo = ResponseBodyFile(Response())
        assert rbo.tell() == 0
        writer = zipfile.ZipFile(rbo, 'w')
        writer.writestr('zinfo_or_arcname', b'foo')
        writer.close()
>       assert rbo.tell() == 133
E       assert 145 == 133
E        +  where 145 = <bound method ResponseBodyFile.tell of <body_file for <Response at 0x7fa6291f9eb8 200 OK>>>()
E        +    where <bound method ResponseBodyFile.tell of <body_file for <Response at 0x7fa6291f9eb8 200 OK>>> = <body_file for <Response at 0x7fa6291f9eb8 200 OK>>.tell

tests/test_response.py:608: AssertionError

I am not sure that this is necessarily a bug, but it would be good to know why files are no longer generated the same way.
msg281000 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-16 22:23
Could you get a dump of rbo data?
msg281001 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-11-16 22:30
It's literally the string written: writer.writestr('zinfo_or_arcname', b'foo') rbo in this case is a simple file like object. I can get dumps from Python 3.5 and Python 3.6 if necessary.
msg281004 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-16 22:45
Please make a dump. It should include not just the literal string written, but also the headers and other special fields. I tried with rbo = io.BytesIO() and got rbo.tell() == 133. There must be a difference between io.BytesIO and ResponseBodyFile. Maybe ResponseBodyFile is not seekable.
msg281006 - (view) Author: Bert JW Regeer (X-Istence) * Date: 2016-11-16 22:58
Here's a dump from Python 3.6:

b'PK\x03\x04\x14\x00\x08\x00\x00\x00\xc0~pI\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00zinfo_or_arcnamefoo!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00PK\x01\x02\x14\x03\x14\x00\x08\x00\x00\x00\xc0~pI!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x01\x00\x00\x00\x00zinfo_or_arcnamePK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00>\x00\x00\x00=\x00\x00\x00\x00\x00'

You are correct that ResponseBodyFile does not have a seek() method and is not seekable. Adding seek() to ResponseBodyFile might be a little more complicated...
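For readers who want to see where the extra bytes sit, the start of the dump can be decoded with struct. This is only a sketch: the field layout is the standard ZIP local file header, the variable names are illustrative, and the 3-byte payload length is taken from the b'foo' in the test rather than from the header (which reports zero sizes here).

```python
import struct
import zlib

# First 61 bytes of the Python 3.6 dump: local file header, file name,
# file data, and the 12 trailing bytes written after the data.
dump = (
    b'PK\x03\x04\x14\x00\x08\x00\x00\x00\xc0~pI'
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    b'\x10\x00\x00\x00zinfo_or_arcnamefoo'
    b'!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00'
)

# Standard 30-byte local file header layout (little-endian).
(sig, version, flags, method, mtime, mdate,
 crc, csize, usize, name_len, extra_len) = struct.unpack_from('<4sHHHHHLLLHH', dump, 0)

name = dump[30:30 + name_len]
data_start = 30 + name_len + extra_len
payload = dump[data_start:data_start + 3]  # assumption: payload is the 3-byte b'foo'

# The 12 bytes after the payload: CRC, compressed size, uncompressed size.
dd_crc, dd_csize, dd_usize = struct.unpack_from('<LLL', dump, data_start + 3)

print('flags bit 3 set:', bool(flags & 0x08))
print('header crc/sizes:', crc, csize, usize)
print('trailing crc/sizes:', hex(dd_crc), dd_csize, dd_usize)
print('crc matches payload:', dd_crc == zlib.crc32(payload))
```

The header fields for CRC and sizes are zero, and the real values appear only in the trailing 12 bytes, which accounts for the difference between 133 and 145.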
msg281212 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-19 09:14
If the output file is not seekable, zipfile sets bit 3 in the file header flags and writes 12 additional bytes (20 if the ZIP64 extension is used) after the compressed data. These bytes contain the CRC and the compressed and uncompressed sizes. The corresponding fields in the local file header are set to zero. In the case of writestr() this can be considered a regression, since the CRC and sizes can be calculated before writing the compressed data and saved in the local file header. But it would not be easy to fix this.
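The behavior described above can be reproduced without WebOb. The sketch below uses a hypothetical NonSeekableWriter class (a stand-in, not the real ResponseBodyFile) that offers write() and tell() but no seek(), and compares its output with io.BytesIO. The exact number of extra bytes is not asserted, since it may differ between Python versions.

```python
import io
import struct
import zipfile


class NonSeekableWriter:
    """Hypothetical write-only stream: tell() works, but there is no seek()."""

    def __init__(self):
        self._buf = io.BytesIO()

    def write(self, data):
        return self._buf.write(data)

    def tell(self):
        return self._buf.tell()

    def flush(self):
        pass

    def getvalue(self):
        return self._buf.getvalue()


def build_archive(fileobj):
    """Write the one-member archive from the failing test and return its bytes."""
    with zipfile.ZipFile(fileobj, 'w') as writer:
        writer.writestr('zinfo_or_arcname', b'foo')
    return fileobj.getvalue()


def local_header_flags(archive):
    """Return the general purpose bit flags of the first local file header."""
    return struct.unpack_from('<H', archive, 6)[0]


seekable = build_archive(io.BytesIO())
non_seekable = build_archive(NonSeekableWriter())

print('seekable size:', len(seekable), 'bit 3:', bool(local_header_flags(seekable) & 0x08))
print('non-seekable size:', len(non_seekable), 'bit 3:', bool(local_header_flags(non_seekable) & 0x08))
```

A test that needs to be stable across versions could check the archive contents with zipfile.ZipFile(...).read() instead of asserting an exact byte count.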
History
Date User Action Args
2022-04-11 14:58:39 admin set github: 72905
2016-11-19 09:14:23 serhiy.storchaka set messages: +; assignee: serhiy.storchaka; components: + Library (Lib); type: resource usage -> behavior; stage: needs patch
2016-11-16 22:58:14 X-Istence set messages: +
2016-11-16 22:45:17 serhiy.storchaka set messages: +
2016-11-16 22:30:25 X-Istence set messages: +
2016-11-16 22:23:20 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2016-11-16 21:16:06 X-Istence create