Issue 22908: ZipExtFile in zipfile can be seekable

Created on 2014-11-20 15:20 by Iridium.Yang, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
zipfile.diff Iridium.Yang,2014-11-20 15:20
ziz.py jjolly,2017-12-22 13:44 zip-in-zip test program
Pull Requests
URL Status Linked
PR 4966 merged python-dev,2017-12-21 18:29
Messages (7)
msg231438 - Author: Iridium Yang (Iridium.Yang) Date: 2014-11-20 15:20
The ZipExtFile class in the zipfile module does not provide a seek method like GzipFile does. As a result, it is hard to manipulate files without extracting all the content. For example, consider a very large tar file compressed with zip. The tarfile module can operate on a file object, but it needs a seek method, so the ZipExtFile instance returned from ZipFile cannot be passed to TarFile. This may seem strange, but I encountered it with Samsung firmware. In fact, the seek method in GzipFile or some other compressed format can be implemented in zipfile very easily. Here is my naive modification (nearly the same as in GzipFile).
msg231446 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-20 18:20
I'm -1 on adding a seek method with linear complexity. This looks like attractive nonsense to me. It would be better to just make TarFile work with non-seekable streams.
msg231472 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-21 14:08
Actually, TarFile already works with non-seekable streams: use TarFile.open() with mode='r|*' or similar. On the other hand, I'm not against making a non-compressed ZipExtFile seekable. It can be helpful when a ZIP file is used just as a container for other files.
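The streaming approach mentioned above could look roughly like the following. This is a minimal sketch, not code from the issue: the archive and member names (`firmware.tar`, `hello.txt`) and both helper functions are made up for illustration, and the ZIP is built in memory only so that the example is self-contained.

```python
import io
import tarfile
import zipfile


def build_zip_with_tar() -> bytes:
    """Build an in-memory ZIP holding one tar archive (illustrative data only)."""
    payload = b"hello from inside the tar"
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("firmware.tar", tar_buf.getvalue())
    return zip_buf.getvalue()


def read_tar_members_from_zip(zip_bytes: bytes, member: str) -> dict:
    """Read every regular file of a tar stored in a ZIP, using stream mode."""
    contents = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        with zf.open(member) as stream:
            # 'r|*' asks tarfile to treat the file object as a forward-only
            # stream, so it never calls seek() on the ZIP member.
            with tarfile.open(fileobj=stream, mode="r|*") as tar:
                for entry in tar:
                    if entry.isfile():
                        # In stream mode each member must be read while it
                        # is the current one; it cannot be revisited later.
                        contents[entry.name] = tar.extractfile(entry).read()
    return contents
```

The trade-off is that stream mode only allows one sequential pass over the tar members, which is why random access still needs a seekable file object.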
msg256683 - Author: Daniel Kessel (dkessel) Date: 2015-12-18 14:24
It would be great to have the ZipExtFile class seekable. This would help in implementing features in other projects. For example, pydicom would gain the ability to read from ZIP files, as mentioned in https://github.com/darcymason/pydicom/issues/219
msg268764 - Author: Jürgen A. Erhard (jae) Date: 2016-06-18 05:02
To add to this (without looking at the patch): I just to my astonishment learned that a ZipExtFile doesn't even support tell(). I can understand the seek being nontrivial... but the tell? It's a bytestream, and there is (isn't there?) a clear definition of what next byte a read(1) would deliver. It should be trivial to keep track of the (only ever increasing) file position.
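The position tracking described here can be sketched as a thin wrapper around any readable stream. `PositionTrackingReader` is a hypothetical name used only for this illustration; it is not part of zipfile and not the change that was eventually merged.

```python
import io


class PositionTrackingReader:
    """Wrap a read-only byte stream and count the bytes handed out so far."""

    def __init__(self, stream):
        self._stream = stream
        self._pos = 0  # offset of the next byte that read() would return

    def read(self, size=-1):
        data = self._stream.read(size)
        # The position only ever grows, by exactly the number of bytes returned.
        self._pos += len(data)
        return data

    def tell(self):
        return self._pos


reader = PositionTrackingReader(io.BytesIO(b"abcdef"))
reader.read(2)
position_after_two = reader.tell()
reader.read()
position_at_end = reader.tell()
```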
msg308935 - Author: John Jolly (jjolly) * Date: 2017-12-22 13:44
Please be gentle, this is my first submission to Python. The use case for me was a recursive zip-within-a-zip situation. I wanted to allow the creation of a zipfile.ZipFile from an existing zipfile.ZipExtFile, but the lack of seek prevented this. I simply treated forward seeks as a read, and backward seeks as a reset-and-read. The reset was the tricky part, as it required restoring several original values such as the remaining compressed length, the remaining data length, and the running crc32. I pushed this into the latest upstream branch, but as I am testing this in v3.4 it should be easy to backport if necessary (I suspect not). I based my fix on a little program that I wrote to test the feasibility of this idea. I am attaching that test program here.
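The idea of treating a forward seek as a read and a backward seek as a reset-and-read can be illustrated with a generic wrapper. This is only a sketch of the technique, not the submitted patch: `ReplaySeekReader` and its `open_stream` factory are hypothetical, and the reset here simply reopens the stream instead of restoring the internal state (compressed length, data length, crc32) that the message describes.

```python
import io


class ReplaySeekReader:
    """Emulate seek() on a forward-only stream by re-reading from the start."""

    def __init__(self, open_stream):
        # open_stream is an assumed zero-argument factory that returns a
        # fresh readable stream positioned at offset 0.
        self._open_stream = open_stream
        self._stream = open_stream()
        self._pos = 0

    def read(self, size=-1):
        data = self._stream.read(size)
        self._pos += len(data)
        return data

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        else:
            raise ValueError("this sketch supports SEEK_SET and SEEK_CUR only")
        target = max(target, 0)

        if target < self._pos:
            # Backward seek: reset to the beginning, then read forward again.
            self._stream = self._open_stream()
            self._pos = 0

        # Forward seek: read and discard until the target (or EOF) is reached.
        remaining = target - self._pos
        while remaining > 0:
            chunk = self.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            remaining -= len(chunk)
        return self._pos
```

A backward seek costs time proportional to the target offset, which is the linear complexity raised as an objection earlier in the thread.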
msg311254 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2018-01-30 08:51
New changeset 066df4fd454d6ff9be66e80b2a65995b10af174f by Gregory P. Smith (John Jolly) in branch 'master': bpo-22908: Add seek and tell functionality to ZipExtFile (GH-4966) https://github.com/python/cpython/commit/066df4fd454d6ff9be66e80b2a65995b10af174f
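With this change (Python 3.7 and later), the zip-within-a-zip use case from the thread can be written directly. The following is a small sketch under that assumption; the file names and helper functions are invented for the example, and the outer archive lives in a seekable `io.BytesIO`, since the member is only seekable when the underlying file is.

```python
import io
import zipfile


def build_nested_zip() -> bytes:
    """Create an in-memory ZIP that contains another ZIP (illustrative data)."""
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w", zipfile.ZIP_DEFLATED) as inner:
        inner.writestr("note.txt", "nested payload")

    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w", zipfile.ZIP_DEFLATED) as outer:
        outer.writestr("inner.zip", inner_buf.getvalue())
    return outer_buf.getvalue()


def read_nested(outer_bytes: bytes, inner_name: str, member: str) -> bytes:
    """Open a ZIP stored inside a ZIP without extracting it first."""
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        with outer.open(inner_name) as inner_stream:
            # ZipFile needs seek() and tell() on its file object to locate
            # the central directory at the end of the archive.
            with zipfile.ZipFile(inner_stream) as inner:
                return inner.read(member)
```

Because the member is compressed, backward seeks in the inner stream can still be slow for large archives.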
History
Date User Action Args
2022-04-11 14:58:10 admin set github: 67097
2018-01-30 08:53:26 gregory.p.smith set status: open -> closed; stage: patch review -> commit review; resolution: fixed; versions: + Python 3.7, - Python 3.5
2018-01-30 08:51:42 gregory.p.smith set messages: +
2018-01-30 08:45:02 gregory.p.smith set assignee: serhiy.storchaka -> gregory.p.smith; nosy: + gregory.p.smith
2017-12-22 13:44:50 jjolly set files: + ziz.py; nosy: + jjolly; messages: +
2017-12-21 18:29:06 python-dev set stage: needs patch -> patch review; pull_requests: + pull_request4858
2016-06-18 05:02:59 jae set nosy: + jae; messages: +
2015-12-18 14:24:16 dkessel set nosy: + dkessel; messages: +
2014-11-21 14:08:37 serhiy.storchaka set assignee: serhiy.storchaka; stage: needs patch; messages: +; versions: + Python 3.5, - Python 3.4
2014-11-20 18:20:45 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2014-11-20 15:20:43 Iridium.Yang create