Issue 3535: zipfile has problem reading zip files over 2GB

Created on 2008-08-10 09:27 by alonwas, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
large.c alonwas, 2008-08-13 06:15
largezip.patch pitrou, 2008-08-17 12:19
Messages (14)
msg70968 - Author: (alonwas) Date: 2008-08-10 09:27
zipfile complains about "Bad magic number for central directory" when I give it files over 2GB. I believe the problem is that the offset of the central directory should be read as an unsigned long rather than as a signed long. Changing structEndArchive from "<4s4H2lH" to "<4s4H2LH" (note the capital L) should probably fix it. When the offset is 2^31 or more, it is read as a negative number and the code fails to find the central directory. I'd appreciate it if someone more knowledgeable could look at the problem and the suggested fix. Thanks, Alon
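The effect of the two format strings can be checked directly with the struct module. The following is a minimal sketch, not code from zipfile itself: the record contents (a central directory starting 3 GiB into the archive, a 100-byte directory, one entry) are made-up values chosen only to put the offset above 2^31.

```python
import struct

# End-of-central-directory layouts discussed in the report.
SIGNED_FMT = "<4s4H2lH"    # size and offset read as signed 32-bit values
UNSIGNED_FMT = "<4s4H2LH"  # size and offset read as unsigned 32-bit values

# Hypothetical record: central directory located 3 GiB into the archive.
offset = 3 * 1024**3
record = struct.pack(UNSIGNED_FMT, b"PK\x05\x06", 0, 0, 1, 1, 100, offset, 0)

# Field 6 is the central directory offset in both layouts.
signed_offset = struct.unpack(SIGNED_FMT, record)[6]
unsigned_offset = struct.unpack(UNSIGNED_FMT, record)[6]

print(signed_offset)    # negative: the same four bytes wrap around 2**32
print(unsigned_offset)  # the offset that was actually stored
```

Both formats describe the same 22 bytes, so only the interpretation of the two 32-bit fields changes.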
msg70987 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-10 17:51
What Python version exactly are you using? This might have been fixed in 2.5.2, with r60117.
msg71003 - Author: (alonwas) Date: 2008-08-11 08:41
Hi, I'm using 2.5.2 (r252:60911), Thanks, Alon
msg71025 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-11 17:29
Do you have a public URL for such a zip file?
msg71076 - Author: (alonwas) Date: 2008-08-13 06:15
Hi Antoine, The problem happens for files between 2GB and 4GB. I can't really send you a link to such a big file. To reproduce the problem, you can generate one. I created (and attach) a tiny C program that helps generate one. If you want to, you can run it, save its output to a file and then add it to a zip file (it should compress around 12%). The resulting zip file will fail to open from python using the zipfile package because of the bug I mentioned. Please let me know whether this is enough information to reproduce, Thanks, Alon
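The attached large.c is not reproduced in this thread. As a rough stand-in, a large input file can also be generated from Python. This is a sketch under assumptions: write_filler, its parameters, and the choice of random data are all hypothetical and do not mirror large.c, whose output reportedly compresses by about 12%, whereas random bytes will barely compress at all.

```python
import os

def write_filler(path, total_bytes, chunk_size=1 << 20):
    """Write total_bytes of random data to path in chunks; return bytes written."""
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            # Never write past the requested size on the last chunk.
            n = min(chunk_size, total_bytes - written)
            f.write(os.urandom(n))
            written += n
    return written
```

For example, write_filler("large.bin", 3 * 1024**3) would produce a 3 GiB file, which can then be added to an archive with a zip tool that supports archives over 2GB.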
msg71101 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-13 22:11
> The problem happens for files between 2GB and 4GB. I can't really send you a link to such a big file. To reproduce the problem, you can generate one.
The problem is that the "zip" command fails to create a zip file larger than 2GB (I get "zip I/O error: Invalid argument"). And even if it didn't fail, the internal structure of the zip file might not be exactly the same as with other compression tools. That's why I was asking you for an existing file. If I give you ssh/sftp access somewhere, would you be able to upload such a file?
msg71265 - Author: (alonwas) Date: 2008-08-17 10:48
Antoine, I had a similar problem with zip version 2.32, but this is fixed in version 3.0 (or on 64-bit architectures). Would you be able to give it a try with the newer version (which can be obtained from info-zip.org)? Unfortunately, my upload bandwidth will not allow me to upload such a big file. Thanks, Alon
msg71269 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-17 12:19
Alon, can you try with the following patch? It seems to fix it here.
msg72590 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-05 13:14
Alan, do you have an opinion on this?
msg72630 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-09-05 21:25
Your patch seems like a better way to detect whether a file is written as Zip64, and it seems to be able to properly handle extracting a >2GB file from a >2GB archive, so I'd vote to include it. I tested it with r66233, using a file made from the output of large.c, zipped with the built-in archiver on OS X 10.4.11. All regression tests pass, including test_zipfile64, on both Linux and OS X.
msg72631 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-05 21:34
Alan, do you have commit access? Otherwise the patch needs approval from another core developer.
msg72632 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-09-05 21:40
No, I don't have commit access at the moment.
msg72649 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-05 23:21
I also agree with the patch. This seems the correct way to detect the Zip64 format.
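The patch itself is not shown in this thread. One way to detect Zip64 that matches the ZIP format specification is to look for the Zip64 end-of-central-directory locator, which sits in the 20 bytes immediately before the ordinary end-of-central-directory record. The sketch below illustrates that idea only; has_zip64_locator is a hypothetical helper, not a zipfile function, and it assumes the last "PK\x05\x06" in the data is the real record (an archive comment containing that signature would confuse it).

```python
import struct

EOCD_SIG = b"PK\x05\x06"           # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # Zip64 end of central directory locator
# Locator layout: signature, disk number, offset of Zip64 EOCD, total disks.
ZIP64_LOCATOR_FMT = "<4sLQL"
ZIP64_LOCATOR_SIZE = struct.calcsize(ZIP64_LOCATOR_FMT)  # 20 bytes

def has_zip64_locator(data):
    """Return True if a Zip64 locator directly precedes the last EOCD record."""
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < ZIP64_LOCATOR_SIZE:
        # No EOCD found (-1), or not enough room before it for a locator.
        return False
    start = eocd_pos - ZIP64_LOCATOR_SIZE
    return data[start:start + 4] == ZIP64_LOCATOR_SIG
```

In practice only the tail of the archive needs to be read and passed in, since both records live at the end of the file.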
msg72651 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-05 23:43
Fixed in r66240, r66241. Thanks!
History
Date User Action Args
2022-04-11 14:56:37 admin set github: 47785
2008-09-05 23:43:33 pitrou set status: open -> closed; resolution: fixed; messages: +
2008-09-05 23:21:22 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: +
2008-09-05 21:53:24 pitrou set keywords: + needs review
2008-09-05 21:40:48 alanmcintyre set messages: +
2008-09-05 21:34:55 pitrou set messages: +
2008-09-05 21:25:28 alanmcintyre set messages: +
2008-09-05 13:14:05 pitrou set nosy: + alanmcintyre; messages: +; versions: + Python 3.1, Python 2.7, - Python 2.6, Python 3.0
2008-08-17 12:19:14 pitrou set files: + largezip.patch; priority: normal; messages: +; keywords: + patch; versions: + Python 2.6, Python 3.0, - Python 2.5
2008-08-17 10:48:20 alonwas set messages: +
2008-08-13 22:11:13 pitrou set messages: +
2008-08-13 06:15:33 alonwas set files: + large.c; messages: +
2008-08-11 17:30:00 pitrou set nosy: + pitrou; messages: +
2008-08-11 08:41:07 alonwas set messages: +
2008-08-10 17:51:13 loewis set nosy: + loewis; messages: +
2008-08-10 09:27:27 alonwas create