JDK7's java.util.zip breakage with very large files

Xueming Shen xueming.shen at oracle.com
Thu Feb 7 20:11:38 UTC 2013


Alexander,

Can you do "jar xvf data.jar data" to extract the file?

-Sherman

On 02/07/2013 08:54 AM, Alexander Sack wrote:

Folks:

What I am trying to do is generate Zip64 extensions within a JAR file and then dissect the zip contents (end of directory records, file headers, etc.). However, when I use jar, or a small program I wrote that uses java.util.zip, to zip up a very large file (>12G), I do not get the expected output. Although jar succeeds, the zip binary it creates does not have an End of Directory (EoD) record at all (as if ZipOutputStream.finish() was never called). I am able to extract the large file and verify its MD5, which is correct.

So I am doing this (data is 12G):

- md5sum data
- jar cvf data.jar data [wait a while, output is around 2.3G, return code is 0]
- bvi data.jar (look for EoD at end of jar file, magic 0x06054B50, or even the zip64 EoD locator/record signatures)

Not found! (bummer)

Extract:

- jar tvf data.jar -> I see the correct size, which means jar is reading the 64-bit sizes correctly; with earlier builds (<b55 I think) I would see -1.
- jar xvf data.jar
- md5sum data
- Matches original data

I noticed that after the deflate-compressed blocks, the file is padded with a lot of zeros (I originally thought it got truncated, but the extraction test above shows that is not the case).

This is on an x86-64 Fedora 13 system using yesterday's JDK7 build tree (I downloaded the build infrastructure and set it to download bundles during the build; I had no build failures).

Why, for very large files, does jar (java.util.zip) output a non-standard zip file, i.e. no EoD record and friends?

I have just begun to look at the actual code to see whether this is pilot error on my part or something else is afoot (my code calls zos.finish() explicitly, which has no effect; I am not sure yet where jar calls ZipOutputStream.finish()).

Thanks!

-aps
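For reference, the manual EoD check described above (paging through the end of the jar in bvi) could be automated along the following lines. This is only a sketch, not code from jar or java.util.zip: the class name EocdScan, the helpers findLast and readTail, and the size of the tail window are assumptions made for illustration. The signature values are those of the zip format: 0x06054B50 for the EoD record, 0x06064B50 for the zip64 EoD record and 0x07064B50 for the zip64 EoD locator, all stored little-endian on disk.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: scans the tail of a zip/jar file for end-of-directory signatures.
class EocdScan {
    static final int EOCD_SIG = 0x06054B50;          // end of central directory record
    static final int ZIP64_EOCD_SIG = 0x06064B50;    // zip64 end of central directory record
    static final int ZIP64_LOCATOR_SIG = 0x07064B50; // zip64 end of central directory locator

    // Returns the absolute file offset of the last occurrence of sig in tail, or -1.
    // tailStart is the file offset at which the tail buffer begins.
    static long findLast(byte[] tail, long tailStart, int sig) {
        for (int i = tail.length - 4; i >= 0; i--) {
            // Assemble a little-endian 32-bit value from four bytes.
            int v = (tail[i] & 0xFF)
                  | (tail[i + 1] & 0xFF) << 8
                  | (tail[i + 2] & 0xFF) << 16
                  | (tail[i + 3] & 0xFF) << 24;
            if (v == sig) {
                return tailStart + i;
            }
        }
        return -1;
    }

    // Reads at most maxBytes from the end of the file.
    static byte[] readTail(String path, int maxBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long len = raf.length();
            int n = (int) Math.min(len, (long) maxBytes);
            byte[] buf = new byte[n];
            raf.seek(len - n);
            raf.readFully(buf);
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        String path = args[0];
        // Assumed window: maximum comment (65535) + EoD record (22)
        // + zip64 locator (20) + minimal zip64 EoD record (56).
        int window = 65535 + 22 + 20 + 56;
        byte[] tail = readTail(path, window);
        long tailStart = new File(path).length() - tail.length;
        System.out.println("EoD record offset:       " + findLast(tail, tailStart, EOCD_SIG));
        System.out.println("zip64 EoD record offset: " + findLast(tail, tailStart, ZIP64_EOCD_SIG));
        System.out.println("zip64 locator offset:    " + findLast(tail, tailStart, ZIP64_LOCATOR_SIG));
    }
}
```

Run as "java EocdScan data.jar"; an offset of -1 means the signature was not found in the scanned window. A match is only a candidate, since the same four bytes can also occur by chance in compressed data or in the archive comment.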



More information about the core-libs-dev mailing list