Issue 14398: bz2.BZ2Decompressor.decompress fails on large files

Created on 2012-03-24 16:15 by Laurent.Gautier, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
testbz2.py nadeem.vawda, 2012-03-24 17:33
Messages (18)
msg156698 - (view) Author: Laurent Gautier (Laurent.Gautier) Date: 2012-03-24 16:15
The call ends with:
Objects/stringobject.c:3884: bad argument to internal function

sys.version: '2.7.2 (default, Jun 13 2011, 15:14:50) \n[GCC 4.4.5]' (on 64bit Linux)
msg156701 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-03-24 16:36
I can't reproduce this. Can you please provide a test script along with input data that allows us to reproduce this error?
msg156705 - (view) Author: Laurent Gautier (Laurent.Gautier) Date: 2012-03-24 16:45
Wow! Quick follow-up. The data file is about 1.6 GB. Is there a preferred way to pass it on? (I suspect that the bug tracker is not the preferred way.)

The code goes like:

    import bz2
    f = file("foobar.bz2", mode="rb")
    src_buf = f.read()
    decomp = bz2.BZ2Decompressor()
    tmp = decomp.decompress(src_buf)
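For readers hitting the same error, one way to avoid holding the whole decompressed result in a single buffer is to feed the decompressor in pieces and cap the output of each call. The following is only a sketch, not something proposed in this thread: it assumes Python 3.5 or later (where `BZ2Decompressor.decompress()` accepts `max_length` and exposes `needs_input`), the function name and both size parameters are hypothetical, and it handles a single bzip2 stream only.

```python
import bz2


def decompress_in_chunks(src, dst, chunk_size=1 << 20, max_out=1 << 20):
    """Stream one bzip2 stream from file-like `src` to file-like `dst`.

    Hypothetical helper: returns the number of decompressed bytes written.
    At most `max_out` bytes are produced per call, so no single output
    buffer grows large even for highly compressible input.
    """
    decomp = bz2.BZ2Decompressor()
    total = 0
    while not decomp.eof:
        if decomp.needs_input:
            chunk = src.read(chunk_size)
            if not chunk:
                raise EOFError("compressed stream ended before end-of-stream marker")
        else:
            # The decompressor still holds buffered input; ask for more output.
            chunk = b""
        out = decomp.decompress(chunk, max_length=max_out)
        dst.write(out)
        total += len(out)
    return total
```

Typical use would be `decompress_in_chunks(open("foobar.bz2", "rb"), open("foobar", "wb"))`, with both files closed by the caller.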
msg156709 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-03-24 17:33
I have been able to reproduce it; see attached script. It happens for inputs of 2GB (decompressed), but not for ones of 1GB. It seems that bz2module.c doesn't guard against 32-bit overflows when handling the size of the decompressed data. This affects both the BZ2Decompressor object's decompress() method, and the module-level decompress() function. All python versions prior to 3.3 are affected.
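A small illustration of why 2GB is the boundary may help here. The sketch below assumes the size ends up in a signed 32-bit C integer; the message only says "32-bit overflows", so the exact type is an assumption, and `as_int32` is a hypothetical helper name.

```python
import ctypes

GiB = 1 << 30


def as_int32(n):
    """Return n as it would read back from a signed 32-bit C integer."""
    # ctypes integer types do no overflow checking; the value simply wraps.
    return ctypes.c_int32(n).value


for size in (1 * GiB, 2 * GiB):
    print(size, "->", as_int32(size))
```

Under that assumption a 1GB size survives unchanged, while a 2GB size wraps to a negative number, which would be consistent with the "bad argument to internal function" error in the original report.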
msg156710 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-03-24 17:35
(the contents of the input file don't matter; I just pulled out a bunch of zeros from /dev/zero and compressed them with bzip2.)
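The reproduction described above can be approximated without a file on disk. This is a hedged sketch and not the attached testbz2.py (whose contents are not shown here); the function names and the `block` parameter are hypothetical, and it is written for Python 3.

```python
import bz2


def compressed_zeros(total_size, block=1 << 20):
    """Return bzip2-compressed data that expands to `total_size` zero bytes."""
    comp = bz2.BZ2Compressor()
    zeros = bytes(block)
    parts = []
    remaining = total_size
    while remaining > 0:
        n = min(block, remaining)
        parts.append(comp.compress(zeros[:n]))
        remaining -= n
    parts.append(comp.flush())
    return b"".join(parts)


def one_shot_size(total_size):
    """Decompress everything in a single call and return the output length."""
    payload = compressed_zeros(total_size)
    return len(bz2.BZ2Decompressor().decompress(payload))
```

Calling `one_shot_size(2 << 30)` exercises the 2GB case from this message; it needs more than 2GiB of free memory and takes a while, so try a small size first.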
msg156711 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-03-24 17:52
This should be fixed for 2.7.3. I'll have a patch ready in the next day or two.
msg156713 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-03-24 19:31
This isn't a regression, is it? If it's not, I don't think it's essential to get into 2.7.3.
msg156714 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-03-24 19:35
No, it's been around since at least 2.6. I wasn't really sure what the protocol was for bugs found during the RC process. It'd be nice to get a fix for this into 2.7.3 (and 3.2.3), but it's not urgent.
msg156715 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-03-24 19:37
Nadeem: the final release candidate of 2.7.3 was already made. Any further change would require another release candidate, which in turn would delay the release further. This has to wait for 2.7.4.
msg156717 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-03-24 19:38
That's fine by me, then. Sorry for the confusion.
msg173471 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-21 19:22
New changeset ebb8c7d79f52 by Nadeem Vawda in branch '3.2':
Issue #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/ebb8c7d79f52

New changeset 25fdf297c077 by Nadeem Vawda in branch '3.3':
Merge #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/25fdf297c077

New changeset d6bf506ea13f by Nadeem Vawda in branch 'default':
Merge #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/d6bf506ea13f
msg173479 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-21 20:20
What about 2.7?
msg173481 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-21 20:30
I'm working on it now. Will push in the next 15 minutes or so.
msg173483 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-21 21:09
New changeset f03a335621ce by Nadeem Vawda in branch '2.7':
Issue #14398: Fix size truncation and overflow bugs in bz2 module.
http://hg.python.org/cpython/rev/f03a335621ce
msg173484 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-21 21:12
All fixed, along with some other similar but harder-to-trigger bugs. Thanks for the bug report, Laurent!
msg187083 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-04-16 14:16
Why does only 2.7 have tests?
msg187298 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2013-04-18 21:40
An oversight on my part, I think. I'll add tests for 3.x this weekend.
msg187533 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2013-04-21 22:30
Hmm, so actually most of the bugs fixed in 2.7 and 3.2 weren't present in 3.3 and 3.4, and those versions already had tests equivalent to the tests I added for 2.7/3.2.

As for the changes that I did make to 3.3/3.4:
- two of the three cover cases that only occur if the output data is larger than ~32GiB. Even if we have a buildbot with enough memory for it (which I don't think we do), actually running such tests would take forever and then some.
- the third is for a condition that's actually pretty much impossible to trigger - grow_buffer() has to be called on a buffer that is already at least 8*((size_t)-1)/9 bytes long. On a 64-bit system this is astronomically large, while on a 32-bit system the OS will probably have reserved more than 1/9th of the virtual address space for itself, so it won't be possible to allocate a large enough buffer.
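The arithmetic behind the third case can be checked quickly. The sketch below computes the mathematical value of 8/9 of the largest size_t for 32-bit and 64-bit builds. It assumes grow_buffer() enlarges a buffer by roughly one eighth of its current size, which is what the 8/9 threshold suggests but is not stated in this message; both function names are hypothetical.

```python
def overflow_threshold(bits):
    """Mathematical value of 8/9 of the largest size_t for a given width."""
    size_max = (1 << bits) - 1
    return 8 * size_max // 9


def growth_wraps(size, bits):
    """True if size + size/8 no longer fits in a size_t of the given width.

    The one-eighth growth step is an assumption made for illustration.
    """
    size_max = (1 << bits) - 1
    return size + size // 8 > size_max


for bits in (32, 64):
    threshold = overflow_threshold(bits)
    print(bits, "bit:", threshold, "bytes, about", threshold / (1 << 30), "GiB")
```

On a 32-bit build the threshold is roughly 3.56 GiB out of a 4 GiB address space, which matches the remark that the OS would need to leave more than 8/9 of the address space available for a single buffer.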
History
Date User Action Args
2022-04-11 14:57:28 admin set github: 58606
2013-04-21 22:30:05 nadeem.vawda set status: open -> closed; messages: +
2013-04-18 21:40:44 nadeem.vawda set status: closed -> open; messages: +
2013-04-16 14:16:35 benjamin.peterson set messages: +
2012-10-21 21:12:54 nadeem.vawda set status: open -> closed; resolution: fixed; messages: +; stage: needs patch -> resolved
2012-10-21 21:09:42 python-dev set messages: +
2012-10-21 20:30:19 nadeem.vawda set messages: +
2012-10-21 20:20:08 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +; versions: + Python 3.3, Python 3.4
2012-10-21 19:22:09 python-dev set nosy: + python-dev; messages: +
2012-03-24 19:38:53 nadeem.vawda set messages: +
2012-03-24 19:37:11 loewis set messages: +
2012-03-24 19:35:51 nadeem.vawda set priority: release blocker -> normal; messages: +
2012-03-24 19:31:42 benjamin.peterson set messages: +
2012-03-24 17:52:25 nadeem.vawda set priority: normal -> release blocker; nosy: + georg.brandl, benjamin.peterson; messages: +
2012-03-24 17:39:16 nadeem.vawda set versions: + Python 3.2
2012-03-24 17:35:02 nadeem.vawda set messages: +
2012-03-24 17:33:46 nadeem.vawda set files: + testbz2.py; assignee: nadeem.vawda; components: + Extension Modules; nosy: + nadeem.vawda; messages: +; stage: needs patch
2012-03-24 16:45:34 Laurent.Gautier set messages: +
2012-03-24 16:36:13 loewis set nosy: + loewis; messages: +
2012-03-24 16:15:19 Laurent.Gautier create