Issue 7540: urllib2 request does not update content length after new add_data (original) (raw)

Created on 2009-12-18 15:24 by till, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
bad_fix.diff pablomouzo,2009-12-27 17:18 This fix breaks the tests
Messages (16)
msg96567 - (view) Author: Till Maas (till) Date: 2009-12-18 15:23
When I try to reuse a urllib2.Request object with different post data, the data itself is updated, but the content length is not. Here is a simple script to reproduce it on Python 2.5 on Fedora 10 and 2.6 on Arch Linux: #!/usr/bin/python # vim: fileencoding=utf8 # test with: echo | socat - tcp4-listen:1111,fork # Demonstrates bad content length of second request, which should be 2 import urllib2 req = urllib2.Request('http://localhost:1111') req.add_data("1") urllib2.urlopen(req) req.add_data("10") urllib2.urlopen(req)
msg96915 - (view) Author: Pablo Mouzo (pablomouzo) Date: 2009-12-27 17:18
The problem here is that the headers are not updated if they already exists. The solution is quite simple but breaks the tests because it "clobbers the existing headers". You can do this: ... req.add_data(some_data) req.add_unredirected_header('Content-Length', len(some_data)) urllib2.urlopen(req) ... But is risky because all the other headers are still outdated. Is there any reason why you need to reuse the request object?
msg96920 - (view) Author: Till Maas (till) Date: 2009-12-27 20:27
I do not need to reuse a request object, but I did in a script when only the data was different for each request. If this is not meant to be done, then any not meant to be done modification should somehow create an error, when it is done, instead of silently performing broken HTTP requests.
msg100067 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-02-24 20:56
Reusing a request object and doing add_data is a bad idea. Made changes to raise TypeError when you try to that. Fix committed in revision 78431.
msg100068 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-02-24 21:05
merged into other branches r78432, r78433, r78434.
msg100353 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-03 22:03
This change breaks existing uses of Python 2.6.4. The mechanize library frequently re-initializes the data in the request without re-using the request. Applications (including tests) that use mechanize now break with this TypeError. The proper response to this issue for Python 2.6 is to make no code changes (though a documentation enhancement may be in order). This change should be reverted from all branches. Python 2.6.5 should be blocked until this is done.
msg100354 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-03-03 22:08
Since this is a regression from 2.6.4, it's a release blocker for 2.6.5 and probably will require an rc2.
msg100356 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-03 22:10
To clarify: Multiple calls to add_data on a urllib2 request, when the request isn't being reused, are in no way invalidated by the problem initially reported.
msg100374 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-04 04:03
fdrake and barry: Shall I revert the changes made in the release26-maint branches then? I documentation fix in those branches may be sufficient with warning in the documentation.
msg100375 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-04 06:18
I'd like to hear at least one other say-so (which can be yours if you agree this is the right thing), so there's more recognized consensus on the matter. We also need an explicit go-ahead from Barry as the release manager. At this point, the community sees only my assertion that this is a regression from 2.6.4. I'd be fine with a documentation improvement going in as well (as a separate commit), but that's not a requirement to deal with the release block. Before the release, it would require a separate go-ahead from the release manager.
msg100385 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-03-04 12:30
Fred, I think your report of observed breakage is sufficient to warrant rolling back the change, especially since this isn't an actual bug fix. That is, no code that was failing to work before would be enabled to work by this fix. (Rather, it aims to prevent broken code from being written...which it could be argued is inappropriate for a backport anyway, though you could argue it either way.)
msg100387 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-03-04 14:14
Please roll this back for 2.6.5rc2. Thanks.
msg100389 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-04 15:20
Reverted the changes in the revision 78651 No longer a release blocker for rc2.
msg100408 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-04 19:14
Reopening; per discussion on IRC, the change needs to be reverted on the other three branches to which it was applied. If code changes are needed to make unsupported usage fail early, they need to be considered carefully and only applied as a new feature.
msg100561 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-07 04:39
Change is reverted from other branches too. Discussed in IRC, that changes made to an existing API like add_data,may break some existing applications, even if its done to prevent unsupported usage. Specifically, in mechanize's case, it was not reusing req object, but using add_data method more than once before req was getting submitted. This is under scenarios wherein code as in zope.testbrowser) is centered on the browser controls; any twiddle to a control could update the req object before its submitted. This could be changed in mechanize, but in general agreed to fact that changes to existing API should not result in breaking of apps. We can address the non-reusablity of request by adding a boolean that signifies if a request has been used and throw an exception if one is used a second time ( benji's suggestion). And of course clarify the supported and intended scenarios via documentation.
msg100785 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-10 13:56
The reverts in SVN look good to me; re-closing this issue as no further action is required. If there's a proposal for specific changes to urllib2 to improve diagnostics in unsupported cases, that should be detailed in a separate issue. Marking this issue "invalid" since the original reported symptom was determined to be caused by an unsupported use (re-use of the request).
History
Date User Action Args
2022-04-11 14:56:55 admin set github: 51789
2010-04-27 20:33:34 loewis set priority: normal
2010-03-10 13:56:24 fdrake set status: open -> closedresolution: accepted -> not a bugmessages: +
2010-03-07 04:39:12 orsenthil set resolution: fixed -> acceptedmessages: +
2010-03-04 19:14:45 fdrake set status: closed -> openmessages: +
2010-03-04 15:20:48 orsenthil set status: open -> closedpriority: release blocker -> (no value)resolution: accepted -> fixedmessages: +
2010-03-04 14:14:55 barry set resolution: fixed -> acceptedmessages: +
2010-03-04 12:30:41 r.david.murray set nosy: + r.david.murraymessages: +
2010-03-04 06🔞07 fdrake set messages: +
2010-03-04 04:03:29 orsenthil set messages: +
2010-03-03 22:10:56 fdrake set messages: +
2010-03-03 22:08:50 barry set priority: release blockernosy: + barrymessages: +
2010-03-03 22:03:48 fdrake set status: closed -> opennosy: + fdrakemessages: +
2010-02-24 21:05:45 orsenthil set status: open -> closedmessages: +
2010-02-24 20:56:54 orsenthil set resolution: fixedmessages: +
2009-12-27 20:27:30 till set messages: +
2009-12-27 17🔞18 pablomouzo set files: + bad_fix.diffnosy: + pablomouzomessages: + keywords: + patch
2009-12-25 07:22:44 nirai set nosy: + nirai
2009-12-21 03:07:15 orsenthil set assignee: orsenthilnosy: + orsenthil
2009-12-18 15:24:00 till create