Issue 10050: urllib.request still has old 2.x urllib primitives (original) (raw)

process

Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: akira, eric.araujo, ezio.melotti, jhylton, mcjeff, nadeem.vawda, neologix, orsenthil, pitrou, python-dev, techtonik
Priority: normal Keywords: patch

Created on 2010-10-08 11:14 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue10050.patch mcjeff,2011-03-18 22:57 review
issue10050.patch mcjeff,2011-03-19 01:18 Second Revision review
issue10050.patch mcjeff,2011-03-19 01:47 Third Update - Change Synopsis review
issue10050.patch mcjeff,2011-03-20 03:03 Applied Antoine's suggestions review
issue10050-deprecation.patch orsenthil,2012-03-14 06:33 review
Messages (44)
msg118182 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-08 11:14
3.x urllib.request still has a lot of stuff inherited from 2.x urllib: URLopener, FancyURLopener, the urlretrieve implementation (perhaps others?). Ideally, urlretrieve should be reimplemented using urlopen or build_opener, and the other stuff deprecated.
msg131369 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-18 22:57
Alright, attaching a patch that reworks urlretrieve to use urlopen internal to urllib.request. 1. I dropped the local caching as it isn't turned on by default anyway (and isn't really documented). 2. Updated documentation to reflect caching changes & make urlretrieve part of the official API again. 3. Kept the urlcleanup function, but use a global list to track temporary files. I'd be happy to change this functionality if that makes sense. 4. After moving the urlretrieve stuff out of test_urllibnet, I realized that file didn't serve much of a purpose any longer, so I just removed it. 5. Updated NEWS. I'd be happy to rework any of this in order to bring it up to stuff. Comments and suggestions are very much welcomed.
msg131371 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-18 23:48
Follow the “review” link next to the patch for an initial review.
msg131373 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-19 00:21
Significant patch. Thanks. I looked at the review too. For the Context Manager part, don't club it along with this one. We need to test this thoroughly, after this is in shape, that can be addressed.
msg131375 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-19 01:18
Made recommended changes. Moved to NamedTemporaryFile. I don't think the spooled file makes sense here as the existing protocol provides a filename in the returned tuple, not a f.l.o. As far as the description? Here are a couple suggestions: 1. URL Retrieval Library 2. URL Access Module Updated the module documentation as well as the howto.
msg131376 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-19 01:21
Changing the description is a minor update. The term 'URL access module' seems fine.
msg131380 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-19 01:47
Made requested change to Synopsis/Description.
msg131411 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-19 13:13
I don't think the tests should be moved from one file to the other. It's making more difficult to tell whether you have changed them or not. I think moving the tests (as well as changing the synopsis, hello Eric) are cosmetic changes that are better done in separate changesets. Some other things: - please do the "import tempfile" at the top-level. Imports from functions are generally frown upon, unless otherwise necessary. - instead of "try ... finally: tfp.close()", you can simply write "with tfp: ..." - I don't understand why `size` is read only when a reporthook is set, while it is always used for raising ContentTooShortError - I'm not sure why the reporthook is called with `bs` rather than `len(block)`. I think the user is more interested in the actual number of bytes. - you don't need to update Misc/NEWS yourself
msg131419 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-19 14:16
I'll make those changes, sure. I had the same thought re: block size, but I was trying to keep inline with what the current function did.
msg131471 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-20 03:03
Take four! Includes Antoine's suggestions. I changed the callback to return (block num, read size, file size) as opposed to (block num, block size, file size) as this seems to make more sense. I appreciate the back and forth. I'd be happy to create issues & handle the other things that have been moved out of this patch.
msg131476 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-20 04:30
Thanks again. Agree to use of returning the block size to the returnhook callable. But this is going be a deviation from the existing behavior which was present to inform the user at regular intervals and did not bother to provide any 'progressive size' related data, so this change should documented as new behavior. With respect to the docs patch in urllib2 howto I would do something like this. with open(local_filename) as fh: content = fh.read() Or *better* would demonstrate the filename argument passing and reporthook functionality. If you wish, take this part alone, (the howto document) update as separate patch, which should be committed once this is in. There is one comment which I forgot to mention earlier. The current urlretrieve function is internally calling the URLOpener's retrieve method. Those URLOpener class might need a DeprecationWarning while address thing bug and that particular retrieve method can be updated to use the updated facility just as bonus till the time it survives. Antoine - any suggestions on the last point?
msg131488 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-20 12:10
> There is one comment which I forgot to mention earlier. > > The current urlretrieve function is internally calling the URLOpener's > retrieve method. > > Those URLOpener class might need a DeprecationWarning while address > thing bug and that particular retrieve method can be updated to use > the updated facility just as bonus till the time it survives. > > Antoine - any suggestions on the last point? Can you clarify the issue?
msg131490 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-20 12:51
Antoine Pitrou wrote: > Can you clarify the issue? URLopener which is an old class from the merge of urllib and urllib2 and it can be slowly and safely removed. If we go this line, then I assume it has to have a DeprecationWarning before we remove it. Should we or not? My thought is, in Python 3.x these may not be in use, but still if we remove without DeprecationWarning, it could raise concerns with some folks (given some discussion about this recently when people trying to upgrade from 2.x to 3.x). At the moment, urlretrieve function is calling URLopener.retrieve internally. But this patch addresses urlretrive function at much higher level and leaves the URLopener.retrieve in a hanging state. So, I was thinking how to handle this scenario. Change URLopener.retrieve also with the cleaner and modern code, but and add a DeprecationWarning in those.
msg131491 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-20 13:06
> URLopener which is an old class from the merge of urllib and urllib2 > and it can be slowly and safely removed. If we go this line, then I > assume it has to have a DeprecationWarning before we remove it. Should > we or not? Yes, we should. > Change URLopener.retrieve also with the cleaner and modern code, but > and add a DeprecationWarning in those. Sounds overkill and of questionable interest. Honestly, I don't think URLopener.retrieve() has much point anyway. Perhaps it would have if the whole caching thing had been implemented.
msg131505 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-20 16:21
I'm not exactly sure what the steps are with respect to the DeprecationWarning. Is the common case just to raise the warning in the __init__ method? Are there related documentation changes? Thanks again! Learning a ton. Hopefully the next patch I submit will go much smoother.
msg131511 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-20 16:52
Antoine: you’re right about not mixing concerns, sorry. Regarding deprecation, we should be cautious and maybe plan it across more than two versions. See thread starting at http://mail.python.org/pipermail/python-dev/2011-March/109450.html
msg131718 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-22 01:36
Just wanted to check so this doesn't sit with people waiting on me. Is there anything else I need/should do to this patch? Little unclear on how to handle the deprecation process.
msg131719 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-22 01:42
No, it is not waiting for you, in fact it is waiting for me to commit the portion which you have already contributed with some additional code (DeprecationWarning). But this bug does not end up here as there are other primitives which may need cleanup and have DeprecationWarnings. Those needs to be addressed as well, either as part of this issue or as separate issues.
msg132617 - (view) Author: Jeff McNeil (mcjeff) * Date: 2011-03-30 22:47
I'd be happy to pick some of that stuff up. I'd like to address separately as it keeps fewer concerns in this one patch. I'll grab them once they're created.
msg132690 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-31 17:57
Feel free to create them and set orsenthil as assignee.
msg155709 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-14 02:29
New changeset 53715804dc71 by Senthil Kumaran in branch 'default': Issue10050 - urlretrieve uses newer urlopen. reporthook of urlretrieve takes, block number, block read size, file_size http://hg.python.org/cpython/rev/53715804dc71
msg155710 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-03-14 02:32
One at the time - urlretrieve is re-written. Now should add deprecation warnings to things methods to be deprecated / removed.
msg155712 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-03-14 02:49
I'd be happy to do that (URLopener, to begin with), though I'm not familiar with the usual approach. Is it simply a matter of warning in __init__?
msg155715 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-03-14 03:04
Yes, it is simply a matter of adding warnings.warn in proper classes and methods. I have started on that, Jeff. No problem. I just wanted the changes to be two separate commits. Thanks!
msg155725 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-03-14 06:33
Here is patch which adds DeprecationWarning to URLOpener and (it's subclasses FancyURLopener) as well as some other Request methods. The idea would be, in the next release we remove this old URLOpener and it's subclasses and remove (or make _private) the Request methods. Added tests for DeprecationWarnings too. I am trying to see if more classes/methods from old urllib can be deprecated/removed. But this first patch is for no-doubt ones. If one have any review comments, please provide. Otherwise I shall check it in 3.3.a1. Thanks!
msg155782 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-03-14 19:06
I don't see why we'd need to make these _private -- they're just accessors/mutators for the most part. I'd be happy to help clean this up if you need it.
msg155787 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-03-14 20:37
My concern was with the method get_full_url. I have seen it being useful in more than one place and it's not just the accessor. Others are just the accessors and those can be safely removed. When get_full_url gives the correct url, taking into consideration the url fragment, I think that should stay. Other's can be cleaned up. Note that in 3.3, we should just be deprecating and as soon as 3.3 is out, we can remove these. Thanks, Senthil
msg155788 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-14 20:44
New changeset eab274c7d456 by Senthil Kumaran in branch 'default': deprecated the old urllib primitives in 3.3 urllib package - issue 10050 http://hg.python.org/cpython/rev/eab274c7d456
msg155792 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-14 21:00
New changeset 30f13d7fecd0 by Senthil Kumaran in branch 'default': Fix the buildbot breakdown - issue 10050 http://hg.python.org/cpython/rev/30f13d7fecd0
msg156741 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-03-25 08:34
Since eab274c7d456, all the buildbots are randomly failing, e.g. """ ====================================================================== FAIL: test_method_deprecations (test.test_urllib2.OpenerDirectorTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/buildbot/buildarea/3.x.parc-snowleopard-1/build/Lib/test/test_urllib2.py", line 622, in test_method_deprecations req.add_data("data") File "/Users/buildbot/buildarea/3.x.parc-snowleopard-1/build/Lib/contextlib.py", line 54, in __exit__ next(self.gen) File "/Users/buildbot/buildarea/3.x.parc-snowleopard-1/build/Lib/test/support.py", line 766, in _filterwarnings missing[0]) AssertionError: filter ('', DeprecationWarning) did not catch any warning """ http://python.org/dev/buildbot/all/builders/AMD64%20Snow%20Leopard%202%203.x/builds/231/steps/test/logs/stdio
msg156747 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-03-25 09:23
It's funny, Antoine had raised an issue too. Should see why is this sporadic failures..
msg156759 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-03-25 16:39
I was looking at a somewhat unrelated issue and I bumped up against a similar situation with the warnings system. I didn't look too much into it, but it appeared that warnings didn't get added to __warningregistry__ correctly. Though, when I set the stack level explicitly (to an incorrect value for the issue at hand), the warnings were caught by catch_warnings. I don't know if it is related or not, but my assumption at the time was that smarter people than I had vetted warnings. =)
msg156763 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-03-25 17:33
Disregard. That was in concert with ntpath, which uses a funky approach to testing.
msg172843 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-14 02:16
Are we still planning on removing URLopener and FancyURLopener in 3.4? The documentation for 3.3 does not list these classes as deprecated.
msg172846 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-14 03:41
Only the classes which are marked as deprecated are allowed removed in the next release. So the ones which we marked as deprecated in 3.3 can safely go in 3.4. URLopener and FancyURLopener AFAIR had some dependency/ usage, deprecating those may require some refactoring of the existing code.
msg172874 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-14 11:13
Hmm, OK. URLopener and FancyURLopener do each issue a DeprecationWarning when used, though. If they are not actually deprecated, perhaps we should remove the warnings for the moment?
msg172879 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-14 13:00
See also #12707.
msg174869 - (view) Author: Akira Li (akira) * Date: 2012-11-05 07:15
Related issue 16409
msg174887 - (view) Author: anatoly techtonik (techtonik) Date: 2012-11-05 10:32
Please note that changes to urlretrieve to return block size 0 in callback breaks existing PyPI apps, such as http://pypi.python.org/pypi/wget It would be nice if you revert this part from http://hg.python.org/cpython/rev/53715804dc71 Also if you change semantics of "block size" (which is assumed to be fixed) to "data read size" then the "count of blocks transferred" losses any sense. Changed callback semantics without renaming function or introducing a different type of callback doesn't add clarity to the code that should be maintained for both Python 2 and Python 3 version. Please add issue #16409 as affected by this bug.
msg175301 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-10 23:13
New changeset e19a6194aee4 by Gregory P. Smith in branch '3.3': Fix test_urllib broken by my previous commits. The assumptions it was http://hg.python.org/cpython/rev/e19a6194aee4 New changeset dcf9a07830a6 by Gregory P. Smith in branch 'default': Fix test_urllib broken by my previous commits. The assumptions it was http://hg.python.org/cpython/rev/dcf9a07830a6
msg175314 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-11-11 04:31
Reverting of the len(block) back to 'bs' here aside, does it even make sense to include block information at all? That's the attempted read size, so it might not be an accurate representation of the size of the data actually read. Thus the reason for the initial change.
msg175317 - (view) Author: Jeff McNeil (mcjeff) * Date: 2012-11-11 04:47
Ah, disregard. I followed up on the other bug. The legacy interface indeed should have stayed consistant.
msg184556 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-18 23:42
I think, this issue should have been closed. Or at least be closed now. 1. urlretrieve - change was done. 2. Re: - DeprecationWarnings on URLopener have been in place since 3.3. Also in the documentation in the Legacy Interface section, it is mentioned that "They might become deprecated at some point in the future." - Adding an Explicit deprecation in 3.3 is okay. It can even be done now. 3. All other old primitives have been deprecated/documented to be removed.
msg184563 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-19 00:07
New changeset ea76cfff5851 by Senthil Kumaran in branch '3.3': #10050 - Document DeprecationWarnings for URLopener and FancyURLopener ( ) http://hg.python.org/cpython/rev/ea76cfff5851 New changeset d4cbd9e2e1cb by Senthil Kumaran in branch 'default': #10050 : merge to default http://hg.python.org/cpython/rev/d4cbd9e2e1cb
History
Date User Action Args
2022-04-11 14:57:07 admin set github: 54259
2013-03-19 00:08:18 orsenthil set status: open -> closedresolution: fixed
2013-03-19 00:07:38 python-dev set messages: +
2013-03-18 23:42:22 orsenthil set messages: +
2012-11-11 04:47:44 mcjeff set messages: +
2012-11-11 04:31:48 mcjeff set messages: +
2012-11-10 23:13:36 python-dev set messages: +
2012-11-05 10:32:36 techtonik set versions: + Python 3.4
2012-11-05 10:32:22 techtonik set nosy: + techtonikmessages: +
2012-11-05 07:15:55 akira set nosy: + akiramessages: +
2012-10-14 13:00:07 ezio.melotti set messages: +
2012-10-14 11:13:12 nadeem.vawda set messages: +
2012-10-14 03:41:10 orsenthil set messages: +
2012-10-14 02:16:27 nadeem.vawda set messages: +
2012-03-25 17:33:42 mcjeff set messages: +
2012-03-25 16:39:21 mcjeff set messages: +
2012-03-25 09:23:45 orsenthil set messages: +
2012-03-25 08:34:35 neologix set nosy: + neologixmessages: +
2012-03-14 21:00:45 python-dev set messages: +
2012-03-14 20:44:04 python-dev set messages: +
2012-03-14 20:37:00 orsenthil set messages: +
2012-03-14 19:06:54 mcjeff set messages: +
2012-03-14 06:33:51 orsenthil set files: + issue10050-deprecation.patchmessages: +
2012-03-14 03:04:31 orsenthil set messages: +
2012-03-14 02:49:37 mcjeff set nosy: + mcjeffmessages: +
2012-03-14 02:32:29 orsenthil set messages: +
2012-03-14 02:29:46 python-dev set nosy: + python-devmessages: +
2012-03-13 08:57:54 mcjeff set nosy: - mcjeff
2011-08-07 23:54:48 ezio.melotti set nosy: + ezio.melotti
2011-03-31 17:57:31 eric.araujo set messages: +
2011-03-30 22:47:56 mcjeff set messages: +
2011-03-22 01:42:38 orsenthil set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-22 01:36:44 mcjeff set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 16:52:54 eric.araujo set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 16:21:14 mcjeff set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 13:06:35 pitrou set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 12:51:40 orsenthil set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 12:10:13 pitrou set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 04:30:43 orsenthil set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-20 03:03:12 mcjeff set files: + issue10050.patchnosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 14:16:50 mcjeff set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 13:13:08 pitrou set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 01:47:29 mcjeff set files: + issue10050.patchnosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 01:21:31 orsenthil set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 01🔞14 mcjeff set files: + issue10050.patchnosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-19 00:21:13 orsenthil set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-18 23:48:07 eric.araujo set nosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeffmessages: +
2011-03-18 22:57:49 mcjeff set files: + issue10050.patchmessages: + keywords: + patchnosy:jhylton, orsenthil, pitrou, nadeem.vawda, eric.araujo, mcjeff
2011-03-18 17:07:48 nadeem.vawda set nosy: + nadeem.vawda
2011-03-18 02:43:40 mcjeff set nosy: + mcjeff
2010-11-01 21:01:45 eric.araujo set nosy: + eric.araujo
2010-10-08 11:14:29 pitrou create