Message 62787 - Python tracker
Adding caching, If-Modified-Since, etc. is a good solution, but I quail in fear at the prospect of expanding the saxutils resolver into a fully caching HTTP agent that shares a cache across processes. We should really be encouraging people to use more capable libraries such as httplib2 (http://code.google.com/p/httplib2/), but this is slightly at war with the batteries-included philosophy.
So, I propose we:
- add warnings to the urllib, urllib2, and saxutils module docs that parsing can retrieve arbitrary resources over the network, and encourage the user to use a smarter library such as httplib2.
- update the urllib2 HOWTO to mention this.
I'm willing to do the necessary writing.
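
To make the proposed warning concrete, here is a minimal sketch of how a caller might keep xml.sax off the network when a document names an external DTD. It is written for Python 3 and is an illustration only, not text from the docs or the fix adopted for this issue. The names RecordingResolver, TextCollector, parse_offline and SAMPLE are hypothetical; feature_external_ges, EntityResolver and InputSource are the standard xml.sax names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical sample input: a document whose DOCTYPE points at a remote DTD.
SAMPLE = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    b'  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html><body><p>hello</p></body></html>'
)


class RecordingResolver(EntityResolver):
    """Hypothetical helper: note each external entity asked for, fetch none."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        # Hand back an empty stream so the parser has nothing to download.
        empty = InputSource(systemId)
        empty.setByteStream(io.BytesIO(b""))
        return empty


class TextCollector(ContentHandler):
    """Hypothetical helper: gather character data seen during the parse."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_offline(data):
    """Parse bytes with external general entities switched off."""
    parser = xml.sax.make_parser()
    # With this feature off, the expat-based reader does not resolve
    # external entities, including the external DTD subset.
    parser.setFeature(feature_external_ges, False)
    # Safety net (an assumption of this sketch): if the feature were ever
    # enabled, the resolver would still return empty content.
    resolver = RecordingResolver()
    parser.setEntityResolver(resolver)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks).strip(), resolver.requested


if __name__ == "__main__":
    print(parse_offline(SAMPLE))
```

The trade-off is that entities defined only in the external DTD (for example &nbsp; in XHTML) are reported as skipped rather than expanded, so this fits callers who need the document structure and not full DTD processing.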