Issue 1643370: recursive urlparse - Python tracker

Created on 2007-01-24 10:23 by techtonik, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg61268	Author: anatoly techtonik (techtonik)	Date: 2007-01-24 10:23
The urlparse module is incomplete. There is no convenient high-level function to parse a URL down into atomic chunks, urldecode the query and bring it into an array (or a dictionary, for that matter), so that you can modify that dictionary and reassemble it into a query again using nothing more than simple array manipulations. This kind of function is as universal and flexible as the low-level API, but by comparison it considerably speeds up development when working with URLs.

I propose a urlparseex(urlstring) function that will dissect the URL into a dictionary of appropriate dictionaries or strings and decode all % entities:

scheme      0     string
netloc      1     dictionary
  username  1.1   string or whatever
  password  1.2   string or whatever
  server    1.3   hostname string
  port      1.4   port integer
path        2     string
params      3     ordered dictionary of path components for the sake of reassembling them later (sorry, I have little pythons in my head to replace "ordered dictionary" with something more appropriate), where the respective path part entry is also a dictionary of parameters
query       4     dictionary
fragment    5     string

There must also be a counterpart, urlunparseex(dictionary), to reassemble the URL and re-encode entities.

Reasons behind the decision:
- 90% of the time you need to decode % entities - this must be done by default (those who need to keep them encoded are in the minority and may use other functions)
- an atomic recursive format is needed to be able to easily change any URL component and reassemble it back
- get a simple Swiss-army knife for high-level (read: logical) URL operations in one module

http://docs.python.org/lib/module-urlparse.html

There is also this proposal below. It is a little bit different, but shows that after four years URL handling problems are still relevant.
http://sourceforge.net/tracker/index.php?func=detail&aid=600362&group_id=5470&atid=355470
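For illustration only, the proposal above could be sketched roughly as follows. This is not code from the issue and not part of the standard library: the names urlparseex and urlunparseex come from the proposal, but the bodies are an assumption built on Python 3's urllib.parse (the thread itself refers to the Python 2 urlparse module). The sketch leaves out the "params" entry (per-segment path parameters) and does not handle IPv6 literals or preserve hostname case.

```python
from urllib.parse import (
    parse_qs, quote, unquote, urlencode, urlsplit, urlunsplit,
)


def urlparseex(url):
    """Hypothetical helper: split a URL into nested, percent-decoded parts."""
    parts = urlsplit(url)

    def decode(value):
        # Leave missing components as None instead of decoding them.
        return unquote(value) if value is not None else None

    return {
        "scheme": parts.scheme,
        "netloc": {
            "username": decode(parts.username),
            "password": decode(parts.password),
            "server": parts.hostname,   # lower-cased by urlsplit
            "port": parts.port,         # int or None
        },
        "path": unquote(parts.path),
        # parse_qs decodes % escapes and groups repeated keys into lists.
        "query": parse_qs(parts.query, keep_blank_values=True),
        "fragment": unquote(parts.fragment),
    }


def urlunparseex(parts):
    """Hypothetical counterpart: re-encode and reassemble the dictionary."""
    net = parts["netloc"]
    netloc = net["server"] or ""
    if net["port"] is not None:
        netloc = f"{netloc}:{net['port']}"
    if net["username"] is not None:
        userinfo = quote(net["username"], safe="")
        if net["password"] is not None:
            userinfo += ":" + quote(net["password"], safe="")
        netloc = f"{userinfo}@{netloc}"
    query = urlencode(parts["query"], doseq=True)
    return urlunsplit((
        parts["scheme"],
        netloc,
        quote(parts["path"]),               # keeps "/" unescaped
        query,
        quote(parts["fragment"], safe=""),
    ))


# Usage: decode, change one query value, reassemble.
example = urlparseex("http://user:p%40ss@example.com:8080/a%20b/c?x=1&y=2&y=3#frag")
example["query"]["x"] = ["9"]
rebuilt = urlunparseex(example)
```

Decoding everything up front loses information (for example, an escaped "/" inside a path segment becomes indistinguishable from a separator), which is one reason a real implementation would likely need a richer model than plain dictionaries.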
msg99424	Author: anatoly techtonik (techtonik)	Date: 2010-02-16 17:41
The last SF link is issue 600362
msg108959	Author: Senthil Kumaran (orsenthil) *	Date: 2010-06-30 02:47
This is already handled via the namedtuple returned by urlparse. All the parts of the URL are available after parsing.
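As a small illustration of the namedtuple interface mentioned here, the following uses Python 3's urllib.parse.urlparse (an assumption; in the Python 2 versions discussed in this thread the same function lives in the urlparse module). The example URL is made up.

```python
from urllib.parse import urlparse

result = urlparse("http://user:secret@example.com:8080/path;type=a?q=1#top")

# The six top-level components are available by name or by index.
scheme = result.scheme        # same value as result[0]
netloc = result.netloc        # same value as result[1], still one string
path = result.path
params = result.params
query = result.query
fragment = result.fragment

# Extra read-only attributes split the network location further.
username = result.username
password = result.password
hostname = result.hostname
port = result.port            # an int, or None when absent
```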
msg108962	Author: anatoly techtonik (techtonik)	Date: 2010-06-30 03:44
Senthil, please read the proposals more attentively. From the docs of urlparse at http://docs.python.org/library/urlparse.html "The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded."
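The documented behaviour quoted above can be seen in a short sketch, again assuming Python 3's urllib.parse rather than the Python 2 urlparse module under discussion. The URL is a made-up example; unquote and parse_qs are the existing standard-library functions a caller has to apply separately.

```python
from urllib.parse import parse_qs, unquote, urlsplit

parts = urlsplit("http://example.com/caf%C3%A9?name=J%C3%BCrgen&tag=a&tag=b")

# The split result keeps % escapes exactly as they appeared in the URL.
raw_path = parts.path
raw_query = parts.query

# Decoding is a separate, explicit step for each component.
decoded_path = unquote(raw_path)
decoded_query = parse_qs(raw_query)
```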
msg108972	Author: R. David Murray (r.david.murray) *	Date: 2010-06-30 10:58
Since no patch has been proposed since 2007, I think it is time to close this feature request for lack of interest. In any case I think this functionality would be better situated in a Python3 URI/IRI parsing module with a full object model for the IRI, which is something complicated enough that it may need some time on PyPI before getting into the standard library.
msg109064	Author: Senthil Kumaran (orsenthil) *	Date: 2010-07-01 18:19
David, is the stage "unit test needed" proper for this, or was it set by mistake? Anatoly, I thought closing this feature request was fine, because I considered that with namedtuple the desired attributes of URLs were obtained as a ParsedTuple object (check test_urlsplit_attributes in test_urlparse.py). But as you pointed out, I can see that the docs can be improved further. Your suggested dictionary approach is a bit different from the way it is currently implemented; a patch might have helped for evaluation.
msg109073	Author: anatoly techtonik (techtonik)	Date: 2010-07-01 20:58
Too bad that requests from users who are not eligible to produce a patch are not accepted by the Python "community". =/
msg109076	Author: Georg Brandl (georg.brandl) *	Date: 2010-07-01 21:31
Why shouldn't you be eligible to produce patches to Python? And yes, requests without patches will sometimes take longer, or be evaluated differently, since we're all volunteers here, and an existing patch, even if unusable in the submitted form, often makes working on a request much more straightforward. Regarding your ironic quoting of the word "community" -- do not forget that you are part of the community, and what we are doing here is exactly what a community does as compared to a company: helping each other, not because of payment, but because we care for what we do. Please do not subvert that commitment.
msg109078	Author: R. David Murray (r.david.murray) *	Date: 2010-07-01 21:37
Anatoly, when I said I was closing the issue for lack of interest, I meant that you had not produced a candidate patch, and no one else had shown any interest in creating one. If you wish to produce a candidate patch we can reopen the issue (though I do think a full-blown URI/IRI module would be better).

History
Date	User	Action	Args
2022-04-11 14:56:22	admin	set	github: 44500
2010-07-01 21:37:29	r.david.murray	set	messages: +
2010-07-01 21:33:36	eric.araujo	set	nosy: + eric.araujo
2010-07-01 21:31:49	georg.brandl	set	nosy: + georg.brandl; messages: +
2010-07-01 20:58:53	techtonik	set	messages: +
2010-07-01 18:19:24	orsenthil	set	messages: +
2010-06-30 10:58:31	r.david.murray	set	status: open -> closed; nosy: + r.david.murray; messages: +; resolution: out of date ->; stage: resolved -> test needed
2010-06-30 03:44:34	techtonik	set	status: closed -> open; messages: +
2010-06-30 02:47:38	orsenthil	set	status: open -> closed; resolution: out of date; messages: +; stage: test needed -> resolved
2010-02-16 17:41:18	techtonik	set	messages: +
2009-05-01 11:04:57	orsenthil	set	nosy: + orsenthil
2009-04-22 17:25:38	ajaksu2	set	keywords: + easy
2009-02-13 01:36:42	ajaksu2	set	nosy: + jjlee; stage: test needed; versions: + Python 3.1, Python 2.7, - Python 2.6
2007-01-24 10:23:00	techtonik	create