[Python-Dev] urllib.quote and unquote - Unicode issues
Antoine Pitrou solipsis at pitrou.net
Wed Aug 6 18:55:51 CEST 2008
- Previous message: [Python-Dev] urllib.quote and unquote - Unicode issues
- Next message: [Python-Dev] urllib.quote and unquote - Unicode issues
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Martin v. Löwis <martin v.loewis.de> writes:
> URLs are just not made for non-ASCII characters.
Perhaps they are not, but every non-English wiki (to take a simple, generic example) potentially contains non-ASCII URLs, for example:
http://fr.wikipedia.org/wiki/%C3%89l%C3%A9phant
http://wiki.python.org/moin/J%C3%BCrgenHermann
(notice the UTF-8 encoding in both)
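To make the UTF-8 point concrete, here is a minimal sketch of decoding and re-encoding the path components of those two URLs. It assumes the Python 3 `urllib.parse` API (`quote` and `unquote` with an `encoding` argument), which is not necessarily the interface under discussion in this thread; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# Path components taken from the two wiki URLs above.
fr_path = "/wiki/%C3%89l%C3%A9phant"
moin_path = "/moin/J%C3%BCrgenHermann"

# Interpreting the percent-encoded bytes as UTF-8 gives readable names.
fr_decoded = unquote(fr_path, encoding="utf-8")
moin_decoded = unquote(moin_path, encoding="utf-8")

# Re-quoting with UTF-8 round-trips back to the original path.
fr_requoted = quote(fr_decoded, encoding="utf-8")

# Assuming the wrong charset (Latin-1 here) silently produces mojibake,
# which is why the choice of encoding matters.
moin_latin1 = unquote(moin_path, encoding="latin-1")

print(fr_decoded)
print(moin_decoded)
print(fr_requoted)
print(moin_latin1)
```

Nothing in the URL itself says which charset the percent-encoded bytes use, so the decoder has to assume one; the sketch only shows that UTF-8 is the assumption that works for these two real-world examples.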
> Implement IRIs if you want non-ASCII characters; the rules are much clearer for these.
I think most people would expect something which works with the current World Wide Web rather than a rigorous implementation of a specific RFC. Implementing RFCs is fine but it does not magically eliminate all problems, especially when the RFCs themselves are not in sync with real-world usage.
Regards
Antoine.