[Python-Dev] urllib unicode handling

Robert Brewer fumanchu at aminus.org
Wed May 7 17:55:34 CEST 2008


"Martin v. Löwis" wrote:

> The proper way to implement this would be IRIs (RFC 3987), in
> particular section 3.1. This is not as simple as just encoding it as
> UTF-8, as you might have to apply IDNA to the host part.
>
> Code doing so just hasn't been contributed yet.

But if someone wanted to do so, it's pretty simple:

    >>> u'www.\u212bngstr\xf6m.com'.encode("idna")
    'www.xn--ngstrm-hua5l.com'
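For a whole IRI rather than just a host name, the two steps mentioned above (IDNA for the host, UTF-8 for everything else) could be combined roughly as follows. This is only a sketch, not a full implementation of RFC 3987 section 3.1: it is written for Python 3 (the session above is Python 2), the helper name iri_to_uri and the choice of "safe" characters are assumptions made for illustration, and it handles plain DNS host names only (no userinfo, no IPv6 literals).

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unescaped in each component (an assumption for this sketch).
# "%" is included so that existing percent-escapes are not double-encoded.
PATH_SAFE = "/:@!$&'()*+,;=%"
QUERY_SAFE = "/?:@!$&'()*+,;=%"


def iri_to_uri(iri):
    """Hypothetical helper: convert an IRI string to an ASCII-only URI."""
    parts = urlsplit(iri)

    netloc = parts.netloc
    if parts.hostname:
        # IDNA applies to the host part only.
        netloc = parts.hostname.encode("idna").decode("ascii")
        if parts.port is not None:
            netloc = "%s:%d" % (netloc, parts.port)
        # Simplification: any userinfo ("user:pass@") is dropped here.

    # Remaining components: encode as UTF-8, then percent-escape the bytes.
    path = quote(parts.path, safe=PATH_SAFE)
    query = quote(parts.query, safe=QUERY_SAFE)
    fragment = quote(parts.fragment, safe=QUERY_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://www.\u212bngstr\xf6m.com/caf\xe9?q=\xe5"))
```

The host goes through the same "idna" codec as in the one-liner above, so only the path, query and fragment handling is new here.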

Robert Brewer fumanchu at aminus.org


