[Python-Dev] [ssl] The weird case of IDNA

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Sat Dec 30 10:26:34 EST 2017


Christian Heimes writes:

> tl;dr This mail is about internationalized domain names and TLS/SSL. It doesn't concern you if you live in ASCII-land. Me and a couple of other developers like to change the ssl module in a backwards-incompatible way to fix IDN support for TLS/SSL.

Yes please!

Seriously, we need to fix the bug for German and, I would presume, for other languages that have traditionally used pure-ASCII transcodings, which I bet are in very common use in domain names.

Do you have an issue # for this offhand? If not I'll just go dig it out for myself.

> In a perfect world, it would be very simple. We'd only had one IDNA standard. However there are multiple standards that are incompatible with each other.

You forgot the obligatory XKCD: https://www.xkcd.com/927. ;-)

> The German TLD .de demands IDNA-2008 with UTS#46 compatibility mapping. The hostname 'www.straße.de' maps to 'www.xn--strae-oqa.de'. However in the older IDNA 2003 standard, 'www.straße.de' maps to 'www.strasse.de', but 'strasse.de' is a totally different domain!

That's a mess! I bet the domain squatters have had a field day.
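
For anyone who wants to see the two mappings side by side, here is a minimal sketch. It assumes that Python's built-in "idna" codec follows IDNA 2003 (RFC 3490 with nameprep). The helper naive_alabel is hypothetical: it merely Punycodes each non-ASCII label as-is and is not an implementation of IDNA 2008 or UTS #46 (no mapping, no validation), but it is enough to reproduce the A-label quoted above.

```python
def idna2003(hostname):
    """Encode with the stdlib 'idna' codec (IDNA 2003: nameprep folds ß to ss)."""
    return hostname.encode("idna")


def naive_alabel(hostname):
    """Hypothetical helper: Punycode every non-ASCII label without any mapping.

    This only illustrates what happens when ß is kept instead of folded;
    it performs none of the IDNA 2008 / UTS #46 checks.
    """
    labels = []
    for label in hostname.split("."):
        try:
            label.encode("ascii")
            labels.append(label)  # plain ASCII label, keep unchanged
        except UnicodeEncodeError:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(idna2003("www.straße.de"))      # b'www.strasse.de'
print(naive_alabel("www.straße.de"))  # www.xn--strae-oqa.de
```

The same input ends up at two different hostnames depending on which rules are applied, which is exactly the problem described above.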

> Questions:

That's not quite true, as your German example shows. In some Oriental renderings it is impossible to distinguish half-width digits from full-width ones, because the same glyphs are used for both. (This occasionally happens with other ASCII characters, but users are more fussy about digits lining up.) That is, while ASCII-only domain names are technically not affected, users of ASCII-only domain names are potentially vulnerable to confusable names when IDNA is introduced. (Hopefully the Asian registrars are as woke as the German ones! But you could still register a .com containing full-width digits or letters.)
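
As a rough illustration of the look-alike problem, the sketch below flags non-ASCII characters in a hostname whose NFKC compatibility form is plain ASCII, which covers full-width digits and letters. The function name ascii_lookalikes is hypothetical, and whether a given registry or IDNA variant accepts, maps, or rejects such characters is outside what the sketch shows.

```python
import unicodedata


def ascii_lookalikes(hostname):
    """Return (char, ascii_form) pairs for non-ASCII characters that
    NFKC normalization folds to pure ASCII (e.g. full-width digits)."""
    found = []
    for ch in hostname:
        if ord(ch) < 128:
            continue  # already ASCII, nothing to flag
        folded = unicodedata.normalize("NFKC", ch)
        if folded and all(ord(c) < 128 for c in folded):
            found.append((ch, folded))
    return found


# Full-width digits U+FF11..U+FF13 are flagged as look-alikes of 1, 2, 3.
print(ascii_lookalikes("shop\uff11\uff12\uff13.com"))
# ß has no compatibility mapping to ASCII, so nothing is flagged here.
print(ascii_lookalikes("www.straße.de"))
```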

> and IDNA users are broken anyway.

Agree with your analysis, except for the fine point above. Japanese users don't use IDNA much yet (except for people like the WIDE folks, who know what they're doing), so I have little experience with potential breakage. On the other hand, that suggests that transitioning quickly will be helpful.

3.7 has a lot of new stuff in it. I suspect a lot of people are going to take their time moving production sites to it, so +1 on a backport. 3.5 too, if it's not too hard.


