Issue 29824: Hostname validation in SSL match_hostname()
1. Allowing attempts to match an invalid hostname
According to the domain name specification in RFC 1035, only alphanumeric characters, dots, and hyphens are valid in a domain name. We observe that the function match_hostname() in Lib/ssl.py accepts other special characters (e.g., '=', '&') in the hostname when matching it against the certificate commonName (CN) or subjectAltName DNS entries. An example is matching hostname "example.a=.com" against certificate CN/DNS "example.a=.com" or CN/DNS "*.a=.example.com". Rejecting CN/DNS values that contain invalid characters would make the library more robust against attacks that rely on such characters.
2. Matching a wildcard in a public suffix
As noted in section 7.2 of RFC 6125, some aspects of wildcard placement are not clearly specified. We found that the function accepts a wildcard over a public suffix in the certificate and attempts to match it during hostname verification; e.g., it matches hostnames "google.com" and "example.com" against certificate CN/DNS "*.com". This is not an RFC violation, but we might benefit from implementing a check, for example one that rejects patterns of the form "*.one_label". A better option would be to keep a list of all TLDs and check against it.
Thanks.
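For illustration, the two checks proposed above could be sketched roughly as follows. This is not code from Lib/ssl.py: the helper names is_ldh_hostname() and is_acceptable_pattern() are hypothetical, the label rule assumes letters, digits, and hyphens only (with an alphanumeric character at each end), and the wildcard rule only refuses "*." followed by a single label.

```python
import re

# One DNS label under the letter-digit-hyphen (LDH) rule: 1 to 63 characters,
# alphanumeric at both ends, hyphens allowed only in the middle.
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)\Z")


def is_ldh_hostname(name):
    """Return True if every dot-separated label of `name` is an LDH label.

    Hypothetical helper, not part of the ssl module. A trailing dot is
    rejected because it produces an empty final label.
    """
    if not name or len(name) > 253:
        return False
    return all(_LDH_LABEL.match(label) for label in name.split("."))


def is_acceptable_pattern(pattern):
    """Hypothetical pre-check for a certificate CN/DNS pattern.

    Accepts a plain LDH hostname, or "*." followed by an LDH hostname
    with at least two labels, so "*.com" is refused. Multi-label public
    suffixes such as "co.uk" are NOT detected by this rule.
    """
    if pattern.startswith("*."):
        rest = pattern[2:]
        return is_ldh_hostname(rest) and "." in rest
    return is_ldh_hostname(pattern)
```

With these assumptions, is_acceptable_pattern("example.a=.com") and is_acceptable_pattern("*.com") both return False, while is_acceptable_pattern("*.example.com") returns True. A pattern such as "*.co.uk" still passes, which is why a maintained suffix list would be needed for a complete check.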
I don't see 1) as a problem. You wouldn't be able to resolve these names in DNS, would you?
Regarding 2): yes, it would be beneficial to have more elaborate checks to protect against wildcard attacks like *.com. However, Python is not a browser. It's really hard to do it right and even harder to keep the rule set up to date. Some TLDs like .uk have sublevel namespaces, e.g. co.uk. *.co.uk is also invalid.
The problem is going to shift anyway. For Python 3.7 I'm going to deprecate support for OpenSSL < 1.0.2 and use OpenSSL's hostname verification code instead of ssl.match_hostname().
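As a point of reference, hostname checking can be switched on at the SSLContext level instead of calling ssl.match_hostname() by hand. The sketch below is minimal and makes no network connection. It only shows the settings involved; which code performs the actual comparison depends on the Python and OpenSSL versions in use.

```python
import ssl

# create_default_context() returns a client-side context that requires a
# valid certificate chain and checks the peer's hostname during the handshake.
context = ssl.create_default_context()

print(context.check_hostname)                    # True
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```

The hostname to check is the server_hostname argument given to context.wrap_socket() when the connection is wrapped.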