China is now blocking all language editions of Wikipedia

iyouport.org, Open Culture Foundation (OCF), Sukhbir Singh (Open Web Fellow, Mozilla Foundation), Arturo Filastò (OONI), Maria Xynou (OONI) 2019-05-04

Translation(s): 中国封锁了所有语言版本的维基百科 (Chinese)

China recently started blocking all language editions of Wikipedia. Previously, the blocking was limited to the Chinese language edition of Wikipedia (zh.wikipedia.org), but it has now expanded to include all *.wikipedia.org language editions.

In this post, we share OONI network measurement data on the blocking of Wikipedia in China. We found that all wikipedia.org sub-domains are blocked in China by means of DNS injection and SNI filtering.

DNS injection

Through the use of OONI Probe, Wikipedia domains have been tested from multiple local vantage points in China since 2015. Most measurements have been collected from China Telecom (AS4134).

OONI’s Web Connectivity test (available in the OONI Probe apps) is designed to measure the TCP/IP, HTTP, and DNS blocking of websites. Network measurement data collected through this test has shown that most Wikipedia language editions were previously accessible in China, except for the Chinese edition, which has reportedly been blocked since 19th May 2015.

OONI data shows that China Telecom (AS4134) has been blocking zh.wikipedia.org since at least the 10th November 2016 (previous OONI measurements show that zh.wikipedia.org was accessible in March 2015 on that network).

The following chart, based on OONI data, illustrates that multiple language editions of Wikipedia have been blocked in China as of April 2019.

**Source:** Blocking of Wikipedia domains in China, Open Observatory of Network Interference (OONI) data: China, https://api.ooni.io/files/by_country/CN

Our analysis of OONI measurements, used to produce the above chart, is available here.

OONI measurements show that many of these Wikipedia domains were previously accessible, but all measurements collected from 25th April 2019 onwards present the same DNS anomalies for all Wikipedia sub-domains. The few DNS anomalies that occurred in previous months were false positives, whereas the DNS anomalies from April 2019 onwards show that Wikipedia domains are blocked by means of DNS injection. Most measurements were collected from China Telecom (AS4134).

Since OONI measurements collected from China suggest blocking by means of DNS injection, we can further measure the DNS-based blocking from outside of China as well. To this end, we ran the OONI Probe DNS injection test from a vantage point outside of the country, pointing towards an IP address in China.

This test relies on the fact that the Chinese firewall will “inject” DNS responses to queries for restricted domains, even if the query comes from outside the country and is directed at an IP address that does not run a DNS resolver. The expectation was, therefore, that if the DNS query timed out, no blocking was happening, but if we saw a response, then that response had been injected by the censor.

The OONI Probe DNS injection test is very fast. It allowed us to scan more than 2,000 Wikipedia domain names in less than a minute and to determine which ones were blocked.
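The logic of such a probe can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the OONI Probe implementation: the function names (`build_dns_query`, `is_dns_response`, `probe_injection`) and the fixed query ID are hypothetical, and the caller must supply a target IP address in China that is known not to run a DNS resolver.

```python
import socket
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def is_dns_response(packet: bytes, query_id: int = 0x1234) -> bool:
    """True if the packet has a DNS header with a matching ID and the QR bit set."""
    if len(packet) < 12:
        return False
    packet_id, flags = struct.unpack("!HH", packet[:4])
    return packet_id == query_id and bool(flags & 0x8000)


def probe_injection(hostname: str, target_ip: str, port: int = 53,
                    timeout: float = 3.0) -> bool:
    """Send one query to an address that runs no resolver.

    A timeout suggests no injection; any DNS response suggests that a
    middlebox on the path answered on the target's behalf.
    """
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (target_ip, port))
            packet, _ = sock.recvfrom(4096)
        except OSError:
            # Timeout or network error: no answer was observed.
            return False
    return is_dns_response(packet)
```

Because each probe is a single UDP packet with a short timeout, many names can be checked quickly. A single timeout can also be caused by packet loss, so a real scan would likely repeat each query before drawing conclusions.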

By analyzing the results of the OONI Probe DNS injection test, we found that the restriction appears to target any subdomain/language edition of wikipedia.org (i.e. *.wikipedia.org, such as zh.wikipedia.org, en.wikipedia.org, etc.), including wikipedia.org itself, but does not appear to affect any other Wikimedia resources, apart from zh.wikinews.org.

The blocking appears to be targeting wikipedia.org subdomains irrespective of whether they actually exist or not (for example, even doesnotexist.wikipedia.org was blocked!). The IP address returned in the injected DNS response also appears to be pretty random (examples of prior work analyzing the distribution of IP addresses returned by the Great Firewall include “The Great DNS Wall of China” and “Towards a Comprehensive Picture of the Great Firewall’s DNS Censorship”).
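The pattern observed in the scan is consistent with a rule that matches on the domain suffix rather than on a list of known hostnames. The Python sketch below only restates the observed results as a predicate. It is not the firewall's actual logic, which is unknown: the names `BLOCKED_SUFFIXES`, `BLOCKED_EXACT` and `matches_observed_pattern` are hypothetical, and matching on label boundaries is an assumption that the tests above did not verify.

```python
# Domains whose entire zone (apex and every subdomain) returned injected answers.
BLOCKED_SUFFIXES = ("wikipedia.org",)
# Individual hostnames that returned injected answers.
BLOCKED_EXACT = ("zh.wikinews.org",)


def matches_observed_pattern(hostname: str) -> bool:
    """Return True if hostname falls within the blocking pattern seen in the scan."""
    name = hostname.rstrip(".").lower()
    if name in BLOCKED_EXACT:
        return True
    # Match the apex itself or any name ending in ".<suffix>".
    return any(
        name == suffix or name.endswith("." + suffix)
        for suffix in BLOCKED_SUFFIXES
    )
```

Under this rule, doesnotexist.wikipedia.org matches even though no such host exists, which is the behaviour described above.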

SNI filtering

To check whether the blocking of Wikipedia domains could be circumvented by merely encrypting DNS traffic, we tested resolving a Wikipedia domain with DNS over HTTPS (DoH).

To this end, we ran:

$ curl -H 'accept: application/dns-json' 'https://cloudflare-dns.com/dns-query?name=www.wikipedia.org&type=A'

We were able to resolve the www.wikipedia.org domain name successfully with DNS over HTTPS.
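The same lookup can be scripted. The following Python sketch queries the same Cloudflare JSON endpoint as the curl command and extracts the A records from the answer. The helper names (`extract_a_records`, `resolve_over_https`) are hypothetical, and the sketch assumes the resolver returns its usual JSON document with an `Answer` list in which `type` 1 marks an A record.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Same DNS over HTTPS resolver as in the curl command.
DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"


def extract_a_records(doh_json: dict) -> List[str]:
    """Return the IPv4 addresses (record type 1 = A) from a DoH JSON answer."""
    return [
        answer["data"]
        for answer in doh_json.get("Answer", [])
        if answer.get("type") == 1
    ]


def resolve_over_https(hostname: str, timeout: float = 5.0) -> List[str]:
    """Resolve hostname through the DoH endpoint and return its A records."""
    query = urllib.parse.urlencode({"name": hostname, "type": "A"})
    request = urllib.request.Request(
        f"{DOH_ENDPOINT}?{query}",
        headers={"accept": "application/dns-json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_a_records(json.load(response))
```

Calling `resolve_over_https("www.wikipedia.org")` requires network access to the resolver. CNAME entries in the answer (type 5) are skipped, so only the final addresses are returned.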

These tests were also validated by enabling DNS over HTTPS inside of Firefox.

Yet, the page was still not accessible.

We were only able to access the bare IP address from China, indicating that SNI filtering may be in place.

To further validate the hypothesis that the filtering was based on the SNI field of the TLS handshake, we ran the following curl tests (we ran similar tests in Venezuela to confirm the same hypothesis):

$ curl -v --connect-to ::www.kernel.org: https://www.wikipedia.org

* Rebuilt URL to: https://www.wikipedia.org/
* Connecting to hostname: www.kernel.org
*   Trying 147.75.46.191...
* TCP_NODELAY set
*   Trying 2604:1380:4080:c00::1...
* TCP_NODELAY set
* Immediate connect fail for 2604:1380:4080:c00::1: Network is unreachable
* Connected to www.wikipedia.org (147.75.46.191) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* Unknown SSL protocol error in connection to www.wikipedia.org:443
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
curl: (35) Unknown SSL protocol error in connection to www.wikipedia.org:443

The above curl test connects to www.kernel.org (IP 147.75.46.191), but attempts a TLS handshake using the SNI of www.wikipedia.org. As the output above shows, the connection is aborted as soon as the TLS Client Hello is sent.

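Conversely, as seen below, if we attempt to use the SNI of www.kernel.org when doing a TLS handshake with www.wikipedia.org (we use the --resolve option to skip the DNS resolution), we are able to finish the TLS handshake. curl then exits with a certificate name mismatch, which is expected, since the server presents the Wikipedia certificate while curl asked for www.kernel.org.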

$ curl -v --resolve 'www.wikipedia.org:443:91.198.174.192' --connect-to ::www.wikipedia.org: https://www.kernel.org

* Added www.wikipedia.org:443:91.198.174.192 to DNS cache
* Rebuilt URL to: https://www.kernel.org/
* Connecting to hostname: www.wikipedia.org
* Hostname www.wikipedia.org was found in DNS cache
*   Trying 91.198.174.192...
* TCP_NODELAY set
* Connected to www.kernel.org (91.198.174.192) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Wikimedia Foundation, Inc.; CN=*.wikipedia.org
*  start date: Nov  8 21:21:04 2018 GMT
*  expire date: Nov 22 07:59:59 2019 GMT
*  subjectAltName does not match www.kernel.org
* SSL: no alternative certificate subject name matches target host name 'www.kernel.org'
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
curl: (51) SSL: no alternative certificate subject name matches target host name 'www.kernel.org'
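The two curl tests above can be approximated with Python's standard library. This is a minimal sketch, not the tooling we used: the function name `handshake_with_sni` and its return labels are hypothetical. Certificate verification is deliberately disabled, because the certificate cannot match when the SNI and the server are mismatched on purpose, and the only question is whether the handshake is allowed to complete.

```python
import socket
import ssl


def handshake_with_sni(connect_host: str, sni: str, port: int = 443,
                       timeout: float = 10.0) -> str:
    """Connect to connect_host over TCP, then start TLS announcing `sni`.

    Returns "completed" if the TLS handshake finishes, "tls_failed" if the
    connection breaks after TCP was established, and "tcp_failed" otherwise.
    """
    context = ssl.create_default_context()
    # Order matters: hostname checking must be off before CERT_NONE is set.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        raw = socket.create_connection((connect_host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"
    try:
        # wrap_socket performs the handshake, sending `sni` in the Client Hello.
        with context.wrap_socket(raw, server_hostname=sni):
            return "completed"
    except OSError:
        # ssl.SSLError is a subclass of OSError, so resets and TLS errors land here.
        return "tls_failed"
    finally:
        raw.close()
```

In terms of the tests above, `handshake_with_sni("www.kernel.org", "www.wikipedia.org")` corresponds to the first curl command and `handshake_with_sni("91.198.174.192", "www.kernel.org")` to the second. A "tls_failed" result on its own is not proof of filtering, since a server may also reject an SNI it does not recognise, so results should be compared against a control vantage point.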

Based on these tests, we were able to conclude that China Telecom does in fact block all language editions of Wikipedia by means of both DNS injection and SNI filtering.

As with the censorship implemented in Egypt, this can perhaps be viewed as a “defense in depth” tactic for network filtering. By implementing both DNS and SNI-based filtering, China Telecom creates multiple layers of censorship that make circumvention harder.

The use of an encrypted DNS resolver (such as DNS over HTTPS) together with Encrypted SNI (ESNI) could potentially work as a circumvention strategy. Wikipedia.org does not currently support ESNI, but there have been discussions about enabling it.