Illegal attributes that begin with =
If we parse an element with an attribute like <test =foo="bar"/>, the attribute key appears in the DOM with the leading = sign, but when the document is re-serialized the key is written without it.
Code:
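import org.jsoup.Jsoup

val doc = Jsoup.parse("<test =foo=\"bar\"/>")
// Print each attribute key as it is stored in the DOM.
for (elem in doc.select("test")) {
    for (attr in elem.attributes()) {
        println(attr.key)
    }
}
// Print the re-serialized document.
println(doc.html())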
Output:
=foo
<html>
<head></head>
<body>
<test foo="bar" />
</body>
</html>
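This is problematic: if an application validates attribute keys to prevent XSS attacks, it checks the key as stored in the DOM (=foo), while the serialized output contains a different key (foo). A key that passes validation only because of its leading = sign can therefore bypass the check. I found this in a lab environment, not in a live application.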
The key used for output can be obtained with getValidKey(). A potential solution is to normalise keys during parsing, so that the key in the DOM always matches the key that is serialized.
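
To illustrate the bypass and the suggested normalisation, here is a minimal standalone sketch. It does not use jsoup: normalizeKey and isEventHandler are hypothetical helpers, and the set of stripped characters (control characters, space, quotes, / and =) is an assumption made for the example, not necessarily the exact rule jsoup applies.

```kotlin
// Hypothetical helper, not part of jsoup: removes characters that are assumed
// to be dropped from an attribute key when the document is serialized.
private val invalidKeyChars = Regex("[\\u0000-\\u001f\\u007f-\\u009f \"'/=]")

fun normalizeKey(key: String): String = invalidKeyChars.replace(key, "")

// Hypothetical validation rule: treat any key starting with "on" as an
// event-handler attribute that should be rejected.
fun isEventHandler(key: String): Boolean = key.lowercase().startsWith("on")

fun main() {
    val rawKey = "=onclick"

    // Validating the raw DOM key misses the handler because of the leading '='.
    println(isEventHandler(rawKey))

    // Validating the normalised key sees the name that would be serialized.
    println(isEventHandler(normalizeKey(rawKey)))
}
```

Under these assumptions the first check returns false and the second returns true, which is why validating the normalised key (or normalising at parse time) closes the gap.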