Code review request: 6961765: Double byte characters corrupted in DN for LDAP referrals
Vincent Ryan vincent.x.ryan at oracle.com
Tue Mar 6 11:55:11 UTC 2012
- Previous message: Code review request: 6961765: Double byte characters corrupted in DN for LDAP referrals
- Next message: hg: jdk8/tl/langtools: 2 new changesets
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Your fix looks fine.
On 03/ 6/12 08:32 AM, Weijun Wang wrote:
Hi Vinnie
This bug is about using UrlUtil.decode() to decode a URL that is not fully encoded, i.e. one that includes non-ASCII characters. The webrev is at

   http://cr.openjdk.java.net/~weijun/6961765/webrev.00/

It simply delegates the call to URLDecoder.decode().

LDAP URL (RFC 4516 2.1) specifies that only <reserved>, <unreserved>, and <pct-encoded> chars can be used, which do not include general non-ASCII Unicode. So, strictly speaking, the user input in the bug report is illegal, but since it is already a valid URL/URI in Java, we can be more lenient.

In fact, the javadoc of URLDecoder [1] also only allows these characters, but at the same time it says:

   There are two possible ways in which this decoder could deal with illegal strings. It could either leave illegal characters alone or it could throw an IllegalArgumentException. Which approach the decoder takes is left to the implementation.

The current Oracle implementation of the class leaves illegal characters alone. In this sense, UrlUtil is not as good as URLDecoder: it neither leaves them alone nor throws an exception.

To be more correct, I think we can update URLDecoder so that it leaves Unicode in the "other" category (non-control, non-whitespace, non-ASCII Unicode chars, as described in URI's spec) unchanged, and throws an exception otherwise (that is, for non-ASCII chars that are control or space). But I'll leave that to another RFE.

Thanks
Max
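The difference between the two behaviours described above can be sketched as follows. This is only an illustration: naiveDecode() is a hypothetical byte-oriented decoder written for this example (it is not the UrlUtil source), and urlDecode() is a small helper that wraps URLDecoder.decode() so the checked exception does not leak out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class DecodeComparison {

    // Hypothetical decoder (an assumption for illustration, not UrlUtil itself):
    // every unescaped char is narrowed to a single byte, which is only
    // lossless for ASCII input.
    static String naiveDecode(String s) {
        byte[] bytes = new byte[s.length()];
        int j = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                // %XX escape: take the two hex digits as one byte
                bytes[j++] = (byte) Integer.parseInt(s.substring(i + 1, i + 3), 16);
                i += 2;
            } else {
                // Narrowing cast: U+3070 becomes 0x70, i.e. 'p'
                bytes[j++] = (byte) c;
            }
        }
        return new String(bytes, 0, j, StandardCharsets.UTF_8);
    }

    // Thin wrapper around java.net.URLDecoder using UTF-8.
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String raw = "uid=\u3070,ou=test";         // unencoded non-ASCII DN
        String escaped = "uid=%E3%81%B0,ou=test";  // same DN, percent-encoded UTF-8

        System.out.println(naiveDecode(raw).equals(raw));      // false: char was truncated
        System.out.println(urlDecode(raw).equals(raw));        // true: left alone
        System.out.println(naiveDecode(escaped).equals(raw));  // true
        System.out.println(urlDecode(escaped).equals(raw));    // true
    }
}
```

Both decoders agree on fully escaped input; they differ only when the input already contains non-ASCII characters. Note that URLDecoder also turns '+' into a space, which may matter for DNs that use '+' in multi-valued RDNs.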
-------- Original Message --------
Change Request ID: 6961765
Synopsis: Double byte characters corrupted in DN for LDAP referrals

=== Description ============================================================

SYNOPSIS
--------
Double byte characters corrupted in DN for LDAP referrals

OPERATING SYSTEM
----------------
All

FULL JDK VERSION
----------------
All

DESCRIPTION
-----------
If the DN component of an LDAP URL contains double byte characters, it is corrupted by com.sun.jndi.toolkit.url.UrlUtil.decode(). This corruption leads to application level failures.

Consider the following scenario:

1. Application connects to an LDAP server and searches for the string uid=???,??? (where ??? are double byte characters)

2. JNDI code receives a referral, for example:
   ldap://www.test.com/uid=???,???,ou=people,ou=test,ou=test,o=test

3. The referral is then parsed to split the hostname, port number and the DN element of the URI via com.sun.jndi.ldap.LdapURL.parsePathAndQuery()

4. The DN element is decoded using com.sun.jndi.toolkit.url.UrlUtil.decode()

5. This method expects the characters to be ASCII. If the characters are non-ASCII, as in our example, then those characters are not converted properly.

6. This corrupted DN is then passed to the LDAP server, resulting in an unexpected failure.

TESTCASE
--------
This testcase does not represent normal application code. It highlights the problem by calling into com.sun.* internal classes directly. This allows the problem to be demonstrated without setting up an LDAP server.

    import java.net.URI;
    import java.net.URLDecoder;
    import com.sun.jndi.ldap.LdapURL;

    public class LdapURLTest {
        public static void main (String args[]) throws Exception {
            String testString =
                ("ldap://www.test.com/uid=\u3070\u3073\u3076,\u3079\u307C\u307E,ou=test,ou=test,ou=test,o=test");

            LdapURL ldURL = new LdapURL(testString);
            System.out.println(" LDAP URL String: " + testString);
            System.out.println(" decoded DN: " + ldURL.getDN());

            // suggested fix demonstration
            String DN;
            String path = new URI(testString).getPath();
            DN = path.startsWith("/") ? path.substring(1) : path;
            String proposedDN = URLDecoder.decode(DN, "UTF8");
            System.out.println("\nDN from proposed fix: " + proposedDN);
        }
    }

SUGGESTED FIX
-------------
Use java.net.URLDecoder rather than com.sun.jndi.toolkit.url.UrlUtil to conduct the URL decoding in parsePathAndQuery(). Specifically, change the line that decodes the DN element in com.sun.jndi.ldap.LdapURL.parsePathAndQuery() from:

    DN = path.startsWith("/") ? path.substring(1) : path;
    if (DN.length() > 0) {
    --> DN = UrlUtil.decode(DN, "UTF8"); <--
    }

to:

    DN = path.startsWith("/") ? path.substring(1) : path;
    if (DN.length() > 0) {
    --> DN = URLDecoder.decode(DN, "UTF8"); <--
    }

=== Evaluation =============================================================

The URL in the testcase has an invalid encoding. Its Unicode characters must be encoded in UTF-8. For example,

    \u3070 -> \e3\81\b0 -> %e3%81%b0
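
The evaluation's point, that non-ASCII characters in the DN should travel as percent-encoded UTF-8 bytes, can be sketched as follows. The class and method names here are made up for illustration; only String.getBytes(), String.format(), URI.create() and URI.toASCIIString() are standard library calls.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class PercentEncodeSketch {

    // Percent-encodes every UTF-8 byte of the input string.
    // Intended for a single non-ASCII character or run, not a whole URL.
    static String percentEncodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Lets java.net.URI escape the non-ASCII ("other") characters of a URL
    // while leaving the legal ASCII characters untouched.
    static String toAsciiUrl(String url) {
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        // U+3070 is the three UTF-8 bytes E3 81 B0
        System.out.println(percentEncodeUtf8("\u3070")); // prints %E3%81%B0

        // The whole referral URL, with only the non-ASCII part escaped
        System.out.println(toAsciiUrl("ldap://www.test.com/uid=\u3070\u3073\u3076,ou=test"));
    }
}
```

A URL escaped this way contains only ASCII, so it decodes to the same DN whichever of the two decoders discussed in this thread is used.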