java-nio-charset-enhanced -- Milestone 4 is released

Martin Buchholz martinrb at google.com
Sun Mar 29 18:27:26 UTC 2009


On Fri, Mar 27, 2009 at 15:44, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:

On 27.03.2009 22:49, Martin Buchholz wrote:

Again, Ulf, I love the sort of stuff you're doing.

Many thanks again for the flowers. :-)

I hope to be able to contribute some engineering to your effort myself someday. In the meantime, we need some infrastructure to guarantee that the behavior of the charsets is completely unchanged as we optimize. I have some code left behind at Sun to do that, i.e. compare different JDKs w.r.t. charset compatibility. Hopefully Sun engineers can resurrect that code and perhaps put it into a public Mercurial repo somewhere. Another approach is to take the code in tests like my Find{En,De}coderBugs.java tests, which compare direct vs. regular buffers, and retarget it to compare two different JDKs.

I have also coded such a test for full-scan comparison: see CharsetsTest + LegacyCharset (it retrieves the legacy charsets by reflection directly from rt.jar of the patched JDK) here: https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/test/sun/nio/cs/

It cost me several nights to get all code points equal, given my special mixture of range-limited direct maps and full-range indirect maps.
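To make the idea of a full-scan comparison concrete, here is a minimal sketch for single-byte charsets. It is not the code from CharsetsTest or LegacyCharset; the class name DecodeScan and the method differingBytes are hypothetical, and how the second (legacy) Charset instance is obtained is left out entirely. It simply decodes every possible byte value with two charsets and reports where the results differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of java-nio-charset-enhanced.
final class DecodeScan {

    /**
     * Decodes each of the 256 possible byte values with both charsets and
     * returns the byte values (0..255) whose decoded strings differ.
     * new String(byte[], Charset) replaces malformed or unmappable input
     * with the charset's replacement string, so those cases are compared too.
     */
    static List<Integer> differingBytes(Charset a, Charset b) {
        List<Integer> diffs = new ArrayList<>();
        for (int i = 0; i < 256; i++) {
            byte[] input = { (byte) i };
            String decodedA = new String(input, a);
            String decodedB = new String(input, b);
            if (!decodedA.equals(decodedB)) {
                diffs.add(i);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // An optimized charset should yield an empty list against its legacy twin.
        System.out.println(differingBytes(StandardCharsets.ISO_8859_1,
                                          StandardCharsets.US_ASCII).size());
    }
}
```

An empty list means the two decoders agree on every single-byte input; multi-byte charsets would need a wider scan than this sketch performs.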

It does look like you've written a lot of good tests. It would be nice not to have an explicit list of charsets in CharsetsTest.java.PARAMETERS. I guess it's a list of charsets subject to single-byte testing? If so, better documentation would be good. Charsets named ISO-8859-* are guaranteed to be single-byte, so it might be good to include those programmatically by filtering Charset.availableCharsets(). Why include EUC-JP but not UTF-8?
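The filtering could look roughly like the sketch below. The class name SingleByteCharsets and the method iso8859Charsets are made up for illustration and do not exist in CharsetsTest.java; only Charset.availableCharsets() and Charset.name() are real JDK calls.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CharsetsTest.java.
final class SingleByteCharsets {

    /** Collects every installed charset whose canonical name starts with "ISO-8859-". */
    static List<Charset> iso8859Charsets() {
        List<Charset> result = new ArrayList<>();
        for (Charset cs : Charset.availableCharsets().values()) {
            if (cs.name().startsWith("ISO-8859-")) {
                result.add(cs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The exact list depends on which charsets the running JDK ships.
        for (Charset cs : iso8859Charsets()) {
            System.out.println(cs.name());
        }
    }
}
```

The resulting list could replace the ISO-8859-* entries of a hand-maintained parameter list, so newly added charsets of that family get tested automatically.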

It's probably still a good idea to get inspiration from my Find*Bugs tests which test many other things like complete compatibility of exceptions in case of invalid input.
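As a rough illustration of that kind of check (not the actual Find*Bugs code), the sketch below decodes the same invalid input once through heap buffers and once through direct buffers and compares the CoderResult, the number of bytes consumed, and the decoded characters. The class DirectVsHeap and its methods are hypothetical names; retargeting the idea to two JDKs would mean comparing the summaries produced by each JDK instead.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the Find*Bugs tests.
final class DirectVsHeap {

    /** Decodes the input once and summarizes result, bytes consumed and output. */
    static String decodeSummary(Charset cs, byte[] input, boolean direct) {
        int outChars = 2 * input.length + 2; // generous output capacity
        ByteBuffer in = direct ? ByteBuffer.allocateDirect(input.length)
                               : ByteBuffer.allocate(input.length);
        in.put(input);
        in.flip();
        CharBuffer out = direct
                ? ByteBuffer.allocateDirect(2 * outChars).asCharBuffer()
                : CharBuffer.allocate(outChars);
        // A fresh decoder reports malformed and unmappable input by default.
        CoderResult result = cs.newDecoder().decode(in, out, true);
        out.flip();
        return result + " consumed=" + in.position() + " chars=" + out;
    }

    /** True if heap and direct buffers behave identically for this input. */
    static boolean sameBehavior(Charset cs, byte[] input) {
        return decodeSummary(cs, input, false).equals(decodeSummary(cs, input, true));
    }

    public static void main(String[] args) {
        byte[] invalid = { 0x41, (byte) 0xFF }; // 0xFF can never occur in UTF-8
        System.out.println(decodeSummary(StandardCharsets.UTF_8, invalid, false));
        System.out.println(sameBehavior(StandardCharsets.UTF_8, invalid));
    }
}
```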

It's too difficult to give credit to external contributors. One problem is that the Contributed-by: line is a red flag to lawyers and other folks that might cause the legality of the change to be questioned without end. Let's try to get Ulf a proper commit bit and make sure the legal questions come to an end.

Aren't "Contributed-by" and "author" comments usual practice in open source products? Even in Sun's JRL source, the author was mentioned. I think the lawyer guys and girls from Sun should rethink that subject. OK, we will see ...

The problem is more human. One would like to give credit for good ideas or good analysis, but the only official way to give credit in a commit message is via a simple Contributed-by: email-address which raises legal doubts even when there is no copyrighted material. I guess one can abuse the Summary: field to squeeze in thank-yous, but it's pretty obvious that you are circumventing the process.

Martin



More information about the core-libs-dev mailing list