Warnings Cleanup in java.util. (more from hack day) (original) (raw)
Martijn Verburg martijnverburg at gmail.com
Sat Dec 3 03:05:23 PST 2011
- Previous message: Warnings Cleanup in java.util. (more from hack day)
- Next message: Warnings Cleanup in java.util. (more from hack day)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Sherman,
In order to keep this change within the scope of the intentions of the exercise I'm going to revert that section to what it was (I'll re-spin a patch). At this stage I won't add a @SuppressWarnings as I think this should be avoidable once it's looked at again in a little more depth.
The rest follows in-line. If this is starting to chew up your time then please don't feel obliged to answer, this is more for my own curiosities sake. This also isn't a suggestion to change the patch again, just theorising here :-)
On Friday, 2 December 2011, Xueming Shen <xueming.shen at oracle.com> wrote:
Martijn,
The proposed change is incorrect. + value = new String(vb, 0, 0, StandardCharsets.UTF8); First, shouldn't it at least be value = new String(vb, StandardCharsets.UTF8); or value = new String(vb, 0, vb.length, StandardCharsets.UTF8); Second, the "value" will be written out via dos.writeBytes(String) later, which only takes the low-byte of a each char of the target String object.
Right, so the need for the hi-byte constructor can be avoided, but by me ignoring the vb.length I introduced a bug, gotcha.
So the code was:
String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes("utf8"); value = new String(vb, 0, 0, vb.length); }
My proposed (bad) change was:
String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes(StandardCharsets.UTF_8); value = new String(vb, 0, 0, StandardCharsets.UTF_8); }
As you say, a better change would have been:
String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes(StandardCharsets.UTF_8); value = new String(vb, 0, vb.length, StandardCharsets.UTF_8); }
Or in a more performing manner:
String value = e.getKey(); if (value != null) { // As at jdk8 build X - we don't use StandardCharsets.UTF_8 in getBytes for performance reasons byte[] vb = value.getBytes("utf8"); value = new String(vb, 0, vb.length, StandardCharsets.UTF_8); }
That "not properly" conversion is actually what we need here to make sure the output will be correctly in utf8. The only "reliable" replacement might be to use "iso8859-1" (not utf8), but I would recommend keep it un-touched.
Fair enough.
Thanks for clearing that up and apologies for causing pain :(
Cheers, Martijn
- Previous message: Warnings Cleanup in java.util. (more from hack day)
- Next message: Warnings Cleanup in java.util. (more from hack day)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]