Warnings Cleanup in java.util. (more from hack day) (original) (raw)

Martijn Verburg martijnverburg at gmail.com
Sat Dec 3 03:05:23 PST 2011

Previous message: Warnings Cleanup in java.util. (more from hack day)
Next message: Warnings Cleanup in java.util. (more from hack day)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Sherman,

In order to keep this change within the scope of the intentions of the exercise I'm going to revert that section to what it was (I'll re-spin a patch). At this stage I won't add a @SuppressWarnings as I think this should be avoidable once it's looked at again in a little more depth.

The rest follows in-line. If this is starting to chew up your time then please don't feel obliged to answer, this is more for my own curiosities sake. This also isn't a suggestion to change the patch again, just theorising here :-)

On Friday, 2 December 2011, Xueming Shen <xueming.shen at oracle.com> wrote:

Martijn,

The proposed change is incorrect. + value = new String(vb, 0, 0, StandardCharsets.UTF8); First, shouldn't it at least be value = new String(vb, StandardCharsets.UTF8); or value = new String(vb, 0, vb.length, StandardCharsets.UTF8); Second, the "value" will be written out via dos.writeBytes(String) later, which only takes the low-byte of a each char of the target String object.

Right, so the need for the hi-byte constructor can be avoided, but by me ignoring the vb.length I introduced a bug, gotcha.

So the code was:

String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes("utf8"); value = new String(vb, 0, 0, vb.length); }

My proposed (bad) change was:

String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes(StandardCharsets.UTF_8); value = new String(vb, 0, 0, StandardCharsets.UTF_8); }

As you say, a better change would have been:

String value = e.getKey(); if (value != null) { byte[] vb = value.getBytes(StandardCharsets.UTF_8); value = new String(vb, 0, vb.length, StandardCharsets.UTF_8); }

Or in a more performing manner:

String value = e.getKey(); if (value != null) { // As at jdk8 build X - we don't use StandardCharsets.UTF_8 in getBytes for performance reasons byte[] vb = value.getBytes("utf8"); value = new String(vb, 0, vb.length, StandardCharsets.UTF_8); }

That "not properly" conversion is actually what we need here to make sure the output will be correctly in utf8. The only "reliable" replacement might be to use "iso8859-1" (not utf8), but I would recommend keep it un-touched.

Fair enough.

Thanks for clearing that up and apologies for causing pain :(

Cheers, Martijn

Previous message: Warnings Cleanup in java.util. (more from hack day)
Next message: Warnings Cleanup in java.util. (more from hack day)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the jdk8-dev mailing list