
SHORT SUMMARY:
Unmappable single-byte characters are handled incorrectly in multi-byte
encodings
INDICATORS:
Incorrect decoding of multi-byte encodings in cases where undefined
single-byte code points are encountered

COUNTER INDICATORS:
TRIGGERS:
Issue in JDK 6 (and later) since 6227339 was fixed (6 GA)
KNOWN WORKAROUND: N/A
PRESENT SINCE: JDK 6 FCS
HOW TO VERIFY:
Testcase attached
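
The attached testcase is not reproduced here. As a rough stand-in, the
sketch below reports where a decoder first fails and how many bytes it
flags, using only the standard java.nio.charset API. The class name
DecodeErrorProbe and the method firstError are hypothetical, and the
charset name and input bytes are left to the caller because this report
does not name them.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not the attached testcase.
final class DecodeErrorProbe {

    // Decodes the input with error reporting turned on and describes the
    // first malformed or unmappable sequence, or returns "ok" if none.
    static String firstError(String charsetName, byte[] input) {
        CharsetDecoder dec = Charset.forName(charsetName).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(input);
        CharBuffer out = CharBuffer.allocate(
                (int) (input.length * dec.maxCharsPerByte()) + 1);
        CoderResult cr = dec.decode(in, out, true);
        if (!cr.isError()) {
            return "ok";
        }
        // On an error result the input position is at the start of the
        // offending sequence and length() is the number of bytes flagged.
        return "pos=" + in.position() + " len=" + cr.length()
                + (cr.isMalformed() ? " malformed" : " unmappable");
    }

    public static void main(String[] args) {
        // US-ASCII is used only to show the output shape; it is not the
        // encoding affected by this bug.
        byte[] sample = {0x41, (byte) 0x80, 0x42};
        System.out.println(firstError("US-ASCII", sample));
    }
}
```

To check the behaviour described in this report, one would pass the name
of an affected multi-byte charset and a byte sequence containing an
undefined single-byte code point followed by an ordinary byte. Going by
the notes below, the reported length should be 1, not 2.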
NOTES FOR SE:
Suggestion, as per the development engineer's comments:
The problem with our implementation is that, currently, if a byte is not a
"valid" single byte, it is assumed to start a double-byte character; and if
that double-byte sequence cannot be mapped to anything, it is treated as an
"unmappable" double-byte character. This is a wrong assumption for this kind
of "undefined" code point. The bottom line is that we should skip 2 bytes
for an unmappable double-byte character, but only 1 byte for an undefined
single-byte code point.
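
The skip rule above can be illustrated with a minimal sketch. This is not
the JDK decoder: the class SkipLengthSketch, its toy encoding (ASCII below
0x80, lead bytes 0x81-0xFE, with 0x80 and 0xFF undefined) and its one-entry
mapping table are all assumptions made up for illustration. The flag
treatUndefinedAsLead reproduces the faulty assumption so the two behaviours
can be compared.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; the byte ranges and the table are hypothetical.
final class SkipLengthSketch {

    // Single entry so that most double-byte sequences are unmappable.
    private static final Map<Integer, Character> DOUBLE_BYTE = new HashMap<>();
    static {
        DOUBLE_BYTE.put(0x8140, '\u3000');
    }

    // In this toy encoding 0x81-0xFE start a double-byte sequence.
    static boolean isLeadByte(int b) {
        return b >= 0x81 && b <= 0xFE;
    }

    // Decodes with replacement (U+FFFD) instead of reporting errors.
    // treatUndefinedAsLead = true models the faulty behaviour: any byte
    // that is not a valid single byte is assumed to begin a double-byte
    // character, so an undefined byte swallows the byte that follows it.
    static String decode(byte[] in, boolean treatUndefinedAsLead) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < in.length) {
            int b1 = in[i] & 0xFF;
            if (b1 < 0x80) {
                out.append((char) b1);          // valid single byte
                i += 1;
            } else if (!isLeadByte(b1) && !treatUndefinedAsLead) {
                out.append('\uFFFD');           // undefined single byte
                i += 1;                         // skip only 1
            } else if (i + 1 >= in.length) {
                out.append('\uFFFD');           // truncated sequence
                i += 1;
            } else {
                int b2 = in[i + 1] & 0xFF;
                Character c = DOUBLE_BYTE.get((b1 << 8) | b2);
                out.append(c != null ? c : '\uFFFD');
                i += 2;                         // mapped or unmappable: skip 2
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] input = {(byte) 0x80, 0x41};     // undefined byte, then 'A'
        System.out.println(decode(input, true).length());   // faulty variant
        System.out.println(decode(input, false).length());  // intended variant
    }
}
```

With the input 0x80 0x41, the faulty variant yields a single replacement
character and loses the 'A', while the intended variant yields a
replacement character followed by 'A'.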

REGRESSION: Yes.