RFR (JAXP): 8035469 : Xerces Update: EncodingMap does not recognise Java-style encodings Cp1141-Cp1149
David Li david.x.li at oracle.com
Sat Mar 1 18:12:10 UTC 2014
Joe probably knows more about this, but we did some preliminary investigation summarized below.
One test we considered was to create an XML file encoded in one of these formats and check whether the parser processes it once our updates are in place. That appears to require generating sample XML files whose bytes are actually in the target encoding, which we could not figure out how to do in a reasonable amount of time. It's not sufficient to specify the encoding in the XML header (, also tried IBM01140) if all the text in the file is UTF-8, since the parser complains. Since the changes were minor, and the original Xerces bug did not include any tests or any way of reproducing the error, we decided not to spend too much time on the issue. For reference, the IBM01140-IBM01149 encodings look like they cover various European languages: http://www.iana.org/assignments/character-sets/character-sets.xhtml.
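For what it's worth, a rough sketch of how such a test file could be produced in memory, rather than by hand, is below. It assumes the running JDK ships the IBM01141 charset (the sketch checks with Charset.isSupported first), and the class name, method names and the sample text are made up for illustration; this is not an existing test in the repository.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch: class and method names are illustrative only.
class EbcdicXmlSketch {

    // Builds a one-element XML document and encodes the whole thing,
    // declaration included, with the named charset. The text must not
    // contain markup characters such as '<' or '&'.
    static byte[] encode(String charsetName, String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charsetName + "\"?>"
                + "<greeting>" + text + "</greeting>";
        return xml.getBytes(Charset.forName(charsetName));
    }

    // Parses the bytes with the default JAXP parser and returns the
    // text content of the root element.
    static String parseText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String name = "IBM01141"; // assumption: any of IBM01140-IBM01149 could be used
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        // Non-ASCII sample text (u-umlaut, sharp s, euro sign) written
        // as Unicode escapes so the source file itself stays ASCII.
        String text = "Gr\u00fc\u00dfe \u20ac";
        byte[] bytes = encode(name, text);
        System.out.println("parsed text matches: " + text.equals(parseText(bytes)));
    }
}
```

Encoding the declaration together with the content is the point here: the parser has to see the declaration in the same EBCDIC bytes as the rest of the document, which is why a UTF-8 file that merely declares the encoding gets rejected.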
- David
On 3/1/2014 1:06 AM, Alan Bateman wrote:
> On 28/02/2014 22:11, David Li wrote:
>> Hi,
>>
>> This is an update from Xerces for a fixed encoding map entry in file EncodingMap.java.
>>
>> For details, please refer to:
>> https://bugs.openjdk.java.net/browse/JDK-8035469
>>
>> Webrevs: http://cr.openjdk.java.net/~joehw/jdk9/8035469/webrev/
>> (I don't have an OpenJDK username yet, so Joe Wang uploaded it)
>>
>> No new tests since the change is minor. There were no tests from Apache fixes.
>
> Maybe this is a question for Joe, but I wonder if it would be possible to create a test that exercises these encodings? I realize the change is minor, but it is also subtle, and this may be an area where we should have better tests.
>
> -Alan