Adding new IBM extended charsets
Alan Bateman Alan.Bateman at oracle.com
Sun Aug 5 18:38:45 UTC 2018
- Previous message: 8202794: Native Unix code should use readdir rather than readdir_r
- Next message: Adding new IBM extended charsets
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 24/07/2018 09:56, Nasser Ebrahim wrote:
Thank you Martin, Sherman and Alan for your valuable inputs.
I have done some initial analysis of ICU4J. There are some compatibility issues between the ICU4J charsets and the JDK charsets, but I am more concerned about its performance, as the JDK optimizations do not exist in that implementation. I think we need to work with the ICU4J community to resolve those issues before we remove those charsets from the JDK.
If you can work with the ICU4J project on these issues then I think we have a way forward. An additional issue with their downloads is that they target JDK 6 and don't seem to have thought about deploying as modules with JDK 9 or newer yet. Their downloads can be used as automatic modules, but that requires renaming their JAR files because of the unusual naming they use to encode the version string. A simple Automatic-Module-Name attribute would make it easy for developers to deploy their charset provider on the module path, and they could still target JDK 6.
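To illustrate the Automatic-Module-Name point, here is a minimal sketch of checking whether a JAR manifest declares that attribute. The class name AutomaticModuleNameCheck and the module name com.example.charset are hypothetical and used only for illustration; they are not the names ICU4J uses. The manifest is built in memory so the example needs no JAR file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

public class AutomaticModuleNameCheck {

    // Returns the declared Automatic-Module-Name, or null if the manifest
    // has none (in which case the module name is derived from the JAR file name).
    static String declaredModuleName(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Automatic-Module-Name");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical manifest content; com.example.charset is an assumed name.
        String text = "Manifest-Version: 1.0\r\n"
                    + "Automatic-Module-Name: com.example.charset\r\n"
                    + "\r\n";
        Manifest manifest = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(declaredModuleName(manifest));
    }
}
```

A JAR whose manifest carries this attribute keeps a stable module name regardless of how the file is named, which is why the attribute avoids the renaming step.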
As regards the way forward, I think we have to put infrastructure into the build to make it easy to include or exclude specific charsets on specific platforms. As things stand, and as you have found with your updates to the stdcs- files, the charsets are generated to be included in either java.base or jdk.charsets. We need another input to the configurability to make it possible to include or exclude charsets, so that the mainstream platforms do not have to include the IBM charsets. There are several details around this, particularly around aliases, but if we can get that done then we have a lot of flexibility. My personal view is that we should work towards excluding the IBM charsets from the mainstream platforms, starting with a cull of the EBCDIC charsets. If the ICU4J project can get their issues sorted out in a similar time frame then it makes for a simple migration story -- the JDK includes the standard charsets and many additional charsets. If you need others then download the ICU4J charset provider and deploy it on your class path or module path.
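From an application's point of view, the migration story might look like the sketch below: ask the runtime whether a charset is available (from the JDK itself or from any installed charset provider) and handle the case where it is not. The class name CharsetLookup and the fallback policy are assumptions made for illustration; IBM037 is used only as an example of an EBCDIC charset name, and whether it resolves depends on the runtime and the providers deployed with it.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

public class CharsetLookup {

    // Returns the named charset if the runtime or an installed provider
    // supports it, otherwise the supplied fallback.
    static Charset lookup(String name, Charset fallback) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalCharsetNameException e) {
            // The name itself is syntactically invalid.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Result depends on which charsets this runtime and its providers include.
        Charset cs = lookup("IBM037", StandardCharsets.ISO_8859_1);
        System.out.println("Using charset: " + cs.name());
    }
}
```

Application code of this shape does not change when a charset moves out of the JDK; only the deployment changes, by placing a provider JAR on the class path or module path.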
-Alan