Adding new IBM extended charsets
Nasser Ebrahim enasser at in.ibm.com
Wed Aug 22 18:19:02 UTC 2018
Hi Alan,
Thank you for your valuable inputs. I will initiate a discussion with the ICU4J community to explore resolving the compatibility and performance differences, so that we can use ICU4J for most of the extended charsets and remove them from the JDK build. As we discussed earlier, significant changes are required on the ICU4J side to resolve the functional and performance differences before the JDK can consume it directly, so this may be considered a long-term solution.
In the meantime, I can explore the other option you suggested: making the IBM charsets specific to the AIX platform and optional on other platforms through makefile changes. I will try to create a prototype of the make/src changes that generates the IBM charsets as a separate module only on AIX and keeps them optional on other platforms.
Please let me know if you have any inputs.
Thank you, Nasser Ebrahim
From: Alan Bateman <Alan.Bateman at oracle.com>
To: Nasser Ebrahim <enasser at in.ibm.com>, core-libs-dev at openjdk.java.net, Xueming Shen <xueming.shen at oracle.com>
Date: 08/06/2018 12:08 AM
Subject: Re: Adding new IBM extended charsets
On 24/07/2018 09:56, Nasser Ebrahim wrote:
> Thank you Martin, Sherman and Alan for your valuable inputs.
>
> I have done some initial analysis on ICU4J. There are some compatibility issues between the ICU4J charsets and the JDK charsets, but I am more concerned about performance, as the JDK optimizations do not exist in that implementation. I think we need to work with the ICU4J community to resolve those issues before we remove those charsets from the JDK.

If you can work with the ICU4J project on these issues then I think we have a way forward. An additional issue with their downloads is that they target JDK 6 and don't seem to have thought about deploying as modules with JDK 9 or newer yet. Their downloads can be used as automatic modules, but that requires renaming their JAR files because of the unusual naming they use to encode the version string. A simple Automatic-Module-Name attribute would make it easy for developers to deploy their charset provider on the module path, and they can still target JDK 6.
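To illustrate the Automatic-Module-Name point, here is a minimal sketch of how one might check whether a JAR manifest declares that attribute. The class name, helper names, and the module name `com.example.charsets` are hypothetical, and the manifest text is built in memory rather than read from any real ICU4J download.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class AutomaticModuleNameCheck {
    // Main-section manifest attribute that fixes the name of an automatic module.
    static final Attributes.Name AUTOMATIC_MODULE_NAME =
            new Attributes.Name("Automatic-Module-Name");

    // Returns the declared module name, or empty if the manifest has none.
    static Optional<String> automaticModuleName(Manifest manifest) {
        return Optional.ofNullable(
                manifest.getMainAttributes().getValue(AUTOMATIC_MODULE_NAME));
    }

    // Parses manifest text held in memory (stand-in for META-INF/MANIFEST.MF).
    static Manifest parse(String text) {
        try {
            return new Manifest(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical manifest contents; "com.example.charsets" is an assumed name.
        Manifest withName = parse(
                "Manifest-Version: 1.0\r\n"
              + "Automatic-Module-Name: com.example.charsets\r\n\r\n");
        Manifest withoutName = parse("Manifest-Version: 1.0\r\n\r\n");

        System.out.println("declared:   " + automaticModuleName(withName));
        System.out.println("undeclared: " + automaticModuleName(withoutName));
    }
}
```

When the attribute is absent, the module system derives the automatic module name from the JAR file name, which is why an unusual file naming scheme can force a rename.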
As regards the way forward then I think we have to put infrastructure into
the build to make it easy to allow specific charsets to be included or
excluded from specific platforms. As things stand, and as you have
found with your updates to the stdcs- files, the charsets are
generated to be included in either java.base or jdk.charsets. We need
another input to the configurability to make it possible to include or
exclude so that the mainstream platforms do not have to include the IBM
charsets. There are several details around this, particularly around
aliases, but if we can get that done then we have a lot of flexibility.
My personal view is that we should work towards excluding the IBM charsets
from the mainstream platforms, starting with a cull of the EBCDIC
charsets. If the ICU4J project can get their issues sorted out in a
similar time frame then it makes for a simple migration story -- the JDK
includes the standard charsets and many additional charsets. If you need
others then download the ICU4J charset provider and deploy it on your
class path or module path.
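
For application code, the migration story above mostly means not assuming that a given IBM charset is present. A minimal sketch, with a hypothetical class and helper name, that looks a charset up by name and reports when it is missing; charsets contributed by a provider deployed on the class path or module path are found by the same lookup:

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Optional;

public class CharsetLookup {
    // Returns the charset if this runtime (or an installed provider) supports it.
    static Optional<Charset> find(String name) {
        try {
            return Charset.isSupported(name)
                    ? Optional.of(Charset.forName(name))
                    : Optional.empty();
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // UTF-8 is a standard charset; IBM1047 is an EBCDIC charset whose
        // availability depends on the platform build and installed providers.
        for (String name : new String[] {"UTF-8", "IBM1047"}) {
            Optional<Charset> cs = find(name);
            if (cs.isPresent()) {
                System.out.println(name + " is available as " + cs.get().name());
            } else {
                System.out.println(name + " is not available; a charset provider"
                        + " on the class path or module path may be needed");
            }
        }
    }
}
```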
-Alan