Charsets for hex/base64
David Lloyd david.lloyd at redhat.com
Thu May 3 12:56:08 UTC 2018
- Previous message: Charsets for hex/base64
- Next message: RFR(XS): 8202329 [AIX] Fix codepage mappings for IBM-943 and Big5
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I feel like what you're looking for is really a general-purpose data transformation API. I think bending charsets into this shape would be a bad move, but I like the idea of a more generalized solution. Google Guava has such an abstraction, and I know of a couple of others as well.
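To make the idea of a separate transformation API concrete, here is a minimal sketch. The `BinaryTextCodec` interface and the `CodecSketch` class are hypothetical names invented for this illustration; they are not Guava's API or anything proposed in this thread. The only real library call is `java.util.Base64`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical interface (an assumption for illustration): a reversible
// bytes <-> text transformation that is deliberately unrelated to Charset.
interface BinaryTextCodec {
    String encode(byte[] data);
    byte[] decode(String text);
}

final class CodecSketch {
    // One possible implementation, backed by the JDK's Base64 support.
    static final BinaryTextCodec BASE64 = new BinaryTextCodec() {
        @Override
        public String encode(byte[] data) {
            return Base64.getEncoder().encodeToString(data);
        }

        @Override
        public byte[] decode(String text) {
            // Throws IllegalArgumentException if text is not valid Base64.
            return Base64.getDecoder().decode(text);
        }
    };

    public static void main(String[] args) {
        byte[] raw = "hi".getBytes(StandardCharsets.US_ASCII);
        String text = BASE64.encode(raw);
        System.out.println(text);
        System.out.println(new String(BASE64.decode(text), StandardCharsets.US_ASCII));
    }
}
```

Note that the direction is the reverse of a charset: here bytes are the "decoded" form and text is the "encoded" form, which is one reason a dedicated interface may fit better than `Charset`.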
On Thu, May 3, 2018 at 4:07 AM, Jonas Konrad <me at yawk.at> wrote:
Well, technically there is some sort of precedent for this, since CharsetEncoder/CharsetDecoder operate on CharBuffers, which are just UTF-16 encoded strings. So charsets may already produce a single output code unit for multiple input code units (UTF-32, for example, may output 1 code unit for 2 UTF-16 input code units / chars). Of course, consuming multiple code points would be new, but code points aren't really part of the CharBuffer API.
- Jonas
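The UTF-32 precedent mentioned above can be checked with a small sketch. The class and method names are made up for this illustration, and it assumes the JDK provides the `UTF-32BE` charset (it is not one of the charsets every Java platform is required to support).

```java
import java.nio.charset.Charset;

final class Utf32Demo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of 32-bit code units produced by encoding the string as UTF-32BE.
    // UTF-32BE is used rather than UTF-32 so that no byte-order mark is involved.
    static int utf32Units(String s) {
        return s.getBytes(Charset.forName("UTF-32BE")).length / 4;
    }

    public static void main(String[] args) {
        // U+1F600 is one code point, stored as a surrogate pair (two chars).
        String s = "\uD83D\uDE00";
        System.out.println(utf16Units(s) + " UTF-16 code units in");
        System.out.println(utf32Units(s) + " UTF-32 code unit(s) out");
    }
}
```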
On 05/02/2018 05:29 PM, Weijun Wang wrote:
On May 2, 2018, at 4:35 PM, Jonas Konrad <me at yawk.at> wrote:
    "0a0b0c".getBytes(HexCharset.getInstance()) = new byte[] { 0x0a, 0x0b, 0x0c }
    new String(new byte[] { 0x0a, 0x0b, 0x0c }, HexCharset.getInstance()) = "0a0b0c"
Normally a charset is used to encode a string to byte[], but here you are actually decoding a string to byte[]. This would lead to quite a few conceptual differences. For example, we can say whether a single char is encodable in a charset, but for the HEX "charset" you would have to say which combinations of chars are "encodable".
--Max
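The point about combinations of chars can be illustrated with a plain hex helper that does not touch `Charset` at all. This is only a sketch: `HexSketch` and its methods are hypothetical names, not the `HexCharset` from the proposal. Validity depends on the whole string (even length, every char a hex digit), not on any single char.

```java
final class HexSketch {
    // Returns the value of a single ASCII hex digit, or -1 if c is not one.
    private static int nibble(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // bytes -> lowercase hex text, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // hex text -> bytes; rejects odd lengths and non-hex characters.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = nibble(hex.charAt(2 * i));
            int lo = nibble(hex.charAt(2 * i + 1));
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at pair " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Whether the string as a whole is a valid hex sequence.
    static boolean isDecodable(String hex) {
        try {
            fromHex(hex);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(toHex(new byte[] { 0x0a, 0x0b, 0x0c }));
        System.out.println(isDecodable("0a0b0c"));
        // Every char here is a valid hex digit, yet the string is not decodable.
        System.out.println(isDecodable("0a0"));
    }
}
```

In this sketch "0a0" consists only of individually acceptable chars but is rejected as a whole, which is the kind of question a per-char encodability check on a charset cannot express.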
--
- DML