A variety of APIs in the libraries convert byte[] to String and vice versa, and therefore need a charset. Some APIs don't take any charset-related parameters at all, relying on the platform default charset or on a specified default (UTF-8). Some APIs take a charset name parameter (a String), which is inconvenient for a variety of reasons. All of these cases should have overloads added that take a Charset object as a parameter.

For example, these APIs take a charset name:

ByteArrayOutputStream.toString(csname)
PrintStream(File, csname)
PrintStream(filename, csname)
PrintWriter(File, csname)
PrintWriter(filename, csname)

and are declared to throw UnsupportedEncodingException, which is checked, making them inconvenient to use. (UEE is a subtype of IOException, and calls to these APIs often, but not always, occur in contexts where IOException is already caught or declared.) Other APIs throw IllegalArgumentException if the charset name is invalid:

Scanner(File, csname)
... other Scanner constructor overloads ...

This is less inconvenient. However, one still has to look up or remember the right name. (Is it "UTF_8", to be consistent with StandardCharsets.UTF_8? No, it's "UTF-8".)
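
To make the two name-based failure modes concrete, here is a minimal sketch. The class and method names (CharsetNameSketch, decodeByName, decodeByCharset, scannerAccepts) are hypothetical and exist only for illustration. The Charset-based variant uses the existing String(byte[], Charset) constructor as a workaround; it is not one of the proposed overloads.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper class, for illustration only.
class CharsetNameSketch {

    // Name-based API: the checked UnsupportedEncodingException must be
    // handled even though the name is a constant that is always supported.
    static String decodeByName(ByteArrayOutputStream out) {
        try {
            return out.toString("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unreachable.
            throw new AssertionError(e);
        }
    }

    // Workaround with a Charset object: no checked exception, no name lookup.
    static String decodeByCharset(ByteArrayOutputStream out, Charset cs) {
        return new String(out.toByteArray(), cs);
    }

    // Scanner reports a bad charset name with the unchecked
    // IllegalArgumentException instead of a checked exception.
    static boolean scannerAccepts(String csname) {
        try (Scanner s = new Scanner(new ByteArrayInputStream(new byte[0]), csname)) {
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);

        // Both decoding paths should agree on the same bytes.
        System.out.println(decodeByName(out).equals(decodeByCharset(out, StandardCharsets.UTF_8)));
        // Charset.name() returns the canonical name, so no guessing is needed.
        System.out.println(scannerAccepts(StandardCharsets.UTF_8.name()));
        System.out.println(scannerAccepts("no-such-charset"));
    }
}
```

Passing StandardCharsets.UTF_8.name() to a name-based API avoids having to remember the spelling, but it still goes through a name lookup and, for the first group of APIs, still forces handling of the checked exception.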

Still other APIs don't have any charset-related parameters at all:

FileReader(File)
FileReader(FileDescriptor)
FileReader(filename)
FileWriter(File)
FileWriter(File, append)
FileWriter(FileDescriptor)
FileWriter(filename)
FileWriter(filename, append)

In all these cases an overload taking a Charset parameter should be added.
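
For the FileReader and FileWriter cases, one common workaround is to wrap a byte stream in an InputStreamReader or OutputStreamWriter, both of which already accept a Charset. The sketch below illustrates that pattern; the class and method names (FileCharsetSketch, openReader, openWriter, roundTrip) are hypothetical, and this is not a proposed signature for the new overloads.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class FileCharsetSketch {

    // Stand-in for FileReader(File) with an explicit charset.
    static Reader openReader(File file, Charset cs) throws FileNotFoundException {
        return new InputStreamReader(new FileInputStream(file), cs);
    }

    // Stand-in for FileWriter(File, append) with an explicit charset.
    static Writer openWriter(File file, Charset cs, boolean append) throws FileNotFoundException {
        return new OutputStreamWriter(new FileOutputStream(file, append), cs);
    }

    // Writes text to a temporary file and reads it back with the same charset.
    static String roundTrip(String text, Charset cs) {
        try {
            File tmp = File.createTempFile("charset-sketch", ".txt");
            try {
                try (Writer w = openWriter(tmp, cs, false)) {
                    w.write(text);
                }
                StringBuilder sb = new StringBuilder();
                try (Reader r = openReader(tmp, cs)) {
                    int c;
                    while ((c = r.read()) != -1) {
                        sb.append((char) c);
                    }
                }
                return sb.toString();
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        // The result no longer depends on the platform default charset.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8)));
    }
}
```

The wrapping is verbose compared with a single constructor call, which is part of the motivation for adding Charset overloads directly to FileReader and FileWriter.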

This isn't an exhaustive list. The libraries should be audited for other APIs that have similar issues.