java encoding charset suggestion

Martin Buchholz martinrb at google.com
Mon Mar 18 18:24:21 UTC 2013


It would be nice if the world agreed on using UTF-8 as a universal encoding for all text. However:

The POSIX standard says (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html):

"""If the LANG environment variable is not set or is set to the empty string, the implementation-defined default locale shall be used."""

But I think the operating system should set the default, not the application. On my Ubuntu system I see the traditional ASCII English default:

$ (unset LC_ALL LC_COLLATE LANG LANGUAGE GDM_LANG; locale)
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
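
For readers who want to see how this affects a Java program, here is a minimal sketch. It is not taken from the thread, and the class and method names (DefaultCharsetDemo, encodeUtf8, decodeUtf8) are hypothetical. It prints what the JVM picked as its default charset and shows how naming the charset explicitly removes the dependence on LANG. What the default turns out to be depends on the JDK version and platform, so no particular output is assumed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class DefaultCharsetDemo {

    // Encode with an explicitly named charset, so the result does not
    // depend on LANG, LC_ALL, or the JVM's default charset.
    public static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset.
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What the environment and the JVM report; values vary by system.
        System.out.println("LANG            = " + System.getenv("LANG"));
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String text = "caf\u00e9"; // "cafe" with an acute accent on the e

        // getBytes() with no argument uses the default charset, so this
        // length may differ between locales and JDK versions.
        System.out.println("default-charset bytes: " + text.getBytes().length);

        // The explicit UTF-8 encoding is the same everywhere.
        System.out.println("UTF-8 bytes:           " + encodeUtf8(text).length);
    }
}
```

Running it once normally and once with the locale variables unset, as in the shell command above, is one way to compare what the JVM picks in each case.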

On Mon, Mar 18, 2013 at 11:09 AM, Helio Frota <heliofrota at gmail.com> wrote:

I would suggest taking en_US.UTF-8 as the default when the LANG variable is not set, to avoid problems with encoding.



More information about the core-libs-dev mailing list