Java 8 RFR 8011194: Apps launched via double-clicked .jars have file.encoding value of US-ASCII on Mac OS X
Brent Christian brent.christian at oracle.com
Wed Jul 31 13:43:46 PDT 2013
On 7/30/13 4:06 PM, David DeHaven wrote:
Judging from the docs, nl_langinfo seems like a Unix portability function (something more likely to be happier with ASCII in a terminal), not something to be used by a native Cocoa application.
Exactly - so I think it expects to be called from a cmdline with a shell-style surrounding environment, with LANG/etc variables set.
David suggests that calling nl_langinfo() is "asking the wrong question." In the particular context of double-click launching on Mac, you could say that's true (or at least asking the question in the wrong way).
But consider - the code in question is shared with other Unix platforms, and when running from the cmdline/shell scripts/etc, nl_langinfo() is the right way to ask the question.
Asking the right question for this specific context on Mac OS X (via NSLocale or CFLocale) would, I suspect, involve a fair amount of code surgery, and the end result would be the same. Given this, I think my proposed change is a good one from a practical standpoint.
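The fallback under discussion can be sketched roughly as follows. This is only an illustration in Java, not the proposed patch: the actual change is in native code shared with other Unix platforms, and the class `EncodingFallback`, the method `chooseEncoding`, and the exact set of environment variables checked are assumptions made for this example.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the JDK.
final class EncodingFallback {
    // Environment variables assumed to carry a locale/encoding hint.
    private static final String[] LOCALE_VARS = {"LC_ALL", "LC_CTYPE", "LANG"};

    // Returns the encoding to use, given the codeset reported by the OS.
    // UTF-8 is substituted only on Mac, only for US-ASCII, and only when
    // the environment gives no locale hint at all.
    static String chooseEncoding(String reportedCodeset,
                                 Map<String, String> env,
                                 boolean isMac) {
        if (isMac && "US-ASCII".equals(reportedCodeset) && !hasLocaleHint(env)) {
            return "UTF-8";
        }
        return reportedCodeset;
    }

    // True if any of the locale variables is set to a non-empty value.
    private static boolean hasLocaleHint(Map<String, String> env) {
        for (String name : LOCALE_VARS) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean isMac = System.getProperty("os.name", "").startsWith("Mac");
        // "US-ASCII" stands in for the value nl_langinfo(CODESET) is
        // reported to return when a .jar is double-clicked.
        System.out.println(chooseEncoding("US-ASCII", System.getenv(), isMac));
    }
}
```

Under these assumptions, a user who explicitly sets LANG (for example to "C") still gets US-ASCII; only the case with no hint at all is changed.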
Thank you, everyone, for your feedback.
-Brent
Apple is highly unlikely to change the behavior of nl_langinfo().
There is already code in the JDK that calls into JRSCopyPrimaryLanguage(), JRSCopyCanonicalLanguageForPrimaryLanguage(), and JRSSetDefaultLocalization() for exactly this purpose.
Please proceed with setting the encoding to UTF-8. It is the de-facto standard for every Cocoa application I have ever seen. US-ASCII is always the wrong choice for a graphical app on OS X.
Regards,
Mike Swingler
Apple Inc.
On Jul 30, 2013, at 9:05 AM, Francis Devereux <francis at devrx.org> wrote:
I suspect that Apple might be unlikely to change the value that nl_langinfo returns when LANG is unset.
However, it might be possible to fix this issue without second-guessing the character set reported by the OS by calling [NSLocale currentLocale] (or the CFLocale equivalent) instead of nl_langinfo. I think (although I haven't checked) that [NSLocale currentLocale] determines the current locale using a mechanism other than environment variables, because LANG is usually unset for GUI apps on OS X.
On 30 Jul 2013, at 15:56, Scott Palmer <swpalmer at gmail.com> wrote:
Then shouldn't you be complaining to Apple that the value returned by nl_langinfo needs to be changed? David's point seems to be that second guessing the character set reported by the OS is likely to cause a different set of problems.
Scott
On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <Johannes.Schindelin at gmx.de> wrote:
Hi,
On Tue, 30 Jul 2013, David Holmes wrote:
On 30/07/2013 5:54 AM, Brent Christian wrote:
On 7/28/13 10:13 PM, David Holmes wrote:
On 27/07/2013 3:53 AM, Brent Christian wrote:
Please review my fix for 8011194 : "Apps launched via double-clicked .jars have file.encoding value of US-ASCII on Mac OS X"
http://bugs.sun.com/view_bug.do?bug_id=8011194
In most cases of launching a Java app on Mac (from the cmdline, or from a native .app bundle), reading and displaying UTF-8 characters beyond the standard ASCII range works fine. A notable exception is the launching of an app by double-clicking a .jar file. In this case, file.encoding defaults to US-ASCII, and characters outside of the ASCII range show up as garbage.
Why does this occur? What sets the encoding to US-ASCII?
"US-ASCII" is the answer we get from nl_langinfo(CODESET) because no values for LANG/LC_* are set in the environment when double-clicking a .jar. We get "UTF-8" when launching from the command line because the default Terminal.app setup on Mac will set up LANG for you (to "en_US.UTF-8" in the US).
Sounds like a user environment error to me. This isn't my area but I'm not convinced we should be second guessing what we think the encoding should be.
Except that that is not the case here, of course. The user did not set any environment variable in this case. So we are not talking about "second guessing" or "user environment error" but about a sensible default. As to US-ASCII, sorry to say: the seventies called and want their character set back. There can be no question that UTF-8 is the best default character encoding, or are you even going to question that?
What if someone intends for it to be US-ASCII?
Then LANG would not be unset, would it.
Hth,
Johannes
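The symptom described in the review request (non-ASCII characters showing up as garbage when the default encoding is US-ASCII) can be illustrated with a small Java sketch. The class `AsciiGarbageDemo` and its `decode` method are hypothetical names for this example; it decodes the same UTF-8 bytes twice with explicit charsets, so it does not depend on how the JVM was launched. The values printed for file.encoding and the default charset vary by platform, JDK version, and launch method, so none are shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; shows why a US-ASCII default garbles UTF-8 text.
final class AsciiGarbageDemo {
    // Decodes bytes with the given charset; malformed input is replaced
    // with U+FFFD, which is what String's constructor does by default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // What this JVM picked up from its launch environment.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // "caf\u00e9" is "cafe" with an acute accent on the final letter;
        // in UTF-8 the accented letter takes two bytes (0xC3 0xA9).
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded as UTF-8, the text round-trips intact.
        System.out.println(decode(utf8, StandardCharsets.UTF_8));

        // Decoded as US-ASCII, each of the two non-ASCII bytes becomes
        // the replacement character U+FFFD.
        System.out.println(decode(utf8, StandardCharsets.US_ASCII));
    }
}
```

Code that reads text without naming a charset behaves like the second call whenever the default charset is US-ASCII, which is the situation the thread reports for double-clicked .jar files.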