RFR(M): 8141529: Fix handling of _JAVA_SR_SIGNUM (original) (raw)
Dmitry Samersoff dmitry.samersoff at oracle.com
Mon Nov 16 10:54:19 UTC 2015
- Previous message: RFR(M): 8141529: Fix handling of _JAVA_SR_SIGNUM
- Next message: RFR(M): 8141529: Fix handling of _JAVA_SR_SIGNUM
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Goetz,
On 2015-11-16 13:32, Lindenmaier, Goetz wrote:
Hi Dmitry,
I think this is a rather complex issue which is better documented in the bug than in the code. The code already references the bugID. Also, I would have to copy the explanation into three files ... I you don't object I'll leave the comment as is.
I'm OK with it. Leave it as is.
-Dmitry
Best regards, Goetz. -----Original Message----- From: Dmitry Samersoff [mailto:dmitry.samersoff at oracle.com] Sent: Montag, 16. November 2015 09:23 To: David Holmes; Thomas Stüfe; Lindenmaier, Goetz Cc: serviceability-dev; hotspot-runtime-dev at openjdk.java.net Subject: Re: RFR(M): 8141529: Fix handling of JAVASRSIGNUM David, Thank you for detailed explanation. Could we put it as a comment right into the code? -Dmitry On 2015-11-16 07:03, David Holmes wrote: On 13/11/2015 11:38 PM, David Holmes wrote:
On 13/11/2015 7:53 PM, Thomas Stüfe wrote:
Hi Goetz,
sorry for not looking at this earlier. This is a nice cleanup. Some remarks: http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.01/src/os/aix/vm/osaix.cpp.udiff.html
+ if (sig > MAX2(SIGSEGV, SIGBUS) && // See 4355769. + sig < NSIG) { // Must be legal signal and fit_ _into sigflags[]._ _I do not like much the MAX2() construct. I would like it better to_ _explicitly check whether the SR signal is one of the "forbidden" ones_ _the VM uses._ _I must confess I had not looked into 4355769 but this check seems rather_ _spurious. It is not at all clear to me what signals could be used here -_ _Okay should have looked into 4355769. The problem is how multiple_ _pending signals are handled. It seems that in the past (no idea if still_ _true) pending signals were handled in signal-number order (lowest_ _first), not FIFO. The problem scenario is this:_ _- thread accesses a null pointer in compiled Java code and the SEGV_ _handler will cause NPE to be thrown_ _- at the same time as the SEGV is being raised the thread is also hit_ _with the SR signal to suspend it._ _- the SR signal will be delivered first and the SR handler starts to run_ _- with signals unblocked._ _- the SEGV then gets delivered to the thread in the SR handler, and the_ _regular signal handler is run_ _- the regular signal handler tries to detect if we're running in Java_ _code so it can post the NPE, but the presence of the SR handler causes_ _that check to fail - so we abort thinking it is a real SEGV._ _I don't know how much of that is still true today. It seems strange to_ _me that a kill based directed signal can usurp a synchronous signal._ _Anyway the fix, rather workaround, for that problem, was to ensure that_ _the SRsignum is greater than any potential synchronous signal the VM_ _cares about. Why SIGBUS was included there I don't know give that:_ _a) it is already a lower signal number than SIGUSR1, SIGSEGV and SIGUSR2_ _b) we don't deliberately generate and use SIGBUS ... though perhaps_ _unsafe-fetch needs to be considered._ _A better fix in my opinion, and as mentioned in the bug, would have been_ _to disable delivery of SEGV whilst the SR handler is executing. But we_ _start to touch on some grey areas of the POSIX spec there, and likely_ _the implementation too._ _So I suggest that for this cleanup we simply leave this logic exactly as_ _is._ _Thanks,_ _David_ _other than SIGUSR1 or SIGUSR2 (if -Xrs is specified), or else a_ _real-time signal (modulo discussion below). Hijacking anything else_ _seems rather suspect._ _Maybe keep a mask defined centrally for each platform which contains_ _signals the VM needs for itself ?_ _Such masks already exist._ _+sigsett os::Aix::sigs = { 0 };_ _I would not initialize the signal set this way. sigsett is an opaque_ _type; the only way to initialize it is with one of sigemptyset() or_ _sigfillset()._ _Good catch - I overlooked that._ _http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.01/src/os/aix/vm/osaix.hpp.udiff.html_ _+ static struct sigaction sigact[NSIG]; // saved preinstalled_ _sigactions_ _+ static sigsett sigs; // mask of signals that have_ _+ static int sigflags[NSIG];_ _I know this is not in the scope of your change, but I would like to see_ _those removed from os::Aix and put into osaix.cpp at static filescope._ _There is no need at all to export those, and you would get rid of the_ _signal.h dependency you know have when including osaix.hpp._ _http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.01/src/os/bsd/vm/jsig.c.udiff.html_ _http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.01/src/os/bsd/vm/osbsd.cpp.udiff.html_ _On BSD, we have realtime signals._ _http://fxr.watson.org/fxr/source/sys/signal.h_ _#define SIGRTMAX 126_ _and NSIG does not contain them:_ _#define NSIG 32_ _The max. possible signal number would be 126, which unfortunately does_ _not even fit into a 64bit mask._ _So this simply limits the signal choice to not be a real-time signal -_ _same as today._ _So the code in jsig.c is broken for the case that someone wants to_ _register realtime signals, if the VM were to ever use realtime signals_ _itself, which now is not the case._ _The same is true for osbsd.cpp, where signal chaining will not work if_ _the application did have handler for real time signals pre-installed_ _before jvm is loaded._ _Chaining is only used when the JVM will catch signals. Aren't all the_ _real-time signals going to be blocked by the VM by default and so_ _chaining is not needed as no handler will exist in the VM ?? (Unless a_ _real-time signal is supplied for SRsignum)_ _I must admit I don't know if any of this code actually works for_ _real-time signals._ _Solaris:_ _The only platform where NSIG is missing?_ _Here, we calculate the max. signal number dynamically in ossolaris.cpp,_ _presumably because SIGRTMAX is not a constant and can be changed using_ _system configuration. But then, on Linux we have the same situation_ _(SIGRTMAX is dynamic) and there we do not go through the trouble of_ _calculating the max. signal number dynamically. Instead we just use_ _NSIG=64 and rely on the fact that NSIG is larger than the largest_ _possible dynamic value for SIGRTMAX._ _Linux ensures that NSIG (and thus NSIG) includes all the real-time_ _signals. But libc can expose a subset and steal some for its own use._ _Solaris does not seem to have NSIG defined, but I am sure there is also_ _a max. possible value for SIGRTMAX (the default seems to be 48). So, one_ _could probably safely define NSIG for Solaris too, so that we have NSIG_ _defined on all Posix platforms._ _Solaris doesn't have any of this SRsignum related code. A more general_ _cleanup of signal related code would potentially involve a lot of_ _cleanup._ _David_ _-----_ _On Thu, Nov 12, 2015 at 8:24 PM, Lindenmaier, Goetz_ _<goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>> wrote: Hi David, Dmitry, I've come up with a new webrev: http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.01/ I hit on some more issues: - As proposed, I replaced MAXSIGNUM by NSIG - On AIX, NSIG=255. Therefore storing bits in a word does not work. I'm now using bitset functionality from signal.h as it's done in other places. sigsett is >> NSIG on linux, so it's no good idea to use it there. Why do we not do this on all platforms, provided sigsett contains all signals (incl. realtime signals) ? - In the os files I found another bit vector that now is too small: sigs. I adapted that, too. Removed the dead declaration of this on solaris. Best regards, Goetz. Kind Regards, Thomas > -----Original Message----- > From: Dmitry Samersoff [mailto:dmitry.samersoff at oracle.com <mailto:dmitry.samersoff at oracle.com>] > Sent: Donnerstag, 12. November 2015 10:05 > To: Lindenmaier, Goetz; David Holmes; hotspot-runtime- > dev at openjdk.java.net <mailto:dev at openjdk.java.net>; serviceability-dev > Subject: Re: RFR(M): 8141529: Fix handling of JAVASRSIGNUM > > Goetz, > > *BSD including OS X also defines NSIG (just checked) and if my memory is > not bogus, AIX defines it too. > > So you may consider to use NSIG on all platform. > > -Dmitry > > On 2015-11-12 11:36, Lindenmaier, Goetz wrote: > > OK I'll change it to NSIG. That's used in other places in oslinux, too. > > So it's really more consistent. > > > > Best regards, > > Goetz > > > >> -----Original Message----- > >> From: Dmitry Samersoff [mailto:dmitry.samersoff at oracle.com <mailto:dmitry.samersoff at oracle.com>] > >> Sent: Donnerstag, 12. November 2015 09:22 > >> To: David Holmes; Lindenmaier, Goetz; hotspot-runtime- > >> dev at openjdk.java.net <mailto:dev at openjdk.java.net>; serviceability-dev > >> Subject: Re: RFR(M): 8141529: Fix handling of JAVASRSIGNUM > >> > >> David, > >> > >> I think it's better to use NSIG (without underscore) defined in signal.h > >> > >> -Dmitry > >> > >> > >> On 2015-11-12 10:35, David Holmes wrote: > >>> Hi Goetz, > >>> > >>> Adding in serviceability-dev > >>> > >>> On 9/11/2015 6:22 PM, Lindenmaier, Goetz wrote: > >>>> Hi, > >>>> > >>>> The environment variable JAVASRSIGNUM can be set to a signal > >> number > >>>> do be used by the JVM's suspend/resume mechanism. > >>>> > >>>> If set, a signal handler is installed and the current signal handler > >>>> is saved to an array. > >>>> On linux, this array had size MAXSIGNUM=32, and JAVASRSIGNUM > >> was > >>>> allowed > >>>> to range up to NSIG=65. This could cause memory corruption. > >>>> > >>>> Further, in jsig.c, an unsinged int is used to set a bit for signals. > >>>> This also > >>>> is too small, as only 32 signals can be supported. Further, the > >>>> signals are mapped > >>>> wrong to these bits. '0' is not a valid signal, but '32' was. 1<<32_ _> >>>> happens to map to > >>>> zero, so the signal could be stored, but this probably was not > >>>> intended that way. > >>>> > >>>> This change increases MAXSIGNUM to 65 on linux, and to 64 on aix. It > >>>> introduces > >>>> proper checking of the signal read from the env var, and issues a > >>>> warning if it > >>>> does not use the signal set. It adapts the data types in jisig.c > >>>> properly. > >>>> > >>>> Please review this change. I please need a sponsor. > >>>> http://cr.openjdk.java.net/~goetz/webrevs/8141529-NSIG/webrev.00 > >>> > >>> This all sounds very good to me. (I must find out why Solaris is not > >>> involved here :) ). > >>> > >>> On Linux you didn't add the bounds check to os::Linux::setoursigflags. > >>> > >>> I'm also wondering about documenting where we are determining the > >>> maximum from? Is it simply NSIG on some/all distributions? And I see > >>> NSIG is supposed to be the biggest signal number + one. Also linux > >>> defines NSIG = NSIG so which should we be using? > >>> > >>> Thanks, > >>> David > >>> > >>>> Best regards, > >>>> Goetz. > >>>> > >> > >> > >> -- > >> Dmitry Samersoff > >> Oracle Java development team, Saint Petersburg, Russia > >> * I would love to change the world, but they won't give me the sources. > > > -- > Dmitry Samersoff > Oracle Java development team, Saint Petersburg, Russia > * I would love to change the world, but they won't give me the sources.
-- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia
- I would love to change the world, but they won't give me the sources.
- Previous message: RFR(M): 8141529: Fix handling of _JAVA_SR_SIGNUM
- Next message: RFR(M): 8141529: Fix handling of _JAVA_SR_SIGNUM
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]