RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV) (original) (raw)

Frederic Parain frederic.parain at oracle.com
Fri Nov 13 12:08:51 UTC 2015

Previous message: RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV)
Next message: RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 13/11/2015 05:52, David Holmes wrote:

Hi Coleen,

On 13/11/2015 7:41 AM, Coleen Phillimore wrote:

Hi, I should learn my lesson that Frederic, Bertrand and David are generally right. My restructuring went afoul of the STEP framework, and so I have made VMError::printvminfo() report the safe things that VMError::report() reports. Hopefully, this will address the concern about safety. It also allows us to remove the lock and static buffer in the VM.info function. The duplication isn't as bad as I thought and is a better solution than piggy backing from VMError::report(). Please review: open webrev at http://cr.openjdk.java.net/~coleenp/8027429.03/ bug link https://bugs.openjdk.java.net/browse/JDK-8027429 This looks good to me. A few small queries: 888 if (Universe::isfullyinitialized()) { I don't think it is possible to run a Dcmd before the VM is fully initialized.

It is not possible to invoke a Dcmd from outside of the VM (using the attachAPI or the PlatformMBeanServer) before the VM is fully initialized. However it is still possible to invoke a Dcmd directly from VM code. The Dcmd framework is initialized in Management::init(), which is called before universe_post_init() is invoked. I don't know exactly the dependencies of Universe::heap()->print_on_error(), but the test looks reasonable.

Fred

889 Universe::heap()->printonerror(st); Is this safe? I'm not sure what it does - print heap memory regions ?? 891 st->printcr("Polling page: " INTPTRFORMAT, p2i(os::getpollingpage())); Is this useful ?? Thanks, David > I tested with jcmd VM.info concurrently and with an inserted crash during VM.info call to make sure the original hserrpid report comes out okay. >> Thanks, Coleen >> On 11/11/15 4:16 PM, Coleen Phillimore wrote: >>>>>>> On 11/10/15 7:12 PM, David Holmes wrote: >> Hi Coleen, >>>>>> I still do not like the intertwining of the info and error code, but >> I won't say more on that. >>>> Yes, I understand the approach has good arguments on both sides. I'm > trying to find a compromise that can move this feature forward. >>>>>> However you may now have a concurrency problem: >>>>>> 842 // Report for the vminfocmd, skipping crash specific code >> 843 void VMError::reportinfo(outputStream* st) { >> 844 >> 845 // don't allocate large buffer on stack >> 846 static char buf[OBUFLEN]; >>>>>> Can you have concurrent info requests in flight? For the info case >> the buffer could be allocated on the stack I think, or even >> dynamically. (Aside: will this be executed in the service thread?) >>>> There is a lock added to protect the VM.info dcmd, so there would be > only sequential access to the static buffer. I don't think this is > executed in the Service Thread. I see no evidence of this anyway. >>>>>>>>> reportprocess could possibly be refactored further as there's a lot >> of initial stuff that won't apply to the info case: >> - we won't print all threads >> - the VM state will always be "not at safepoint" >> - The Universe must be initialized (not clear at what point during >> shutdown a dcmd can still come in?) >> - we don't print owned locks >>>> Yes, I can do this. It actually improves how this looks. >>>> Thanks, > Coleen >>>>>> That's all from me. >>>>>> Thanks, >> David >>>>>>>>>> On 6/11/2015 7:01 AM, Coleen Phillimore wrote: >>>>>>>> Hi, >>>>>>>> I've done some refactoring and cleanup on this which the refactoring >>> enabled. Again, it's important that the VM.info output be mostly >>> the >>> same order as the hserrpid* report file so that sustaining can >>> easily >>> match and find output that they are looking for. It's also >>> important >>> not to copy this code because there may be new information added >>> that we >>> also want in the VM.info output. If any future information that we >>> want >>> to print seems unsafe, the vminfocmd can be used to exclude it >>> or it >>> can be printed only in the error reporting function. Rejecting the >>> entire change because of the theoretical unsafe addition of code >>> isn't >>> the right thing to do, although yeah, I've argued the other way in >>> the >>> past. This is something that is helpful to sustaining and our >>> licensees. >>>>>>>> In this patch, the functions that VM.info calls are safe while the >>> JVM >>> is running, but the stepping is required because at crash time, they >>> might not be. Although I've never seen the error reporting crash in >>> these places. >>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8027429.01/ >>> bug link https://bugs.openjdk.java.net/browse/JDK-8027429 >>>>>>>> This refactoring has been run through RBT. >>>>>>>> Thanks, >>> Coleen >>>>>>>> On 11/4/15 2:25 AM, David Holmes wrote: >>>> On 4/11/2015 1:36 AM, Mattis Castegren wrote: >>>>> Also note that some of the crash information is not interesting for >>>>> VM.info. For example, we don't want to print a thread dump or the >>>>> registers/stack data for any specific thread. If people want thread >>>>> dumps, they can use the separate Dcmd. We only want the general >>>>> system state. >>>>>>>>>> I'm a little surprised that thread info is not of interest, but >>>> as you >>>> say that is available by other means - which was part of my earlier >>>> point about there already being safe means to get some of the info. >>>>>>>>>> If this is really only about the small subset of safe "system >>>> information" that VMError reports, then I see even less >>>> motivation to >>>> piggy-back VM-info onto VMerror and entangle the two. If each >>>> block of >>>> information can be considered a single "step" in the error case then >>>> we can simply factor out the actual printing parts and share >>>> them. If >>>> the error case wants fine-grained steps then we can't do that. >>>>>>>>>> David >>>>>>>>>>> Kind Regards >>>>> /Mattis >>>>>>>>>>>> -----Original Message----- >>>>> From: Coleen Phillimore >>>>> Sent: den 3 november 2015 16:25 >>>>> To: Bertrand Delsart; Thomas Stüfe; Mattis Castegren >>>>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR: 8027429: Add diagnostic command VM.info to get >>>>> hserr print-out >>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>> I've done a bit of refactoring with this change and have tested it >>>>> with and without intentional crashes to the VM.info reporting. >>>>> If it >>>>> crashes within VM.info, there's a stack trace that clearly shows >>>>> where it comes from in the hserr file. And the hserrpid file is >>>>> correctly produced. If we can make the jcmd come with a warning >>>>> that >>>>> it could potentially crash the vm, I think this feature is a good >>>>> addition for sustaining. >>>>>>>>>>>> Without crashing, VM.info gives a lot of information that we >>>>> normally >>>>> give with the hserr file. The reason for the careful STEP'ping in >>>>> hserr production is that the JVM has already crashed so we can't >>>>> count on anything. The VM.info command is designed to be run on a >>>>> moderately stable JVM so is less likely to crash on the things that >>>>> we're printing out. >>>>>>>>>>>> I think we should fix the Threads walking code that Fred pointed >>>>> out >>>>> to take the Threadslock, but I don't think we should copy >>>>> code for >>>>> VM.info or copy printonerror functions. >>>>>>>>>>>> Thanks, >>>>> Coleen >>>>>>>>>>>> On 11/3/15 10:18 AM, Bertrand Delsart wrote: >>>>>> On 03/11/2015 15:06, Bertrand Delsart wrote: >>>>>>> Thanks Thomas, >>>>>>>>>>>>>>>> I fully agree it can work. The problem is that it may also fail >>>>>>> if we >>>>>>> are not careful. >>>>>>>>>>>>>>>> Fred noticed for instance that "Threads::printonerror" is not >>>>>>> protected by vminfocmd (step 290). This is an obvious >>>>>>> source of >>>>>>> potential crashes if you take into account the fact that the >>>>>>> printonerror methods were designed not to use locks. >>>>>>>>>>>>>> Fred just pointed out that this may be OK since thread is NULL. >>>>>>>>>>>>>> I'm not sure of why we do not print the threads when there is >>>>>> notion >>>>>> of current thread (and I would wonder whether we should print >>>>>> them for >>>>>> VMInfo) but luckily that may have been sufficient to prevent the >>>>>> crash. >>>>>>>>>>>>>> However, I still do not like it from a design point of view for >>>>>> the >>>>>> other reasons stated in my e-mail. >>>>>>>>>>>>>> Bertrand. >>>>>>>>>>>>>>>>>>>>>>>> It seems that the focus has been on removing what was clearly >>>>>>> useless, not what might be dangerous due to how the method (and >>>>>>> what >>>>>>> it calls) has been designed. >>>>>>>>>>>>>>>> In addition, adding vminfocmd is IMHO not sufficient unless >>>>>>> you >>>>>>> pass that argument to all layers. This is the biggest design >>>>>>> issue. >>>>>>> For instance all calls to printonerror could have this >>>>>>> additional >>>>>>> argument so that the different components can adapt their >>>>>>> behavior >>>>>>> (and are aware there are now two use cases). I would start by >>>>>>> Threads::printonerror, grabbing the threads lock in the info >>>>>>> case, >>>>>>> but the same if true for all methods called by VMError::report. I >>>>>>> would go as far as adding a comment in all printonerror methods >>>>>>> stating why the method is safe for both use cases (e.g. why no >>>>>>> locks >>>>>>> are needed) or adding "assert(!vminfocmd)" for methods >>>>>>> VMError:report is supposed to skip. >>>>>>> The current design, where the various components may not be aware >>>>>>> that they are called for two different purposes, is not robust >>>>>>> enough. >>>>>>>>>>>>>>>> Note also that being 'robust enough for the "crash" case' is >>>>>>> not a >>>>>>> strong enough guarantee for a general purpose info DCmd... >>>>>>> particularly if, as you point out, an error that occurs in one >>>>>>> of the >>>>>>> info STEP crashes the VM instead of just bringing us to the next >>>>>>> error reporting STEP. Now, as stated before, I could agree if we >>>>>>> use >>>>>>> this DCmd only when a crash is acceptable. >>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>> Bertrand >>>>>>>>>>>>>>>> On 02/11/2015 14:32, Thomas Stüfe wrote: >>>>>>>> Hi all, >>>>>>>>>>>>>>>>>> we have implemented in the SAP JVM the same mechanism. Our >>>>>>>> implementation is in use since a number of years and is quite >>>>>>>> useful. So I would agree with Coleen. >>>>>>>>>>>>>>>>>> However, there are some things to keep in mind when doing this: >>>>>>>>>>>>>>>>>> Error handling is different between the "info-only" and "crash" >>>>>>>> case. In >>>>>>>> "info-only" case, VMError::report() is called without enclosing >>>>>>>> VMError::reportanddie() and we run with the normal signal >>>>>>>> handler >>>>>>>> intact, not with the secondary signal handler. So, normal signal >>>>>>>> handling works, which is good. But it also means that any error >>>>>>>> inside one of the error reporting STEPs will bring down the VM, >>>>>>>> instead of just bringing us to the next error reporting STEP >>>>>>>> ("Error >>>>>>>> occurred during error reporting"). >>>>>>>>>>>>>>>>>> The problem is that a number of error reporting steps are a bit >>>>>>>> risky and written with the assurance that the worst which can >>>>>>>> happen >>>>>>>> is that the STEP will be interrupted. These risky steps >>>>>>>> should not >>>>>>>> be executed in the "only-info" case. >>>>>>>>>>>>>>>>>> But the proposed change already does this, it introduces >>>>>>>> "vminfocmd" >>>>>>>> and conditionally excludes dangerous or pointless error >>>>>>>> reporting >>>>>>>> STEPs (e.g. callstack is pointless). >>>>>>>>>>>>>>>>>> As far as I can see, the other concerns concern only the "crash" >>>>>>>> case and simply do not apply to the "info-only" case. If an >>>>>>>> error >>>>>>>> reporting step is robust enough for the "crash" case (low stack >>>>>>>> usage, not allocating memory to avoid deadlocks etc), it >>>>>>>> should be >>>>>>>> safe enough for the "info" case. >>>>>>>>>>>>>>>>>> Kind Regards, Thomas >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 2, 2015 at 12:31 PM, Mattis Castegren >>>>>>>> _<mattis.castegren at oracle.com_ >>>>>>>> <mailto:mattis.castegren at oracle.com>> >>>>>>>> wrote: >>>>>>>>>>>>>>>>>> Hi >>>>>>>>>>>>>>>>>> Of course, when I wrote "Review and push this fix" I meant >>>>>>>> doing a >>>>>>>> thorough technical review. We do not want VM.info to be >>>>>>>> unsafe, and >>>>>>>> I will let David reply to the technical questions that has >>>>>>>> been >>>>>>>> raised. I know he did consider some of these issues, but >>>>>>>> last >>>>>>>> week >>>>>>>> he was on Java one, and this week he is out. I will let him >>>>>>>> comment >>>>>>>> once he gets back. >>>>>>>>>>>>>>>>>> Kind Regards >>>>>>>> /Mattis >>>>>>>>>>>>>>>>>> -----Original Message----- >>>>>>>> From: Bertrand Delsart >>>>>>>> Sent: den 2 november 2015 10:13 >>>>>>>> To: David Holmes; Mattis Castegren; Coleen Phillimore; >>>>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net> >>>>>>>> Subject: Re: RFR: 8027429: Add diagnostic command VM.info >>>>>>>> to get >>>>>>>> hserr print-out >>>>>>>>>>>>>>>>>> On 02/11/2015 02:45, David Holmes wrote: >>>>>>>> > On 30/10/2015 8:38 PM, Mattis Castegren wrote: >>>>>>>> >> Hi >>>>>>>> >> >>>>>>>> >> I agree that there may be things we could add to >>>>>>>> VM.info >>>>>>>> that we >>>>>>>> >> couldn't add in an hserr file due to the fact that we >>>>>>>> are >>>>>>>> crashing. >>>>>>>> >> However, I really do believe that we should use the >>>>>>>> same >>>>>>>> code. >>>>>>>> The >>>>>>>> >> goal of the hserr file is exactly the same as the >>>>>>>> goal of >>>>>>>> VM.info, >>>>>>>> >> to print as much information about the system as >>>>>>>> possible so >>>>>>>> that >>>>>>>> >> when we look at an hserr file or the output from >>>>>>>> VM.info, >>>>>>>> we get as >>>>>>>> >> much information about the system as possible. That way >>>>>>>> we can >>>>>>>> reduce >>>>>>>> >> the need to go back and forth with the customer to ask >>>>>>>> for more >>>>>>>> information. >>>>>>>> >> >>>>>>>> >> If we enumerate what we need in VM.info, the list >>>>>>>> would look >>>>>>>> very >>>>>>>> >> much like what we already do in VMError, and >>>>>>>> therefore I >>>>>>>> think we >>>>>>>> >> should use the same code. If we find something that we >>>>>>>> would >>>>>>>> want, >>>>>>>> >> that is not currently printed in hserr files, I >>>>>>>> think we >>>>>>>> have >>>>>>>> two options: >>>>>>>> >> >>>>>>>> >> 1) Add it to hserr files as well. If it is >>>>>>>> interesting when >>>>>>>> >> gathering customer information before a crash, it >>>>>>>> probably is >>>>>>>> >> interesting after a crash >>>>>>>> >> 2) If it for some reason cannot be captured when >>>>>>>> crashing, >>>>>>>> add it to >>>>>>>> >> the code with the condition that we don't print it >>>>>>>> during >>>>>>>> crash >>>>>>>> time. >>>>>>>> >> >>>>>>>> >> I would therefore suggest the following plan of action >>>>>>>> >> >>>>>>>> >> 1) Review and push this fix to see if there are any >>>>>>>> technical issues >>>>>>>> >> with regards to thread safety, etc. I know David has >>>>>>>> had >>>>>>>> this in >>>>>>>> mind >>>>>>>> >> when doing the change. After that, we have a command >>>>>>>> that >>>>>>>> support can >>>>>>>> >> use. >>>>>>>> >> 2) Think about if there is any information missing, >>>>>>>> both >>>>>>>> from hserr >>>>>>>> >> files and from VM.info. Here, we may want to ask >>>>>>>> support >>>>>>>> about if >>>>>>>> >> there is any additional information they need to ask >>>>>>>> for >>>>>>>> when they >>>>>>>> >> get this data. If possible, this information should be >>>>>>>> added >>>>>>>> to both >>>>>>>> >> hserr files and VM.info, we don't want support to >>>>>>>> have to >>>>>>>> ask for >>>>>>>> >> VM.info if we already have an hserr file. Of >>>>>>>> course, if >>>>>>>> there is >>>>>>>> >> anything Dev would want to see when analyzing >>>>>>>> crashes in >>>>>>>> testing, we >>>>>>>> >> should add that too. >>>>>>>> > >>>>>>>> > I have no issue with VMError and VM.Info using the same >>>>>>>> code >>>>>>>> (where >>>>>>>> > appropriate), but I think it wrong to try and implement >>>>>>>> VM.Info >>>>>>>> > directly using the actual VMError mechanism. It is >>>>>>>> crude and >>>>>>>> makes it >>>>>>>> > difficult to make independent changes to either >>>>>>>> facility. >>>>>>>> > >>>>>>>> > Sorry. >>>>>>>> > >>>>>>>> > David >>>>>>>>>>>>>>>>>> Furthermore, we cannot just "Review and push this fix to >>>>>>>> see if >>>>>>>> there are any technical issues". Two reviewers have already >>>>>>>> stated >>>>>>>> that there are issues and these issues must be addressed >>>>>>>> before >>>>>>>> pushing. >>>>>>>>>>>>>>>>>> The answer cannot just be that we need this feature, it >>>>>>>> must >>>>>>>> address >>>>>>>> the technical issues that were reported. >>>>>>>>>>>>>>>>>> One possible answer is that we all agree that VMInfo >>>>>>>> would be >>>>>>>> unsafe. >>>>>>>> The method would have to be renamed UnsafeVMInfo and it >>>>>>>> must be >>>>>>>> clear that its 'impact' would be very risky, potentially >>>>>>>> crashing >>>>>>>> the JVM. >>>>>>>>>>>>>>>>>> If this is sufficient for your needs then we could agree >>>>>>>> with >>>>>>>> it. My >>>>>>>> only additional requirement would be to clearly mark >>>>>>>> that the >>>>>>>> method >>>>>>>> has been called so that we do not waste time on >>>>>>>> non-issues if >>>>>>>> calling VMInfo leads to a crash (or, even worse but less >>>>>>>> likely, to >>>>>>>> a silent data corruption that may lead to problems later). >>>>>>>> The bare >>>>>>>> minimum would be to double check that a crash during an >>>>>>>> execution of >>>>>>>> VM.info is clearly identifiable (this is worth double >>>>>>>> checking >>>>>>>> because of the use of static buffers for the error >>>>>>>> reporting, >>>>>>>> see >>>>>>>> below). >>>>>>>>>>>>>>>>>> If an 'unsafe' VMInfo DCmd is not sufficient, you should >>>>>>>> wonder >>>>>>>> whether a safe version of it is still a "small >>>>>>>> enhancement". >>>>>>>> A JEP >>>>>>>> may be required. >>>>>>>>>>>>>>>>>> VMError::report and the methods it calls (printonerror) >>>>>>>> have been >>>>>>>> written with assumptions clearly stated in the comments >>>>>>>> above >>>>>>>> VMError::report (see below the two most important ones). >>>>>>>> HotSpot >>>>>>>> error reporting strategy differed from JRockit. This >>>>>>>> explains >>>>>>>> why >>>>>>>> the JRockit approach cannot be applied as is. >>>>>>>>>>>>>>>>>> I agree with David that you can try leveraging some of the >>>>>>>> logic >>>>>>>> from VMError::report. However, to be sure there is no >>>>>>>> misunderstanding, let's be clearer about David's "(where >>>>>>>> appropriate)": we would for instance have to review all the >>>>>>>> called >>>>>>>> methods, check which ones purposefully ignore locking >>>>>>>> (potentially >>>>>>>> crashing) and provide alternate code for the unsafe ones >>>>>>>> (which may >>>>>>>> not be trivial in some cases). As a simple example, a >>>>>>>> method >>>>>>>> called >>>>>>>> "printonerror" should not be called when we are not in an >>>>>>>> erroneous condition. However, in some components of the >>>>>>>> JVM, the >>>>>>>> low >>>>>>>> level implementations of 'printonerror' and >>>>>>>> 'printinfosafely' >>>>>>>> could share the same code. >>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>> Bertrand. >>>>>>>>>>>>>>>>>> Here are the important comments about VMError::report >>>>>>>> assumptions: >>>>>>>>>>>>>>>>>> // In general, a hang in error handler is much worse than a >>>>>>>> crash or >>>>>>>> internal >>>>>>>> // error, as it's harder to recover from a hang. >>>>>>>> Deadlock can >>>>>>>> happen >>>>>>>> if we >>>>>>>> // try to grab a lock that is already owned by current >>>>>>>> thread, >>>>>>>> or if the >>>>>>>> // owner is blocked forever (e.g. in >>>>>>>> os::infinitesleep()). If >>>>>>>> possible, the >>>>>>>> // error handler and all the functions it called should >>>>>>>> avoid >>>>>>>> grabbing any >>>>>>>> // lock. An important thing to notice is that memory >>>>>>>> allocation >>>>>>>> needs a >>>>>>>> lock. >>>>>>>> // >>>>>>>> // We should avoid using large stack allocated buffers. >>>>>>>> Many >>>>>>>> errors >>>>>>>> happen >>>>>>>> // when stack space is already low. Making things even >>>>>>>> worse is >>>>>>>> that >>>>>>>> there >>>>>>>> // could be nested reportanddie() calls on stack (see >>>>>>>> above). >>>>>>>> Only one >>>>>>>> // thread can report error, so large buffers are statically >>>>>>>> allocated in >>>>>>>> data >>>>>>>> // segment. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>> > >>>>>>>> >> Kind Regards >>>>>>>> >> /Mattis >>>>>>>> >> >>>>>>>> >> -----Original Message----- >>>>>>>> >> From: David Holmes >>>>>>>> >> Sent: den 29 oktober 2015 23:20 >>>>>>>> >> To: Mattis Castegren; Coleen Phillimore; >>>>>>>> >> hotspot-runtime-dev at openjdk.java.net >>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net> >>>>>>>> >> Subject: Re: RFR: 8027429: Add diagnostic command >>>>>>>> VM.info to >>>>>>>> get >>>>>>>> >> hserr print-out >>>>>>>> >> >>>>>>>> >> On 29/10/2015 11:36 PM, Mattis Castegren wrote: >>>>>>>> >>> Some background: We have this command in JRockit. The >>>>>>>> information you >>>>>>>> >>> gather when you crash to give a good summary of what >>>>>>>> system >>>>>>>> you run >>>>>>>> >>> on is pretty much exactly the information you need to >>>>>>>> get a >>>>>>>> good >>>>>>>> >>> summary on a system that has not crashed. The JRockit >>>>>>>> command is >>>>>>>> >>> extremely useful for support, and saves a lot of work >>>>>>>> going >>>>>>>> back and >>>>>>>> >>> forth asking about system information. >>>>>>>> >>> >>>>>>>> >>> Also, if we write something new in the hserr file, >>>>>>>> like if >>>>>>>> there has >>>>>>>> >>> been any out of memory errors, we often would want >>>>>>>> the same >>>>>>>> >>> information in the VM.info output. From my >>>>>>>> experience in >>>>>>>> >>> Sustaining/Support, I can't think of any >>>>>>>> information you >>>>>>>> would want >>>>>>>> >>> in VM.info that you wouldn't also want in the hserr >>>>>>>> file >>>>>>>> and the >>>>>>>> >>> other way around, apart from details about the crash >>>>>>>> (obviously). >>>>>>>> >>> >>>>>>>> >>> I don't see a reason to exactly enumerate what >>>>>>>> information >>>>>>>> VM.info >>>>>>>> >>> should provide. From a sustaining/support >>>>>>>> perspective, we >>>>>>>> want a >>>>>>>> >>> one-stop command to gather as much useful >>>>>>>> information as >>>>>>>> possible, >>>>>>>> >>> which is the same idea we have for the hserr file. >>>>>>>> >> >>>>>>>> >> The reason to enumerate what is required is to see >>>>>>>> where >>>>>>>> that >>>>>>>> >> information already exists and can be collected >>>>>>>> from. The >>>>>>>> VMError >>>>>>>> >> report has to be very careful about what it does and >>>>>>>> how it >>>>>>>> does it >>>>>>>> >> because of the fact we may have crashed and the general >>>>>>>> process >>>>>>>> state >>>>>>>> >> is indeterminate. A Dcmd can simply gather whatever >>>>>>>> information is >>>>>>>> >> required from the available sources in whatever >>>>>>>> order or >>>>>>>> format >>>>>>>> desired. >>>>>>>> >> >>>>>>>> >> I have no problem with this command, just how it has >>>>>>>> been >>>>>>>> proposed to >>>>>>>> >> implement it. >>>>>>>> >> >>>>>>>> >> David >>>>>>>> >> >>>>>>>> >>> Kind Regards >>>>>>>> >>> /Mattis >>>>>>>> >>> >>>>>>>> >>> -----Original Message----- >>>>>>>> >>> From: David Holmes >>>>>>>> >>> Sent: den 29 oktober 2015 13:08 >>>>>>>> >>> To: Coleen Phillimore; >>>>>>>> hotspot-runtime-dev at openjdk.java.net >>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net> >>>>>>>> >>> Subject: Re: RFR: 8027429: Add diagnostic command >>>>>>>> VM.info >>>>>>>> to get >>>>>>>> >>> hserr print-out >>>>>>>> >>> >>>>>>>> >>> On 29/10/2015 10:02 PM, Coleen Phillimore wrote: >>>>>>>> >>>> >>>>>>>> >>>> >>>>>>>> >>>> On 10/29/15 7:58 AM, David Holmes wrote: >>>>>>>> >>>>> On 29/10/2015 9:26 PM, Coleen Phillimore wrote: >>>>>>>> >>>>>> >>>>>>>> >>>>>> So I actually disagree. I don't think there should >>>>>>>> be an >>>>>>>> >>>>>> additional separate mechanism to get the same >>>>>>>> information >>>>>>>> that we >>>>>>>> >>>>>> get with hserr reporting. I've wanted to have a >>>>>>>> feature like >>>>>>>> >>>>>> this for a long time. >>>>>>>> >>>>>> >>>>>>>> >>>>>> I pre-reviewed this change and I thought it looked >>>>>>>> good in >>>>>>>> general. >>>>>>>> >>>>>> I didn't see the thread iteration problem that Fred >>>>>>>> refers to >>>>>>>> >>>>>> below, but I think the individual problems can be >>>>>>>> fixed. >>>>>>>> >>>>>> >>>>>>>> >>>>>> The last thing I want is this code to be copied >>>>>>>> somewhere else. >>>>>>>> >>>>> >>>>>>>> >>>>> Factored as needed not copied. VMError is not an >>>>>>>> "info" >>>>>>>> reporting >>>>>>>> >>>>> mechanism. >>>>>>>> >>>> >>>>>>>> >>>> If you look at the things that are reported in each >>>>>>>> "STEP", >>>>>>>> there's a >>>>>>>> >>>> small amount of code and the order is important. >>>>>>>> >>> >>>>>>>> >>> The order is less important in an "info" request I >>>>>>>> would >>>>>>>> think. >>>>>>>> >>> >>>>>>>> >>>> I'd like the vm info to use the same order and report >>>>>>>> what it >>>>>>>> can do >>>>>>>> >>>> safely. Refactoring 5 lines of code into functions >>>>>>>> doesn't >>>>>>>> make >>>>>>>> >>>> sense. >>>>>>>> >>> >>>>>>>> >>> I need to consider exactly what it is the "info" >>>>>>>> needs to >>>>>>>> report in >>>>>>>> >>> more detail. There are existing facilities (system >>>>>>>> properties, >>>>>>>> >>> management >>>>>>>> >>> APIs) for various bits of runtime information, which >>>>>>>> VMError can't >>>>>>>> >>> utilize but a Dcmd can. >>>>>>>> >>> >>>>>>>> >>> David >>>>>>>> >>> >>>>>>>> >>>> Coleen >>>>>>>> >>>> >>>>>>>> >>>>> >>>>>>>> >>>>> David >>>>>>>> >>>>> >>>>>>>> >>>>>> Coleen >>>>>>>> >>>>>> >>>>>>>> >>>>>> On 10/28/15 8:48 PM, David Holmes wrote: >>>>>>>> >>>>>>> I agree with Fred, this kind of info reporting >>>>>>>> should >>>>>>>> not be >>>>>>>> >>>>>>> piggy-backed onto VMError handling for the reasons >>>>>>>> stated >>>>>>>> (and the >>>>>>>> >>>>>>> error handling logic is complicated enough as it >>>>>>>> is!). For >>>>>>>> things >>>>>>>> >>>>>>> like thread lists there are already safe >>>>>>>> management >>>>>>>> functions that >>>>>>>> >>>>>>> can/should be used. >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> Thanks, >>>>>>>> >>>>>>> David H. >>>>>>>> >>>>>>> >>>>>>>> >>>>>>> On 29/10/2015 3:29 AM, Frederic Parain wrote: >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I haven't review all the code yet, but I'm >>>>>>>> concerned >>>>>>>> with the >>>>>>>> >>>>>>>> fact that the diagnostic command is calling >>>>>>>> VMError::report(). >>>>>>>> >>>>>>>> This method has been implemented to be executed >>>>>>>> in the >>>>>>>> particular >>>>>>>> >>>>>>>> context of fatal errors, and its usage while the >>>>>>>> VM is >>>>>>>> running >>>>>>>> >>>>>>>> normally seems dangerous. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> For instance, VMError::report() consciously >>>>>>>> avoids >>>>>>>> grabbing locks >>>>>>>> >>>>>>>> because of the risk of deadlock during the error >>>>>>>> reporting. >>>>>>>> >>>>>>>> The consequence is that some data structures are >>>>>>>> browsed in an >>>>>>>> >>>>>>>> unsafe way. One example: VMError::report() calls >>>>>>>> >>>>>>>> Threads::printonerror() which iterates over the >>>>>>>> thread list >>>>>>>> >>>>>>>> without owning the Threadslock. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The implementation of the diagnostic command >>>>>>>> seems >>>>>>>> also to >>>>>>>> >>>>>>>> exclude a lot of reporting from the initial >>>>>>>> VMError::report() >>>>>>>> >>>>>>>> method. Have you considered implementing a new >>>>>>>> MT-safe >>>>>>>> reporting >>>>>>>> >>>>>>>> method rather than trying to modify the special >>>>>>>> VMError::report() >>>>>>>> >>>>>>>> methods? (Note that some code factorization >>>>>>>> between >>>>>>>> >>>>>>>> VMError::report() and this new method should be >>>>>>>> possible). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Fred >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 28/10/2015 17:18, david buck wrote: >>>>>>>> >>>>>>>>> Hi! >>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>>> Please review my change for this small >>>>>>>> enhancement. >>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>>> bug: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8027429 >>>>>>>> >>>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~dbuck/8027429jdk901/ >>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>>> Cheers, >>>>>>>> >>>>>>>>> -Buck >>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>> Bertrand Delsart, Grenoble Engineering >>>>>>>> Center >>>>>>>> Oracle, 180 av. de l'Europe, ZIRST de >>>>>>>> Montbonnot >>>>>>>> 38330 Montbonnot Saint Martin, FRANCE >>>>>>>> bertrand.delsart at oracle.com >>>>>>>> <mailto:bertrand.delsart at oracle.com> >>>>>>>> Phone : +33 4 76 18 81 23 >>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>>>>>>>>>>>>> NOTICE: This email message is for the sole use of the >>>>>>>> intended >>>>>>>> recipient(s) and may contain confidential and privileged >>>>>>>> information. Any unauthorized review, use, disclosure or >>>>>>>> distribution is prohibited. If you are not the intended >>>>>>>> recipient, >>>>>>>> please contact the sender by reply email and destroy all >>>>>>>> copies of >>>>>>>> the original message. >>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Frederic Parain - Oracle Grenoble Engineering Center - France Phone: +33 4 76 18 81 17 Email: Frederic.Parain at oracle.com

Previous message: RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV)
Next message: RFR: 8027429: Add diagnostic command VM.info to get hs_err print-out (NEW WEBREV)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the hotspot-runtime-dev mailing list