JEP 248: Make G1 the Default Garbage Collector
charlie hunt charlie.hunt at oracle.com
Mon Jun 1 15:25:10 UTC 2015
Hi Ben,
A couple things to keep in mind.
1.) The impact of this JEP is limited to those Java applications that currently do not set a GC explicitly at the JVM command line. I think this is important to keep in mind as a starting point. A data point you may be able to help with is a survey of applications that do not set a GC explicitly as a JVM command line option. I recall only seeing one Java application over the last 15 years that did not set a GC as a JVM command line option, and it was a Java GUI app.
2.) This JEP’s intent is not to replace the throughput collector (Parallel[Old]GC). The same applies to CMS GC. Folks who want to use, and do use, the throughput collector or CMS GC can still use them.
3.) One might argue that if a GC is not explicitly specified at the JVM command line, then tuning GC may not be important for that application. In the event an application that does not set a GC explicitly at the command line experiences an observable performance regression, it would be able to restore its performance by setting -XX:+UseParallelOldGC on the JVM command line.
To summarize, this JEP is about what GC to use when none is specified at the JVM command line. Hence its impact is limited to those configurations.
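For anyone unsure whether a given application falls into this category, a rough check is possible from inside the JVM. The sketch below is illustrative only: the class name `ActiveCollectors` and its helper methods are made up for this example, and the flag list is an assumption covering the common HotSpot selection flags, not an exhaustive one. It lists the collectors the running JVM actually chose and reports whether one of those flags was passed on the command line.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ActiveCollectors {
    // Assumed list of common HotSpot collector-selection flags (not exhaustive).
    private static final Set<String> GC_FLAGS = new HashSet<>(Arrays.asList(
            "-XX:+UseSerialGC",
            "-XX:+UseParallelGC",
            "-XX:+UseParallelOldGC",
            "-XX:+UseConcMarkSweepGC",
            "-XX:+UseG1GC"));

    /** Returns true if any of the given JVM arguments is a known GC selection flag. */
    static boolean selectsCollector(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (GC_FLAGS.contains(arg)) {
                return true;
            }
        }
        return false;
    }

    /** Names of the collectors the running JVM reports through its management beans. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Collectors in use: " + collectorNames());
        System.out.println("GC set explicitly: " + selectsCollector(jvmArgs));
    }
}
```

If this reports that no GC was set explicitly, the application is one of those whose behavior would change with the default.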
To me it becomes a question of how many Java applications do not set an explicit GC at the command line, and how many of those want peak throughput performance with little concern for (occasional high) latency? This is a question I think the community could help us with.
hths,
charlie
On Jun 1, 2015, at 9:42 AM, Ben Evans <ben at jclarity.com> wrote:
Hi Vitaly,
(I've added hotspot-dev back on to the To: line as I think it's important this discussion is had in public).
In general, Mark has outlined a design philosophy for the platform that is conservative, and where, if features are not ready, then they are slipped to the next major release. Features shouldn't be rushed or releases delayed, instead production quality features should be shipped when done.
So, to my mind, this issue comes down to whether the proposed benefit is such that it outweighs the risks of changing the behaviour of millions upon millions of installations. We don't have any systematic data (which I argue should be a huge red flag in itself), and the experience of consultants and performance engineers, including Kirk and myself, is not exactly encouraging. So, does this change really justify the risk?
I would also question the conclusion that all we can organise before Java 10 is: "some reports from the field". For Java 8, the community was able to engage with a pretty good group of F/OSS libraries & help them to test on betas of 8, so they (& their users) could have confidence that they would "just work" with 8 straight out of the box. I see no reason why a similar approach could not work for G1 becoming default - we can approach relevant partners in the ecosystem (e.g. Cloudbees, Blazemeter, etc) and see if they can help, and we can directly reach out and get people testing with G1. However, there is an issue of timing and available resources here - there's a lot going on for JDK 9 as it is, and I don't know how easy it would be to get this programme running as well.
Finally, the other issue that I'd like to address is that of scope creep. I'd always been under the impression that G1 was thought of as the CMS replacement. However, (and admittedly a lot of the systems I see are either financial or gaming) in its current state there is no way that G1 is a general replacement for CMS. The pauses for G1 are simply too long for a big class of low-latency systems. Instead, G1 is now being talked of as a replacement for the default collector. If that's the case, then I think we need to acknowledge it, and have a conversation about where G1 is actually supposed to be used. Are we saying we want a "reasonably high throughput with reduced STW, but not low pause time" collector? If we are, that's fine, but that's not where we started.
Thanks,
Ben
On Mon, Jun 1, 2015 at 3:05 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
Kirk,
I don't dispute that some people aren't tuning/touching the GC controls, and may get negatively impacted (but perhaps positively too). My main point, however, is I don't see waiting until java 10 as adding sufficient safety guards; certainly there will be more lab time and benchmarking at oracle, some reports from the field but inevitably there will be unknown workloads in the wild that still don't work well even after more "due diligence". If G1 is truly the successor to CMS, kicking the can further down the road isn't helping achieve that. Anyone seeing a regression has an easy way to opt out. Any such change will always weed out some outliers, java 9, 10 or 15. The longer we wait, the harder it may be to fix some of them.
sent from my phone
On Jun 1, 2015 9:43 AM, "Kirk Pepperdine" <kirk at kodewerk.com> wrote:
Hi Vitaly,
Ben has only re-iterated what I’ve already said but in a more concise way. And, I don’t mean to be insulting but I don’t really buy into the argument that people will be specifying a collector anyways because there are still a significant number that use the parallel collector. In fact, just today, I recommended that someone move away from G1 to the parallel collector as that use case clearly favored the recommendation. And I should add, I’ve now backed a number of deployments off of tiered-compilation as IME it is impacting performance in a negative way.
Regards,
Kirk
On Jun 1, 2015, at 3:05 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
Ben,
The customers using CMS won't be impacted since they're explicitly specifying the GC. Java 9 will already require extensive testing for people, and GC performance is luckily one of the more introspectable facilities. Furthermore, people who are keen on staying with the default collector should/can lock that in before moving to Java 9 since presumably there will be enough visibility of this change in release notes and such.
Personally, I find changing default JIT compilation policy to tiered in java 8 a more risky change, but I don't recall seeing such fervor around it :).
sent from my phone
On Jun 1, 2015 6:37 AM, "Ben Evans" <ben at jclarity.com> wrote:
Hi,
I'm somewhat late to this, having missed the original discussion whilst travelling. Mark targeted this JEP to JDK 9 but has since put that on hold to allow more discussion.
I made this comment to Mark on jdk9-dev:
"I have been working with G1 for ~5 years, ever since it was experimental (& highly crash-prone in JDK 6). In the intervening time, I have seen dozens (if not hundreds) of installations, across a wide range of customers. I have participated in, or been consulted on at least a dozen direct trials of GC alternatives. It is only in the last 18 months that I have seen any real-life workload on G1 beat the alternatives, and only in the last 12 months that I've had any customer prepared to go live with G1 in production.
From my experience, I think that G1 is a fine collector, with a bright future that should be pursued. However, I haven't seen anything that would make a switch to it as default collector seem compelling in the JDK 9 timeframe.
Obviously, my experience is not universal, so I'd like to ask you / Oracle:
1) Can you explain the survey methodology and customer testing that you performed to arrive at the conclusion that G1 is ready to become default?
2) Can you share aggregate results of the surveying ("We worked with X customers and ran Y tests of G1 vs alternatives, and in Z% of cases, G1 worked better by W margin")?
3) Can you ask some of the customers you worked with to speak publicly about the trials you ran with them?"
From reading this thread, am I right to conclude that no formal study of this issue has been done? If that's the case, then are we really happy to make G1 default without some more systematic efforts and attempts to obtain actual numbers?
The questions that I'd like to see answered are:
a) How short a pause time can G1 support being tuned to? 50ms? 20? Personally, I haven't seen it getting close to CMS in terms of STW time.
b) What is the impact on throughput due to G1?
I do like G1 as a collector, but can we really organise enough field tests in the pre-9 timeframe to justify such a large and potentially breaking change? We managed to do some good community compatibility testing for JDK 8, and we could think about a similar effort for "make G1 default". However, with modules, HTTP/2 and JShell all happening for 9, I question whether there is simply enough community bandwidth to do a decent effort for G1 as well, whereas, if we were targeting JDK 10 we'd have a lot more time to plan and to try to improve the quality and range of the field data to hopefully de-risk a potential large, high-profile failure.
Thanks,
Ben
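One way to start collecting the kind of numbers asked for here is to read the JVM's own GC counters. The following is a minimal sketch, not a benchmark harness: `GcOverhead` and its methods are hypothetical names chosen for this example, and `getCollectionTime()` is only an approximate accumulated figure, with what counts toward it varying by collector. It reports the share of JVM uptime attributed to GC, so that two runs of the same workload with different collectors can be compared.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Fraction of elapsed time spent in GC, given both totals in milliseconds. */
    static double overheadFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0; // avoid dividing by zero right after startup
        }
        return (double) gcMillis / uptimeMillis;
    }

    /** Sums the approximate accumulated collection time over all collector beans. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = bean.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gcTime, uptime, 100.0 * overheadFraction(gcTime, uptime));
    }
}
```

This only addresses the throughput question; individual pause times would still have to come from GC logs.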
On Thu, Apr 30, 2015 at 2:55 PM, Monica Beckwith <monica at beckwithclan.com> wrote:
I am also FOR the change in the default GC. Charlie and Mattis bring up great points. It's about time G1 gets put out there (as the default GC) since most of the development work is going into G1.
As for documentation, we not only need to document the change in the default collector but also the collector's defaults that are enabled as soon as G1 is employed, e.g. MaxGCPauseMillis, IHOP, etc. With more and more input coming in, G1 is only going to get better and hopefully more adaptive :)
And as for Charlie's question - I don't remember the last time that I didn't see an explicit GC mentioned on the command line (even if it was the default GC).
These are just my two cents.
-Monica
On 4/30/15 8:17 AM, charlie hunt wrote:
Fwiw, we should not forget that anyone who is currently specifying an explicit GC to use in his or her JVM command line args will not experience any difference in behavior. They will still get the collector they specify to use. The (potential) impact will be on those who do not specify a GC to use.
What I would like to hear from Kirk and others who frequently work with customers on GC is: what’s the percentage of Java applications they have worked with that do not explicitly specify a GC? And, of those, what percentage of those apps fall into the categories of small heap and desire low latency, or desire high throughput even at the cost of frequent full GCs?
thanks,
charlie
On Apr 30, 2015, at 7:27 AM, Mattis Castegren <mattis.castegren at oracle.com> wrote:
Hi.
I also work with customers but I would like to give an argument FOR changing the default.
I don't think we will ever come to a point where G1 is better for ALL users. Even with a near perfect G1 implementation there may be cases where the parallel collector gives better throughput.
Right now, I think G1 will be better for most users.
There are probably also corner cases where G1 COULD be better, but where small issues reduce performance. By changing the default to G1, we will be able to find these more easily, as we will expose more users to G1.
Finally, there will be a set of users who only care about throughput, and who will see a performance regression. In those cases, they can go back to using parallel. But hopefully, there will be far fewer users who need to tune their application to run with parallel GC than there are users who have to (or should) tune their application to run with G1.
In the case of huge, business critical, applications, we will always introduce a risk by changing default collectors. This is true if we change to G1 in JDK 9, 10 or 11. I prefer to just rip the band aid off. We know that the collector we will focus on going forward is G1, so we should let as many people use it as possible.
Of course we should document this a lot, so that users who go up to JDK 9 and see performance regressions can at least try to run with Parallel to see if it is due to the GC.
Kind Regards
/Mattis
-----Original Message-----
From: Kirk Pepperdine [mailto:kirk at kodewerk.com]
Sent: 30 April 2015 13:18
To: Stefan Johansson
Cc: hotspot-dev at openjdk.java.net Source Developers
Subject: Re: JEP 248: Make G1 the Default Garbage Collector
Hi Stefan,
Indeed, the improvements have been amazing. I have been getting many clients to bench with it and although the results have been mixed, overall many have been able to move forward. However I still would not recommend G1 to anyone who can't move to 1.8.0_40. Of course this change will obviously come post 40 but still, the recent emergence of the G1 as a viable production ready collector suggests that making it a default may be a wee bit optimistic.
"The change is based on the assumption that limiting latency is often more important than maximizing throughput. If this assumption is incorrect then this change might need to be reconsidered."
I would agree with this assumption. In most cases latency is more important. However G1 doesn't always provide lowest latency especially in smaller heaps.
"G1 is seen as a robust and well-tested collector. It is not expected to have stability problems, but becoming the default collector will increase its visibility and may reveal previously-unknown issues."
I'm not sure it's prudent to treat the entire Java eco-system as guinea pigs. I believe it's more prudent to have the willing take that first step rather than have it unwittingly dropped on everyone.
At the end of the day, I don't have any say in any of this (as it should be). All I can do is let you know what I'm seeing through my straw with the hope that you'll find the information useful.
From what I see, there is not nearly enough experience in tuning the G1, and that is especially true in the general population, to make this type of change at this point in time. I'm also not sure that we have all the tuning options we need to ensure "happy apps" in the wild. For example, I think the incremental accumulated waste in tenured regions is a problem that I'm not sure we have the tools to solve. I'm not even sure if it's a recognized problem. In fact I'm not even sure it's a real problem as at the moment it's only a theory based on observations I'm making by looking at numbers of GC logs produced by applications using recent releases of the G1.
I would suggest that making Tiered the default config for 8 was also a bit premature. I've had a number of clients have to roll back off of it.
- Kirk
On Apr 29, 2015, at 3:03 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
Hi Kirk,
A lot of effort is put into G1, it has been continuously improving over the last couple of years and we now believe that G1 is ready to become the default. G1 will not improve all use cases, but the same is true for the other collectors.
For users where throughput is the main concern, Parallel GC can still be used by specifying -XX:+UseParallelGC on the command-line.
Regards,
Stefan
On 2015-04-29 09:10, Kirk Pepperdine wrote:
Hi all,
Is the G1 ready for this? I see many people moving to G1 but also I'm not sure that we've got the tunables correct. I've been sorting through a number of recent tuning engagements and my conclusion is that I would like the collector to be aggressive about collecting tenured regions at the beginning of a JVM's life time but then become less aggressive over time. The reason is the residual waste that I see left behind because certain regions never hit the threshold needed to be included in the CSET. But, in aggregate, the regions in this state do start to retain a significant amount of dead data. The only way to see the effects is to run regular Full GCs, which of course you don't really want to do. However, the problem seems to settle down a wee bit over time, which is why I was thinking that being aggressive about what is collected in the early stages of a JVM's life should lead to better packing and hence less waste. Note, I don't really care about the memory waste, only its effect on cycle frequencies and pause times.
Sorry but I don't have anything formal about this as I (and I believe many others) am still sorting out what to make of the G1 in prod. Generally the overall results are good but sometimes it's not that way up front and how to improve things is sometimes challenging.
On a side note, the move to Tiered in 8 has also caused a bit of grief. Metaspace has caused a bit of grief and even parallelStream, which works, has come with some interesting side effects. Everyone has been so enamored with Lambdas (rightfully so) that the other stuff has been completely forgotten and some of it has surprised people. I guess I'll be submitting a talk for J1 on some of the field experience I've had with the other stuff.
Regards,
Kirk
On Apr 28, 2015, at 11:02 PM, mark.reinhold at oracle.com wrote:
New JEP Candidate: http://openjdk.java.net/jeps/248
- Mark
-- Ben Evans, Co-founder jClarity @jclarity