proposal: runtime: add a mechanism for specifying a minimum target heap size · Issue #23044 · golang/go
I propose that we add a GC knob (either an environment variable or a function in runtime/debug) which allows users to set the minimum target heap size for the garbage collector.
For now I'll call this setting GOGCMIN pending a better name.
Right now, the one GC tunable available to users is GOGC. From the runtime documentation:
The GOGC variable sets the initial garbage collection target percentage. A collection is triggered when the ratio of freshly allocated data to live data remaining after the previous collection reaches this percentage. The default is GOGC=100.
The idea is that when the target heap size is calculated based on live data and GOGC, if that target is less than GOGCMIN, then GOGCMIN is used instead.
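To make the rule concrete, here is a minimal sketch of the calculation as I understand it. It is not the runtime's actual pacing code: `targetHeap` and its `gogcMin` parameter are hypothetical names for illustration, and the goal formula `live * (1 + GOGC/100)` is only the simplified version implied by the documentation quoted above.

```go
package main

// targetHeap sketches the proposed rule. live is the live heap (in bytes)
// left after the previous collection, gogc is the GOGC percentage, and
// gogcMin is the hypothetical GOGCMIN floor in bytes (not an existing knob).
// The sketch assumes gogc >= 0; GOGC=off is not handled.
func targetHeap(live uint64, gogc int, gogcMin uint64) uint64 {
	// Simplified goal: live data plus gogc percent of fresh allocation.
	target := live + live*uint64(gogc)/100
	if target < gogcMin {
		// The computed goal is below the floor, so the floor wins.
		return gogcMin
	}
	return target
}
```

With a 40 MiB live heap, GOGC=100, and a 1 GiB floor, the goal would be 1 GiB rather than 80 MiB. Once the live heap is large enough that the GOGC-derived goal exceeds the floor, the setting has no effect.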
The problem
It has often been noted that programs which make a lot of allocations while maintaining a small live heap end up doing excessive garbage collections.
At my company, we've noticed this a number of times. It's typically a problem with data processing applications which might read and write messages from queues at a high rate, yet keep very little data live over the long term.
We've had to address CPU usage due to such excessive garbage collections for at least three separate applications. Here are two real situations we observed:
- 29 GB of system memory; 40 MB heap; 500 MB/sec allocation; 30 collections/sec
- 480 GB of system memory; 100 MB heap; 700 MB/sec allocation; 12 collections/sec
In all cases, we would prefer that the application use a lot more memory in order to do fewer collections.
Existing workarounds
GOGC
The knob that's available for controlling this situation is GOGC, described above. When we've come across these issues in the past, we've set high GOGC values and that largely fixes the problem, at least in the short term.
Unfortunately, this is a fragile fix. We don't actually care about the GOGC ratio; we want to target a particular heap size. So if we have a 40 MB heap, we might back into a GOGC value like 1000 or even 10000 in order to target a 400 MB or 4 GB heap size, respectively. With large GOGC values, the application is extremely sensitive to small increases in the live heap size: if our 40 MB heap increases to 100 MB (not a large jump), then our 4 GB target becomes 10 GB.
In fact, we recently had crashes with one application where we set GOGC=1200 many months ago when its typical heap size was a few hundred MB. The live data size increased to several GB and then the service started OOMing.
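The fragility is easy to see with a little arithmetic. The sketch below uses the same simplified goal formula as the GOGC documentation, with sizes in MB to match the numbers above. `gogcFor` and `heapGoalMB` are hypothetical helper names, not runtime APIs.

```go
package main

// gogcFor "backs into" the GOGC value that makes a live heap of liveMB
// produce a heap goal of targetMB, assuming goal = live * (1 + GOGC/100).
// It assumes targetMB >= liveMB and liveMB > 0.
func gogcFor(liveMB, targetMB uint64) int {
	return int((targetMB - liveMB) * 100 / liveMB)
}

// heapGoalMB returns the heap goal in MB for a given live heap and GOGC.
func heapGoalMB(liveMB uint64, gogc int) uint64 {
	return liveMB + liveMB*uint64(gogc)/100
}
```

Under these assumptions, `gogcFor(40, 4000)` gives 9900, and with that setting `heapGoalMB(100, 9900)` is 10000: a 60 MB increase in live data moves the goal from 4 GB to 10 GB.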
SetGCPercent
One way to address the shortcomings of GOGC is to dynamically adjust its value. This is possible by using runtime/debug.SetGCPercent.
We tried a solution that involved a long-running goroutine in every application watching memory use (via runtime.ReadMemStats) and adjusting SetGCPercent.
There are at least two problems with this approach:
- We can't react to arbitrarily fast increases in heap size. If the heap is very small and we set the GC percent very high, we are prone to OOMing if the heap grows suddenly before we adjust it again. (And we don't want to make the adjustments too frequently because reading the memory stats is not cheap.)
- As far as I can tell, the live heap size from the end of the previous collection is not exposed by runtime.MemStats. (I think it's only available by parsing gctrace output.) This is the number we really need in order to accurately pick a GOGC value.
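For reference, the approach looks roughly like the following. This is a simplified sketch rather than our production code: `percentFor` and `tuneGCPercent` are illustrative names, and `MemStats.HeapAlloc` is used as a stand-in for the live heap even though it also counts garbage that hasn't been collected yet (which is exactly the second problem above).

```go
package main

import (
	"runtime"
	"runtime/debug"
	"time"
)

// percentFor picks a GC percent that aims the heap goal at targetBytes,
// given an estimate of the live heap. It never returns less than 100, so a
// heap that is already near or above the target gets the default behavior.
func percentFor(liveEstimate, targetBytes uint64) int {
	if liveEstimate == 0 || targetBytes <= 2*liveEstimate {
		return 100
	}
	return int((targetBytes - liveEstimate) * 100 / liveEstimate)
}

// tuneGCPercent re-derives the GC percent every interval until stop is
// closed. Between ticks the setting is stale, so a sudden heap increase can
// overshoot the target before the next adjustment.
func tuneGCPercent(targetBytes uint64, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	var ms runtime.MemStats
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			// ReadMemStats stops the world, which is why polling
			// frequently is unattractive.
			runtime.ReadMemStats(&ms)
			debug.SetGCPercent(percentFor(ms.HeapAlloc, targetBytes))
		}
	}
}
```

An application would start this with something like `go tuneGCPercent(400<<20, 10*time.Second, stop)`; the interval is a tradeoff between reaction time and the cost of polling.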
Heap ballast
We eventually settled on an awful workaround: we have a long-running goroutine manage a set of dummy allocations (ballast). When the heap is small, the ballast is large; if the live heap reaches the target size, the ballast shrinks to zero.
We can't pick the ballast as accurately as we would like because, again, we need the live heap size from the previous GC cycle. But by using the total non-ballast heap size as a conservative proxy, the solution works well enough. In particular, by keeping GOGC at a normal level (usually 100), we aren't subject to the heap size spike issue.
Obviously this isn't a great solution for the long term since it wastes memory that could otherwise be used for something else (like disk cache).
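The sizing logic of the ballast is roughly the following. This is a simplified sketch with hypothetical names (`ballastSize`, `resizeBallast`, `floorBytes`), assuming the floor is expressed as a minimum live heap size, so that with GOGC=100 the resulting heap goal is about twice the floor.

```go
package main

// ballast is the dummy allocation. Keeping it in a package-level variable
// keeps it reachable, so the GC counts it as live data.
var ballast []byte

// ballastSize returns how many bytes of ballast are needed so that the
// non-ballast heap plus ballast reaches floorBytes. nonBallastHeap is the
// conservative proxy for live data described above (total heap minus the
// current ballast).
func ballastSize(floorBytes, nonBallastHeap uint64) uint64 {
	if nonBallastHeap >= floorBytes {
		// The real heap has reached the floor; no padding is needed.
		return 0
	}
	return floorBytes - nonBallastHeap
}

// resizeBallast replaces the ballast with a fresh allocation of n bytes.
// The old ballast becomes garbage and is reclaimed by a later collection.
func resizeBallast(n uint64) {
	ballast = make([]byte, n)
}
```

A long-running goroutine would periodically call `resizeBallast(ballastSize(floor, proxy))`, with the proxy derived from `runtime.ReadMemStats`.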
Related discussions
A related idea is to have a mechanism for limiting the max heap size (see #16843 and other linked discussions). However, I believe that a min size is a much simpler problem to solve since it doesn't require application coordination (backpressure).
This is also related to the old issue #9067.
I'm happy to give this the full proposal document treatment if that's useful. It seems like a simple idea that doesn't necessarily need it.
Based on my limited understanding of the garbage collector, this would be easy to implement and wouldn't add much complexity to the GC.
/cc @aclements @RLH