RFR (M): 8077144: Concurrent mark initialization takes too long (original) (raw)
Thomas Schatzl thomas.schatzl at oracle.com
Mon Mar 7 23:33:05 UTC 2016
- Previous message (by thread): RFR (M): 8077144: Concurrent mark initialization takes too long
- Next message (by thread): RFR (M): 8077144: Concurrent mark initialization takes too long
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Kim,
thanks for your comments.
On Mon, 2016-03-07 at 17:21 -0500, Kim Barrett wrote:
> On Mar 4, 2016, at 4:52 AM, Thomas Schatzl <_ _> thomas.schatzl at oracle.com> wrote: > > CR: > https://bugs.openjdk.java.net/browse/JDK-8077144 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8077144/webrev/ > Testing: > jprt, vm.gc testlist
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.hpp 381 void allocatelargebitmap(BitMap* bitmap, BitMap::idxt const sizeinbits); const for the sizeinbits here is superfluous.
Done.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 600 // First-time allocation of the bitmaps ensures that they are zeroed out. There 601 // is no need to clear them and the liveness count data manually.
This comment seems misplaced, or even unnecessary.
The intent of this comment has been to avoid questions why the data structures are not initialized here. Removing it.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2145 SizeTFlagSetting fs(ArrayAllocatorMallocLimit, newarrayallocatormalloclimit);
--- TBD: Are we sure we're still single-threaded here? Is there a risk of futzing with ArrayAllocatorMallocLimit affecting some other thread? Don't bother figuring this out until the need for the modification is discussed. Wouldn't it be better to give ArrayAllocator a new API that specifically requests mmap or adjusts the use-malloc/mmap boundary, rather than this highly roundabout mechanism with potentially non-local effect? Perhaps a constructor argument that changes the behavior of shouldusemalloc.
Yes, I was thinking about this right today, particularly because I need edthat functionality outside of G1ConcurrentMark today anyway.
My idea has been to add something like this to the alloc/realloc methods, but I am fine with the constructor.
I am sure that at that time we are only single-threaded btw.
A separate issue is some of the uses of FlagSetting and friends have nothing to do with "flags" in the globals.hpp sense, and seem semantically suspect. Maybe an RFE here?
Filed JDK-8151412, I hope this is what you meant.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2144 sizet const newarrayallocatormalloclimit = MIN2((sizet)os::vmallocationgranularity() * 100, ArrayAllocatorMallocLimit);
I wonder why vmallocationgranularity returns int rather than an unsigned type? Doing anything about that is out of scope for this change, but unfortunately leads toward the cast. Maybe an RFE here?
Filed JDK-8151413.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2147 ArrayAllocator<BitMap::bmwordt, mtGC> allocator(false);
The use of ArrayAllocator with false freeindestructor introduces cleanup problems that are not being handled. All the new (in this change set) and existing uses of that feature appear to be discarding the information needed to free the allocated memory when done with it, because they aren't capturing the distinction between malloc'ed vs. mmap'ed. And that feature is being used and then handing the memory off to a BitMap, which could itself attempt to free the memory (if resize were to be called, which hopefully never happens in the cases at hand)
It does not.
Originally I had an extension of the BitMap::resize() method that automatically handled this. However this introduced more complexity there, and potential regressions in behavior, so I created this helper method in G1ConcurrentMark.
I guess I will see to extending BitMap after all.
and have no clue about how to do so.
A possible way to deal with this would be to have an ArrayAllocator support swap (even better would be C++11 move, but...) and swap the external allocator with the one in the BitMap, so that when the BitMap is destroyed it will properly deal with the deallocation. This is a pre-existing issue, with the current change introducing a new occurrence of the problem.
This is a known issue.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2166 allocatelargebitmap(®ionbm, (BitMap::idxt)(g1h ->maxregions()));
Why cast the maxregions value? I think currently there is no problem (looks like uint => sizet). We might someday turn on things like -Wconversion, in which case this cast would simply mask a (potential) problem if a problem exists then.
Fixed.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2167 allocatelargebitmap(&cardbm, alignsizeup(g1h ->reservedregion().bytesize(), CardTableModRefBS::cardsize) / CardTableModRefBS::cardsize);
Abuse of alignsizeup macro, which is supposed to be for use in places where a constant expression is required, with alignsizeup function being preferred. Unfortunately, then one runs afoul of the not very useful signature for alignsizeup. We could spend a lot of time discussing shortcomings in that vicinity. Maybe an RFE here?
You mean, in the vicinity of align_size_up? I agree. What kind of RFE do you suggest here? Do you mind filing an RFE here because apparently you have lots of issues in mind...
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2169 countcardbitmaps = NEWCHEAPARRAY(BitMap, maxworkerid, mtGC);
Extra space before maxworkerid.
Fixed.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2176 countmarkedbytes = Padded2DArray<sizet,_ _mtGC>::createunfreeable(maxworkerid, g1h->maxregions(), false);
The call to createunfreeable appears to be incorrect. The third argument is a sizet* to the allocationsize. This call should be implicitly converting false to a null pointer, and leaving the new forcepretouch argument with its default true value. This should have been caught by gcc -Wconversion-null, which is enabled by default and I don't see any disabling of it in our build system. I'm puzzled as to how this builds? What am I missing here?
You were looking at an old version of the change where this has not been fixed. Sorry. The code in the local version looks as follows:
size_t dummy; _count_marked_bytes = Padded2DArray<size_t, mtGC>::create_unfreeable(_max_worker_id, _g1h->max_regions(), &dummy, false);
This is the only difference to the version in the webrev.
--------------------------------------------------------------------- --------- src/share/vm/gc/g1/g1ConcurrentMark.cpp 2193 os::pretouchmemory(countmarked, countmarked + (sizet)G1CollectedHeap::heap()->maxregions() * sizeof(sizet));
Another unnecessary cast of maxregions.
Fixed.
--------------------------------------------------------------------- --------- src/share/vm/memory/allocation.hpp 748 // Indicates whether the last (re-)allocation used the C-heap malloc or virtual space. 749 bool usemalloc() const { return usemalloc; }
I think usedmalloc would be a better name for a query; usemalloc seems like a request/command.
Fixed.
--------------------------------------------------------------------- --------- src/share/vm/utilities/bitMap.cpp 66 os::pretouchmemory((char*)wordaddr(0), (char*)wordaddr(size()));
Regarding the casts, here and in new G1PretouchInternalBitmapsTask, why does pretouchmemory want char* rather than void*? Maybe an RFE here? There are a few existing calls that could be cleaned up too.
Potentially because at the time the call had been introduced, char* fit. Filed JDK-8151414.
--------------------------------------------------------------------- --------- test/gc/g1/Test2GbHeap.java 28 * Skip test on 32 bit Windows: it typically does not support the many and large virtual memory reservations needed. 29 * @requires (vm.gc == "G1" | vm.gc == "null") & ((sun.arch.data.model != "32") | (os.family != "windows"))
jtreg supports multiple @requires options, implicitly and'ing them together (*). Splitting the @requires here would make it more readable. And could the second clause be written as the following? !((sun.arg.data.model = "32") & (os.family = "windows"))
Fixed.
Thanks again for your review. I will prepare a new webrev later.
Thanks, Thomas
- Previous message (by thread): RFR (M): 8077144: Concurrent mark initialization takes too long
- Next message (by thread): RFR (M): 8077144: Concurrent mark initialization takes too long
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]