8152312: ParNew: Restore preserved marks in parallel (original) (raw)

Thomas Schatzl thomas.schatzl at oracle.com
Fri Mar 25 12:56:49 UTC 2016


Hi,

On Thu, 2016-03-24 at 11:36 -0400, Tony Printezis wrote:

Thomas,

Thanks for the feedback. Updated webrev here: http://cr.openjdk.java.net/~tonyp/8152312/webrev.1/ Changes: - Your main comment was whether to make the serial vs. parallel decision in DefNew or in restore(). I liked the suggestion of either passing NULL or the workers to restore(WorkGang*) and deciding in restore(…) what to do. The only problem is that ParallelGC doesn’t use a WorkGang, it uses a GCTaskManager instead.

Drat!

And anything we do in restore(...) would be nice to also take advantage of in ParallelGC. So, I came up with a cunning scheme to do that. Let me know what you think. :-) Yes, it works fine with ParallelGC too (I already implemented/tested the changes, I’ll open them for code review once this is done.) I left a restore() method there temporarily to be used by ParallelGC but it will go away.

Some specialized implementation of the method based on the type of E I guess. Note that we now basically duplicate the restore() method, it gets close to just have two (parallel gc, the rest) implementations of the class. Let's see your solution.

Since I do not know the actual implementation of your scheme, but it would be nice if the code between lines 65 and 68 in preservedMarks.inline.hpp were moved into a method too, particularly if the future implementation also has a single-threaded path.

I would prefer to have the implementation(s) of restore() in the cpp file(s) though (and possibly hide the one for parallel somewhere very far away ;) ). The additional call in this case does not seem to be problematic vs. performance.

- Yes, if each stack has a small number of entries we’re better off doing the restoration serially instead of firing up the workers. This will be a small localized change to what I have in the webrev (we’ll only have to change restore(E*);

One comment here: the comment at

70 // Right now, if the executor is not NULL we do the work in 71 // parallel. In the future we might want to do the restoration 72 // serially, if there's only a small number of marks per stack.

Imo is better placed before line 64, cut a lot to something like:

// Decide on the amount of parallelism.

Please add an RFE in the bug tracker explaining the situation about "in the future" ideas. That would be much better than dumping your thoughts in some random place in the code.

the main question will be what policy to use!). I'll do a follow-up to this too.

The main goal is not to be perfect imo, but to determine a reasonable somewhat conservative upper bound on the amount of threads used. Too many people use too large machines on small problems.

One could argue, that work units < 1ms (or another low value) are not worth distributing. From that one could determine a "reasonable" minimum number of references to process per thread, from which you could then deduct a range of PreservedMarks buffers that each thread could use (assuming that the actual references are evenly distributed across these sets of references).

At least that's what I would start with.

- Ah, yes, I’m glad there’s an atomic add for sizet in JDK 9. :-)

- Re: ParRestoreTask vs. RestoreTask : I tend to add “Par” as the tasks should be safe to use in parallel (even if in practice they are not). Hope that’s OK.

Another reason is that nowadays practically everything done during a STW pause must somehow be implemented to run in parallel to be actually useful (except for serial gc obviously). So this marker is kind of misleading and redundant.

Not really insisting on this, but it may be that some future general cleanup will modify this (nothing actually planned from me).

Thanks, Thomas



More information about the hotspot-gc-dev mailing list