loop customization: a key challenge

Remi Forax forax at univ-mlv.fr
Fri Sep 14 00:05:24 PDT 2012


On 09/14/2012 06:22 AM, John Rose wrote:

> On Sep 11, 2012, at 2:09 AM, Aleksey Shipilev wrote:

>> On 09/10/2012 11:13 PM, John Rose wrote:
>>> The methods strongly hint to implementors and users that bind and
>>> findVirtual + bindTo perform the obvious devirtualization.
>>
>> I haven't been following jsr292 development recently. Is that kind of
>> hint already favored by Hotspot? I would like to try to do this
>> conversion by hand and see if it helps some of our benchmarks here.

> Yes, findVirtual followed by bindTo routinely devirtualizes the handle.
> The JDK 8 version of this logic is DirectMethodHandle.maybeRebind, a
> private method that replaces a "virtual" or "interface" reference by a
> "special" one.
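
As a concrete illustration of that answer, here is a minimal sketch of
the findVirtual + bindTo idiom (the Kernel interface and the surrounding
class are invented for illustration, not code from the thread):

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BindToExample {
    // Invented example interface; any interface with a virtual method works.
    interface Kernel { int apply(int x); }

    public static void main(String[] args) throws Throwable {
        Kernel doubler = x -> x * 2;

        // Look up Kernel.apply as a virtual (here: interface) method...
        MethodHandle virtualApply = MethodHandles.lookup().findVirtual(
                Kernel.class, "apply",
                MethodType.methodType(int.class, int.class));

        // ...and bind the receiver.  The bound handle no longer needs virtual
        // dispatch, so the JDK can replace the "virtual"/"interface" reference
        // by a "special" (direct) one; that is the devirtualization described
        // above.
        MethodHandle bound = virtualApply.bindTo(doubler);

        System.out.println((int) bound.invokeExact(21));   // prints 42
    }
}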

>>> Of course, loop customization does not really appear to be a case of
>>> devirtualization… unless perhaps you treat each loop superstructure as
>>> a defender method on the loop kernel interface.  Then L1X is really
>>> X.L1, and the JVM optimization framework can swing into action,
>>> cloning L1 into each receiver type X.  So that's a MH-free way to
>>> think about it.
>>
>> Yes, it appears that your suggestion applies to whatever
>> "superstructure" there is in the code. I would like to highlight that
>> the obvious demarcation point for such the superstructure is the
>> method call, and we can probably spare some of the syntactical pain
>> and hard-core conspiracy from the library developers. I.e. if there is
>> a way to say
>>
>>    class Arrays {
>>       <T> void apply(T[] array, @PleaseSpecialize UnaryOperator<T> op) {
>>           // blah-blah, apply
>>       }
>>       ...
>>    }
>>
>> ...albeit being more limiting in "superstructure" sense, it is more
>> clear than explicitly writing up jsr292 magics. Of course, in this
>> sense, we can even try to desugar this to jsr292 with the conversion
>> outlined by John, e.g. into:
>>
>>    class Arrays {
>>       <T> void apply(T[] array, @PleaseSpecialize UnaryOperator<T> op) {
>>           MH superOp = #apply$$Internal.bindTo(op);
>>           superOp.invoke(array);
>>       }
>>
>>       <T> void apply$$Internal(UnaryOperator<T> op, T[] array) {
>>           // blah-blah, apply
>>       }
>>    }

> Method calling is not the natural place to do this specialization, and
> this is true even though (in a different sense) almost everything is
> done with method calls.
>
> In the example, putting the bindTo operation next to the invoke
> operation forces it to happen just before each bulk request.  This is
> OK for a one-shot API, but is probably not general enough to build up
> big frameworks like fork-join.
>
> The reason for this is that combining a superstructure with a kernel is
> logically distinct from executing the combined result, and in general
> needs to be specified separately from the execution.  So we need a
> notation for a "combination request" which is distinct from and prior
> to the "invocation request".
>
> Representing this combination request via an annotation on a named
> method obliges designers to give a name to every point where the
> combination is requested.  This is unnatural in about the same way as
> requiring every "break" or "continue" to have a label.  (Or, every
> closure.)
>
> The action of combining a superstructure (I want a better name here!)
> with a kernel (this is a standard name) should of course be expressed
> as an operator or method.  This operator or method must be kept
> distinct from the action of executing the combination.  It could be
> MH::bindTo, or (probably better) it should be something more explicit.
>
> Alternatively, instead of an operator, it could be some kind of closure
> expression, one which makes it clear what are the roles of
> superstructure and kernel.  The compiler and runtime would manage the
> caching of combined forms, at the point where the closure expression
> was implemented.
>
> (As an alternative, the combination can be made automagic, as an
> implicit preparation to invocation.  That's basically what an inlining
> JIT compiler does.  But my big point in all of this is that the user
> and/or library writer probably needs to help the system with hints
> about the various aspects of combination.)
>
> I hope this helps.  I realize it is fairly vague.  The problem is
> completely real, though.
>
> — John
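
To make John's distinction between a "combination request" and an
"invocation request" concrete, here is a minimal sketch (the names
LoopCombiner, applyLoop and the ConcurrentHashMap cache are invented for
illustration and do not come from the thread): the superstructure and
the kernel are combined and cached once, and the cached handle is then
invoked repeatedly.

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntUnaryOperator;

public class LoopCombiner {
    // The "superstructure": a loop that applies a kernel to every element.
    static void applyLoop(IntUnaryOperator kernel, int[] array) {
        for (int i = 0; i < array.length; i++) {
            array[i] = kernel.applyAsInt(array[i]);
        }
    }

    private static final MethodHandle APPLY_LOOP;
    static {
        try {
            APPLY_LOOP = MethodHandles.lookup().findStatic(
                    LoopCombiner.class, "applyLoop",
                    MethodType.methodType(void.class,
                                          IntUnaryOperator.class, int[].class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The "combination request": combine superstructure and kernel once,
    // cache the result, and keep it distinct from any invocation.
    private static final Map<IntUnaryOperator, MethodHandle> COMBINED =
            new ConcurrentHashMap<>();

    static MethodHandle combine(IntUnaryOperator kernel) {
        return COMBINED.computeIfAbsent(kernel, APPLY_LOOP::bindTo);
    }

    public static void main(String[] args) throws Throwable {
        IntUnaryOperator doubler = x -> x * 2;
        MethodHandle doubleAll = combine(doubler);   // combination request
        int[] data = {1, 2, 3};
        doubleAll.invokeExact(data);                 // invocation request
        doubleAll.invokeExact(data);                 // reuse, no re-combination
        System.out.println(java.util.Arrays.toString(data));  // [4, 8, 12]
    }
}

Whether such a cache is keyed by kernel instance, by kernel class, or
managed by the compiler and runtime is exactly the design question John
raises above.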

Here is a modest benchmark of different ways to implement the combination of map/reduce/forEach, etc., for lambdas, using iterators, Brian's pipeline, Rich Hickey's combiners and method handles: https://github.com/forax/lambda-perf

I'm not a pro at writing benchmarks, so this one may be flawed. Also, there is no point in trying it with jdk7, or with jdk8 before b56, because the method handle tests will fall into the well-known perf hole that was fixed recently.

As John said, the combination is fully explicit and there is no caching at all: a call to reduce or forEach in a loop will create the method handle blob again and again.
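
To picture what that means, here is a minimal sketch (invented names,
not code from the lambda-perf repository) in which the combination is
redone on every call:

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntConsumer;

public class UncachedForEach {
    static void forEachLoop(IntConsumer consumer, int[] array) {
        for (int value : array) {
            consumer.accept(value);
        }
    }

    // No caching: the combination (findStatic + bindTo) is redone on every
    // call, so the method handle blob is created again and again.
    static void forEach(int[] array, IntConsumer consumer) throws Throwable {
        MethodHandle loop = MethodHandles.lookup().findStatic(
                UncachedForEach.class, "forEachLoop",
                MethodType.methodType(void.class, IntConsumer.class, int[].class));
        MethodHandle combined = loop.bindTo(consumer);
        combined.invokeExact(array);
    }

    public static void main(String[] args) throws Throwable {
        int[] data = {1, 2, 3};
        for (int i = 0; i < 3; i++) {
            forEach(data, System.out::println);   // rebuilds the handle each time
        }
    }
}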

John, Christian, when asking for the assembly code, I was not able to find a version that combines the whole method handle blob into a single method, as is done when invokedynamic is used. Is that the way it's supposed to work, or did I miss something?

cheers, Rémi


