RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside (original) (raw)
Vladimir Kozlov vladimir.kozlov at oracle.com
Mon Feb 10 11:22:47 PST 2014
- Previous message: RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside
- Next message: RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 2/10/14 4:19 AM, Rickard Bäckman wrote:
Vladimir,
thanks for your suggestions. Here is an updated webrev with the changes: http://cr.openjdk.java.net/~rbackman/8027754.3/
You missed to replace rax with general register in instructions which generate CMP. These instructions does not kill the register.
In x86_64.ad format for overflowMulL_rReg_imm() misses op2.
I asked to move can_overflow() method before IdealHelper template definition in mathexactnode.cpp. Why you kept them at the end of file?
Why you did not replace ConINode::make(phase->C, 0)?
Nothing was done with (i1 == NULL || i2 == NULL) check.
Unfortunately DOMAIN is defined to 1 by math.h so I named my new field in type.hpp to TypeInt::TYPEDOMAIN and TypeLong::TYPEDOMAIN. Not quite as pretty, hope it works. I'll file a RFE for unifying rax/eax and eflags/rflags.
Thanks, Vladimir
Thanks /R On 02/05, Vladimir Kozlov wrote: Hi Rickard,
About .ad cahgnes. I talked with Igor and he said that you have to use specific register when the effect is USEKILL. I agree with that. But when you don't need to KILL or TEMP, please use general register. Second. Mach instructions names, for example addofIrReg for OverflowAddI, are not clear. I would read addofI as 'add of integer'. I think we should use full name to match ideal node name: overflowAddIrReg. Third. It would be nice to have these new mach instructions in one place in x86.ad to avoid duplication. The only differences now between x8632.ad and x8664.ad changes are eAXRegI vs raxRegI and rFlagsReg vs eFlagsReg names. rRegI was unified sometime ago. Since we can't push anything anyway this week. Can you file RFE and rename eAXRegI, eFlagsReg and may be other similar things in x8632.ad to match x8664.ad? After that your code can be put in x86.ad. Thanks, Vladimir On 2/4/14 4:42 PM, Vladimir Kozlov wrote: I forgot to say about tests. Rickard, you don't need /othervm if flags are not used. The test will run in the same VM as jtreg tool in such case and will complete faster.
Thanks, Vladimir On 2/4/14 4:26 PM, Vladimir Kozlov wrote: Rickard,
I like new implementation because it is more clean solution (no specialized checks for FlagsProj all over our code). About changes in .ad files. Why you use specifically rax instead of normal reg? Before it was required because of projection nodes matching. Now it is not needed. You can skip affects like next because they are default: effect(DEF cr, USE op1, USE op2); New formats are missing tab "\t". For OverflowMul with immediate the temp register should be declared as next (in such case you do need to use specific register): instruct mulofIrRegimm(rFlagsReg cr, rRegI op1, immI op2, raxRegI tmp) %{ match(Set cr (OverflowMulI op1 op2)); effect(TEMP tmp); format %{ "imull tmp,tmp, tmp,op1, $op2 #overflow check int" %} insencode %{ __ imull($tmp$$Register, op1op1op1$Register, op2op2op2$constant);_ %} inspipe(ialuregregalu0); %} The question is do you want to kill an other register (rax) or you are fine if you kill the input one (op1)?: instruct mulofIrRegimm(rFlagsReg cr, rRegI op1, immI op2) %{ match(Set cr (OverflowMulI op1 op2)); effect(USEKILL op1); format %{ "imull op1,op1, op1,op1, $op2 #overflow check int" %} insencode %{ __ imull($op1$$Register, op1op1op1$Register, op2op2op2$constant);_ %} inspipe(ialuregregalu0); %} Not big deal but: in OverflowINode class missed 'virtual' keyword in Ideal() declaration. Missed 'virtual' for Value() in both classes. In OverflowNode sub() is also virtual. Why you put OverflowNode() constructor out-of-line? It is empty and could be defined in head file. mathexactnode.cpp: extra unneeded '()' and space: 46 if ( (((value1 ^ result) & (value2 ^ result)) >= 0)) { Can you move canoverflow() methods after willoverflow() methods and before IdealHelper definition? They use the same templates. Why OverflowSubLNode::canoverflow() does not have (in(1) == in(2)) check? I think you can use phase->intcon(0) instead of ConINode::make(phase->C, 0) in IdealHelper::Ideal() as you did in librarycall.cpp. In IdealHelper::Value() the check (i1 == NULL || i2 == NULL) should be before singleton() check otherwise you will reference through NULL in non-singleton case. It would be nice to have the same check in IdealHelper::Ideal(). ConINode::make(phase->C, 0); In OverflowNode::sub() use fatal() instead of ShouldNotReachHere() to print for what node it is called: fatal(errmsgres("sub() should not be called for '%s'", NodeClassNames[this->Opcode()])); Thanks, Vladimir On 2/4/14 2:00 AM, Rickard Bäckman wrote: Igor,
what about DOMAIN? Also looking for an additional Reviewer. Thanks /R
On 01/31, Igor Veresov wrote: librarycall.cpp: You forgot to remove this: 208 template 209 bool inlinemathoverflow(bool isunary); type.hpp: Come to think of it again bottom() is not really a bottom. I can’t come up with good name for it. Perhaps SIGNED ? And define it as static constant: static const TypeInt *SIGNED = INT; and static const TypeLong *SIGNED = LONG; to follow the existing convention? Otherwise looks excellent. igor
On Jan 31, 2014, at 10:22 AM, Rickard Bäckman <rickard.backman at oracle.com> wrote:
Seem to have pasted the wrong link, try this one: http://cr.openjdk.java.net/~rbackman/8027754.2/ /R On 01/31, Christian Thalinger wrote: On Jan 31, 2014, at 3:44 AM, Rickard Bäckman <rickard.backman at oracle.com> wrote: Igor, I changed the calls in librarycall.cpp according to your suggestions. The same with mathexactnode.cpp. Also did some additional cleanup there. Made willoverflow and canoverflow pure virtual in the base classes. The sub method is required as OverflowNode inherits from CmpNode. Updated webrev: http://cr.openjdk.java.net/~rbackman/8028997.2/ That webrev seems to not have all changes.
Thanks /R On 01/30, Igor Veresov wrote: Hi Rickard, In librarycall.cpp: May be it’s just me but I’d get rid of bool LibraryCallKit::inlinemathoverflow(bool isunary) and reference the arguments explicitly. Otherwise it feels like too much depth to follow through when reading the code(since we keep specialized functions like inlinemathaddExactI() anyway). Something like: bool LibraryCallKit::inlinemathaddExactI(bool isincrement) { return inlinemathoverflow(argument(0), isincrement ? intcon(1) : argument(1)); } Which will also allow you to avoid adding the IsLong enum. In mathexactnode.cpp: I’d move the AddHelper, SubHelper, MulHelper to the cpp file and reference them directly in a more verbose way. Something like: bool OverflowAddINode::willoverflow(jint v1, jint v2) const { return AddHelper::willoverflow(v1, v2); } Otherwise the reader has to go and find out what OverflowHelper means in a particular context. This will also reduce the amount of implementation specifics in the hpp.
Is there a particular reason why you define willoverflow() and canoverflow() to do ShoundNotReachHere() is the base classes instead of making them pure virtual and let the compiler make sure that you define them? I guess there is an expectation that there are going to be overflow nodes that don’t redefine these methods but are there really going to be any? You also define const Type* sub(const Type* t1, const Type* t2) const as ShouldNotReachHere() and never redefine it. What are your plans for it? igor
On Jan 30, 2014, at 5:10 AM, Rickard Bäckman <rickard.backman at oracle.com> wrote: Igor, thanks for looking at this and the suggestions. I added some templates to reduce the amount of code. Webrev: http://cr.openjdk.java.net/~rbackman/8027754.1/ Thanks /R On 01/23, Igor Veresov wrote: Nice! In librarycall.cpp: Could the LibaryCallKit::inlinemath*() family of functions be factored with templates to shave a few lines? There is quite a lot of common code. I think the overflow idiom insertion can be something like: template<typename OperationNodeType, template OverflowNodeType> void LibraryCallKit::inlineoverflow(Node* arg1, Node* arg2) { Node* op = gvn.transform(new(C) OperationNodeType(arg1, arg2)); Node* of = gvn.transform(new(C) OverflowNodeType(arg1, arg2)); inlinemathmathExact(op, of); } In mathexactnode.cpp: You’ve already commoned many things up by introducing OverflowINode and OverflowLNode in hierarchy. But it feels like some of the code there could factored up as well using template helpers. In many cases the code looks exactly the same for ints and longs, differing only in some types. igor On Jan 23, 2014, at 3:27 AM, Rickard Bäckman <rickard.backman at oracle.com> wrote: Hi all, this change is going to 9 (and backporting to 8u20). Can I please have this change reviewed? The implementation of different j.l.Math.mathExact() didn't work very well with different optimizations because there was one node that generated both control and data. This change has a new implementation where each call to j.l.Math.mathExact() generates a Overflow node and a normal math operation node (in the integer add example: OverflowAddINode and a AddINode). The Overflow node is responsible for generating control. In the end we generate assembly like: mov rdx, rdi add rdx, rsi ... mov rax, rdi add rax, rsi jo With one add instruction for the data and one for flags. Future improvements could be to try to match the Overflow and the math operation and remove one of them. Bug: https://bugs.openjdk.java.net/browse/JDK-8027754 Webrev: http://cr.openjdk.java.net/~rbackman/8027754/ Thanks /R /R
- Previous message: RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside
- Next message: RFR(M): 8027754: Enable loop optimizations for loops with MathExact inside
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the hotspot-compiler-dev mailing list