Request for reviews (L): 7063628: Use cbcond on T4 (original) (raw)
Tom Rodriguez tom.rodriguez at oracle.com
Wed Jul 20 09:58:03 PDT 2011
- Previous message: Request for reviews (L): 7063628: Use cbcond on T4
- Next message: Request for reviews (S): 7067288: compiler regression test Test7052494 timeouts with client VM
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Don't forget to clear the temporary label:
n->as_Mach()->label_set(fakeL, 0);
Otherwise this looks good.
tom
On Jul 15, 2011, at 7:21 PM, Vladimir Kozlov wrote:
I updated changes based on suggestions:
http://cr.openjdk.java.net/~kvn/7063628/webrev vs previous webrev: http://cr.openjdk.java.net/~kvn/7063628/webrev.diff Restored original code for macroassembler branch instructions. Added new version of these instructions for short distance which can use cbcond. They have short name suffix to indicate that. New instructions generate nop in delay slot if cbcond can't be used and they don't have annul parameter (not needed). Use original instructions if not nop instruction is needed in delay slot, I filed RFE 7067794 which will allow to avoid it. brzero() code is restored and it is renamed to cmpzeroandbr(). For short distance cmpandbrshort(reg, 0, equal, pt, L) is used instead of brzero(). Restored original code in bronregcond() instruction since in most cases it has filled delay slot. And I filed RFE 7066841 to replace this instruction since I think it is not safe. Thanks, Vladimir Tom Rodriguez wrote: On Jul 14, 2011, at 4:43 PM, Vladimir Kozlov wrote:
Thank you, Tom
Tom Rodriguez wrote: assemblersparc.hpp: If we wanted to be clever about the delay slot issue we could add an function like maybedelayed() that would emit the instruction in a side buffer and place it in the right place for these instructions. Basically you could emit it as if it were not delayed and then move it into the delay slot if you actually have one. So instead of: __ brnull(G3scratch, false, _ pt, skipfixup, false);_ __ delayed()->ldptr(G5method, inbytes(methodOopDesc::interpreterentryoffset()), G3scratch);_ you do _ maybedelayed()->ldptr(G5method, inbytes(methodOopDesc::interpreterentryoffset()), G3scratch); __ brnull(G3scratch, false, _ pt, skipfixup);_ It's a little tweaky but it would allow cbcond to be used in more places. I will file RFE for that.
You need some comments explaining when the instruction emits the delay slot for you. I would think this should only be done for names which are clearly not real instruction names. I don't like the extra argument that controls whether the delay slot is emitted or not. It's a bit mysterious in a casual reading. Maybe the false version should be nodelay? I'm not sure how I'd want to this work but it seems a bit subtle at a first glance. Originally I thought to add second version of macroassembler branch instructions with suffix short. Then old instructions will stay the same and short will generate cbcond or branch with delayed nop. But then I decided to go with passing additional flag. But I see what you mean, may be I should go to the original idea. I think being a little more explicit is better. Get rid of the emitdelayednop stuff in ba. It's just a confusing way to emit "br(always, false, pt, L)" so callers should just use that directly. Maybe instead of being called ba is should bralways so it's clear that it's a macro instruction. You can then leave the existing ba alone. I am also using emitdelayednop flag to control when cbcond could be emited. For the example bellow *updatedone is in far distance but it is forward branch and during ba() generation the distance is unknown. In such case the distance set to 0 and cbcond will be emitted. I will leave the original ba and add new bashort. Oh I see. Yes, a bashort to parallel jccb would make sense. Is there benefit to a short form in other cases? cbc doesn't match the docs which use cbcond to refer to them generically. It's also slightly odd that the real instructions are called cwb and cxb but we don't have any way to emit those directly but I guess the generic forms take care of it. I'm not a huge fan of the instructions that take CC as an argument. I might prefer to see cwb, cxb and cbptr to cover the ptrcc case but given the limited number of uses I can live with what you've done. I will rename cbc to cbcond. Why did you remove the Condition argument from brzero? It can work with the new encoding. I think it is confusing to have name brzero and have conditions. And in most cases it was indeed check for zero. With condition we need different name, I think: cmpzeroandbr? I guess it was confusingly named. cmpzeroandbr sounds good. c1LIRAssemblersparc.cpp: Why isn't this: ! _ ba(false, *updatedone); ! _ ba(*updatedone, false); __ delayed()->nop();_ simply __ ba(updatedone);_ That actually applies to a bunch of other places too. I answered it above: emitdelayednop flag controls when cbcond is emited. sharedRuntimesparc.cpp: I think the annulling here is wrong or at least meaningless. ! _ cmpandbrx(tempreg, G5inlinecachereg, Assembler::equal, true, Assembler::pt, L); __ brx(Assembler::equal, true, Assembler::pt, L);_ __ delayed()->nop();_ Why? This code trys to not execute nop if branch is not taken. Annulling doesn't skip the execution, it just keeps the side effects from occurring. In either case it's meaningless when applied to a nop. It was probably just a copy/paste from some other place where it was meaningful. tom cmpandbrx shouldn't even have an annul argument. Actually none of these new macro instructions should. Agree. Thanks, Vladimir compile.cpp: + if (n->isBranch() && n->asMach()->idealOpcode() != OpJump) { + // Fake label for branch instruction. + MacroAssembler masm(&buf); + Label fakeL; + masm.bind(fakeL); + n->asMach()->labelset(fakeL, 0); + } What guarantees do we get about the lifetime of the storage for fakeL since it doesn't appear to be used? I would assume that at a minimum it should be declared outside of the if. _You can clean up all the "l ? (l->locpos() - (cbuf.instssize() + 1)) : 0;" code in x86.ad after this fix too. We can even use the MacroAssembler for these instrutions now if we want to though that's probably for another day._ Should labelOper::label have an assert that it's non-null when called? That would cover all the other asserts you added though I'm not sure it must be true. The overall change looks good. tom On Jul 13, 2011, at 8:44 PM, Vladimir Kozlov wrote: http://cr.openjdk.java.net/~kvn/7063628/webrev
Fixed 7063628: Use cbcond on T4 Added new MacroAssembler instructions cmpandbr() and old branch instructions are modified to use cbcond on T4 if distance is small (2K bytes). Most of the rest changes are usage of these new branch instructions in Interpreter and C1. The prototype was done by Tom and I took some of his additional fixes. The formssel.cpp change is a bug fix where the MatchNode equality test wasn't recursing so it would mistakenly return true for complex matches. Added a fake label for branches generated in temp buffer by MachNode::emitsize(), added assert into .ad files to check label. There was problem in isinwdisp16range() (and in initial implementation of usecbc()) which calls target(L) before emitting branch. Non-bound (forward branch) labels record current pc() as a branch address and later try to patch instruction (which could be 'cmp') at that address. Removed unused code in checkklasssubtypefastpath(). Modified vmversion string: has prefix is removed and only v8 or v9 will be printed as well only niagaraplus or niagara. Tested on T4 with CTW, nsk, VM and java invoke regression tests.
- Previous message: Request for reviews (L): 7063628: Use cbcond on T4
- Next message: Request for reviews (S): 7067288: compiler regression test Test7052494 timeouts with client VM
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the hotspot-compiler-dev mailing list