Canqun Yang - Re: IA64 division code patch with data flow bug in -fweb optimization

This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Hi, Steve

It works, and I got speedups of more than 4% on my scientific benchmarks. Thank you very much. For future work, I think an extended double precision version (XFmode) would also be welcome.

Canqun

--- Steve Ellcey <sje@cup.hp.com> wrote:

I am very interested in this new implementation of division expansion. Would you please tell me how to describe the dependence?

Regards

Canqun

I just submitted a formal patch to gcc-patches with this fix in it. Basically, I just changed the recip_approx_rf instruction from having:

    (set (match_operand:RF 0 "fr_register_operand" "=f")
         (div:RF (const_int 1)
                 (match_operand:RF 3 "fr_register_operand" "f")))

to:

    (set (match_operand:RF 0 "fr_register_operand" "=f")
         (div:RF (match_operand:RF 2 "fr_register_operand" "f")
                 (match_operand:RF 3 "fr_register_operand" "f")))

Neither of these is exactly right, since we set op0 to 1/op3 if the result of op2/op3 wouldn't be inf, NaN, or zero, and to op2/op3 when the result is going to be inf, NaN, or zero. The important part is that op2 shows up as an input to the instruction.
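
To make the dependence on op2 concrete, here is a minimal C sketch of the behaviour described above. It is only a model under stated assumptions: the function name recip_approx_model is hypothetical, it computes an exact reciprocal where the real instruction would return an approximation, and it ignores any other outputs the real instruction has.

```c
#include <math.h>

/* Hypothetical model of the semantics described in the message above.
   Assumption: an exact reciprocal stands in for the hardware's
   approximation; this is not the actual instruction definition. */
static double recip_approx_model(double op2, double op3)
{
    double q = op2 / op3;

    /* Special case: when op2/op3 is inf, NaN, or zero, the result is the
       quotient itself, so the value of op2 matters. */
    if (isnan(q) || isinf(q) || q == 0.0)
        return q;

    /* Ordinary case: the result is the reciprocal of op3 only. */
    return 1.0 / op3;
}
```

With this model, recip_approx_model(3.0, 4.0) yields 0.25 while recip_approx_model(0.0, 4.0) yields 0, so the result cannot be described as a function of op3 alone. A pattern that lists only op3 as an input hides that dependence from the data flow passes.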

Steve Ellcey sje@cup.hp.com

