Dorit Nuzman - Re: [PATCH][RFC] Make the function vectorizer capable of doing type tr
This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
- From: Dorit Nuzman
- To: Richard Guenther
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 1 Feb 2007 03:57:22 +0200
- Subject: Re: [PATCH][RFC] Make the function vectorizer capable of doing type transformations
Richard Guenther rguenther@suse.de wrote on 31/01/2007 12:52:58:
...
sure, I have a bunch of testcases that I did not include in the patch yet.
cool
A few small questions/comments:
nargs++;
if (nargs >= 2)
- return false;
}
any inherent problem behind this check, or just restricting (FORNOW?) to the particular function calls you expect to see? (which is fine, just wondering)
It's laziness - but also I don't expect vectorizable calls with more than 2 parameters (I'll add a comment clarifying that).
thanks
Note that all the analysis stuff needs to go to vectorizable_function () (or rather, I am going to merge vectorizable_function and vectorizable_call).
ok
case BUILT_IN_LRINT:
if (out_mode == SImode && out_n == 2
&& in_mode == DFmode && in_n == 2)
- return ix86_builtins[IX86_BUILTIN_CVTPD2PI];
return NULL_TREE;
(I assume you'll have a testcase for each of those?)
Only the BUILT_IN_LRINTF case on i?86 will ever trigger. With, for example, cvtpd2pi, which converts to a sse1 register, the vectorizer does not consider using the V2SI sse1 vector type for the result, so we have the same problem as with cvtpd2dq. I'll leave the cases that don't trigger out for now - they were in for testing, but I didn't manage to get the vectorizer to use V2SI ;)
ok (is that because of the restriction of considering only one vector size?)
- /* Only handle the case of vectors with the same number of elements.
FIXME: We need a way to handle for example the SSE2 cvtpd2dq
instruction which converts V2DFmode to V4SImode but only
using the lower half of the V4SImode result. */
- if (TYPE_VECTOR_SUBPARTS (vectype_in) != TYPE_VECTOR_SUBPARTS (vectype_out))
yes. this requires similar functionality to the one that vectorizes v2di->v4si in vectorizable_demotion, except we need a different idiom instead of the vec_pack/unpack to "convert-and-unpack" 4 doubles (organized in 2 regs) into 4 ints (some target hook maybe?).
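To make the element-count mismatch concrete, here is a minimal sketch of the arithmetic behind the quoted check. It is not code from the patch: `vector_subparts` and `same_subparts_p` are hypothetical helpers, and a single 16-byte vector size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: number of elements that fit in one vector of
   VECTOR_BYTES bytes when each element is ELEM_BYTES wide.  */
size_t
vector_subparts (size_t vector_bytes, size_t elem_bytes)
{
  return vector_bytes / elem_bytes;
}

/* Hypothetical helper modelled on the shape of the quoted check: a
   conversion is accepted only when the input and output vectors hold
   the same number of elements.  */
bool
same_subparts_p (size_t vector_bytes, size_t in_elem_bytes,
                 size_t out_elem_bytes)
{
  return vector_subparts (vector_bytes, in_elem_bytes)
         == vector_subparts (vector_bytes, out_elem_bytes);
}
```

With 16-byte vectors, 8-byte doubles give 2 elements and 4-byte ints give 4, so `same_subparts_p (16, 8, 4)` is false. That is the V2DF -> V4SI situation discussed above.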
We also need to somehow tell the vectorizer that the function call we want to vectorize needs this. Or it might be able to tell by itself seeing a V2DF -> V4SI conversion - I'll look into vec_pack/unpack to see if I can teach the function vectorizer to do it magically.
I think it should be able to tell by itself. let me know if you have problems deciphering the demotion/promotion code.
One other problem is that on x86_64 long is 64 bits, so the prototypes for lrint would require a V2DF -> V2DI conversion, which is also not available (there's only the scalar variant DF -> DI). But I guess that's better handled by recognizing earlier that we have (int)lrint(x) and converting this to an internal si_lrint(x) call.
I see. I wonder what other targets have
Could you please also add a testcase for this (with xfail?)
Yes, I'll do that.
thanks!
dorit
Thanks, Richard.