Uros Bizjak - Re: [PATCH, i386]: Fix PR target/30970, take 2
This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
- From: Uros Bizjak
- To: Ian Lance Taylor
- Cc: GCC Patches , Richard Henderson
- Date: Tue, 27 Feb 2007 20:02:07 +0100
- Subject: Re: [PATCH, i386]: Fix PR target/30970, take 2
- References: <45E42C83.1000602@gmail.com> <m3bqjfd47d.fsf@localhost.localdomain>
Ian Lance Taylor wrote:

> > The addition to this patch is corrected MODES_TIEABLE_P functionality
> > for i386 targets. The problem is, that lower-subreg pass checks
> > MODES_TIEABLE_P if the XMM register can be splitted into word_mode
> > (DImode) _without_copying_.
>
> Anyhow, your patch seems OK to me, though I wonder whether you will
> get worse register allocation in some cases for code which extracts
> the first float from a vector of floats. Of course those cases are
> not very common.

Extracting a float from a vector of floats, or an int from a vector of ints, without going through memory has never worked for i386. One possible reason for the float case is that movss reg,reg doesn't clear the top three elements, but _mm_store_ss() simply ignores this, as shown in the following testcase:
--cut here--
#include <xmmintrin.h>

float test (__m128 x)
{
  float a;
  _mm_store_ss (&a, x);

  return a + 1.0;
}
--cut here--
compiles to:
test:
.LFB509:
        addss   .LC0(%rip), %xmm0
        ret
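
The movss behaviour mentioned above can be observed from C. What follows is only a minimal sketch, assuming an x86 target with SSE enabled; the helper names `move_low`, `load_low` and `elem` are made up for illustration and are not part of the patch. `_mm_move_ss` models the register-to-register form, which keeps the top three elements of its first operand, while `_mm_load_ss` models the load from memory, which zeroes them.

```c
#include <xmmintrin.h>

/* Hypothetical helper: replace the low element of A with the low
   element of B.  The result is { b[0], a[1], a[2], a[3] }, i.e. the
   top three elements are NOT cleared.  */
static __m128
move_low (__m128 a, __m128 b)
{
  return _mm_move_ss (a, b);
}

/* Hypothetical helper: load one float from memory into the low
   element.  The result is { *p, 0, 0, 0 }.  */
static __m128
load_low (const float *p)
{
  return _mm_load_ss (p);
}

/* Hypothetical helper: read element I (0..3) of V by storing the
   whole vector to memory first.  */
static float
elem (__m128 v, int i)
{
  float r[4];

  _mm_storeu_ps (r, v);
  return r[i];
}
```

With a = { 1, 2, 3, 4 } and a scalar 9, `move_low` yields { 9, 2, 3, 4 }, whereas `load_low` on a pointer to 9 yields { 9, 0, 0, 0 }.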
For the int case, we could use movd, but gcc doesn't generate it with or without the patch. This is the testcase:
--cut here--
typedef float __v4sf __attribute__ ((vector_size (16)));

float testf(__v4sf x, __v4sf y)
{
  union { __v4sf v; float f[4]; } u;

  u.v = x + y;
  return u.f[0];
}

typedef int __v4si __attribute__ ((vector_size (16)));

int testi(__v4si x, __v4si y)
{
  union { __v4si v; int i[4]; } u;

  u.v = x + y;
  return u.i[0];
}
--cut here--
The result:
testf:
.LFB2:
        addps   %xmm1, %xmm0
        movaps  %xmm0, -24(%rsp)
        movss   -24(%rsp), %xmm0
        ret

testi:
.LFB3:
        paddd   %xmm1, %xmm0
        movaps  %xmm0, -24(%rsp)
        movl    -24(%rsp), %eax
        ret
(Using the _mm_store_ss() intrinsic always produces equivalent code.)

To fix this, MODES_TIEABLE_P would need to be revisited. I'll open a bug report for this enhancement.

Uros.
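
For anyone who wants to experiment with the first-element extraction discussed in this message, the same two operations can also be written with the SSE/SSE2 intrinsics instead of a union. This is only a sketch, assuming an x86 target with SSE2 enabled; the function names `first_float` and `first_int` are made up for illustration. `_mm_cvtsi128_si32` is the intrinsic that corresponds to movd, but which instructions a given gcc version actually emits for it is not something this message establishes.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_add_ps, _mm_cvtss_f32 */
#include <emmintrin.h>  /* SSE2: __m128i, _mm_add_epi32, _mm_cvtsi128_si32 */

/* Hypothetical counterpart of testf: add two float vectors and
   return element 0 of the sum.  */
static float
first_float (__m128 x, __m128 y)
{
  return _mm_cvtss_f32 (_mm_add_ps (x, y));
}

/* Hypothetical counterpart of testi: add two int vectors and
   return element 0 of the sum.  */
static int
first_int (__m128i x, __m128i y)
{
  return _mm_cvtsi128_si32 (_mm_add_epi32 (x, y));
}
```

Compiling this with -O2 -S and comparing the output against the union testcase shows whether the store to the stack is still present.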
- Follow-Ups:
  - Re: [PATCH, i386]: Fix PR target/30970, take 2
    * From: Andrew Pinski
- References:
  - [PATCH, i386]: Fix PR target/30970, take 2
    * From: Uros Bizjak
  - Re: [PATCH, i386]: Fix PR target/30970, take 2
    * From: Ian Lance Taylor