Add SIMD funnel shift and round-to-even intrinsics by sayantn · Pull Request #142078 · rust-lang/rust (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation40 Commits2 Checks9 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@sayantn

This PR adds 3 new SIMD intrinsics

TODO (future PR): implement simd_fsh{l,r} in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the -n&31 trick used by cg_gcc to implement rotate doesn't work with this because then fshl(a, b, 0) will be a | b)

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics
r? @workingjubilee

@rustbot

r? @wesleywiser

rustbot has assigned @wesleywiser.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-LLVM

Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.

S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

T-compiler

Relevant to the compiler team, which will review and decide on the PR/issue.

T-libs

Relevant to the library team, which will review and decide on the PR/issue.

labels

Jun 5, 2025

@rustbot

folkertdev

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you'd also need to implement this operation in MIRI.

@RalfJung

TODO: implement these in cg_gcc and cg_clif

Please add Miri to the list. :) (Doesn't have to be in the first PR.)

@folkertdev

Decide: Should we allow signed integers in simd_fsh{l,r}?

Both s390x and avx512 do (want to) use the funnel shifts on signed types. We can use transmutes there of course, but I'd say we may as well also support signed types. It's already documented that this is a logical shift (i.e. that it always shifts in zeros)

@antoyo

@antoyo @bjorn3 some help about implementing fsh{l,r} will be appreciated

You should be able to have more or less the same implementation for cg_gcc by putting the code in this file. The complicated part is to find the equivalent intrinsic names for GCC.

workingjubilee

@workingjubilee

Decide: Should we add fshr and fshl to normal int types as well? They seem useful enough (cc @rust-lang/libs-api)

If you want to block this PR on submitting a T-libs-api ACP, you can do that, but I don't see why you would.

@sayantn

@antoyo @bjorn3 some help about implementing fsh{l,r} will be appreciated

You should be able to have more or less the same implementation for cg_gcc by putting the code in this file. The complicated part is to find the equivalent intrinsic names for GCC.

I only found the intrinsic name for x86 as __builtin_ia32_vpsh{l,r}dv_vNiM (from your implementation of llvm.fsh{l,r}), should I go ahead and add it, or do I need to check for other archs as well?.

Decide: Should we add fshr and fshl to normal int types as well? They seem useful enough (cc @rust-lang/libs-api)

If you want to block this PR on submitting a T-libs-api ACP, you can do that, but I don't see why you would.

I don't think this should blocked on adding these to int types, just thought this might be a nice place to discuss it. I will go ahead and submit an independent ACP for that

@antoyo

I only found the intrinsic name for x86 as __builtin_ia32_vpsh{l,r}dv_vNiM (from your implementation of llvm.fsh{l,r}), should I go ahead and add it, or do I need to check for other archs as well?.

It seems GCC doesn't have an arch-independent builtin for this, so I guess it's better for this PR to not implement the intrinsic for cg_gcc.
I'll eventually come up with a general solution for this kind of issues.

@sayantn

@rustbot

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@sayantn

I have implemented simd_round_ties_even in miri, cg_clif and cg_gcc (because it seemed to be easy), and allowed signed integer types for simd_fsh{l,r}

@rustbot

The Miri subtree was changed

cc @rust-lang/miri

RalfJung

workingjubilee

Comment on lines 144 to 161

pub unsafe fn simd_fshl<T>(a: T, b: T, shift: T) -> T;
/// Funnel Shifts vector right elementwise, with UB on overflow.
///
/// Concatenates `a` and `b` elementwise (with `a` in the most significant half),
/// creating a vector of the same length, but with each element being twice as
/// wide. Then shift this vector right elementwise by `shift`, shifting in zeros,
/// and extract the least significant half of each of the elements. If `a` and `b`
/// are the same, this is equivalent to an elementwise rotate right operation.
///
/// `T` must be a vector of integers.
///
/// # Safety
///
/// Each element of `shift` must be less than `::BITS`.
#[rustc_intrinsic]
#[rustc_nounwind]
pub unsafe fn simd_fshr<T>(a: T, b: T, shift: T) -> T;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I have been thinking and I can't get over the bit where the current mnemonic sounds like it has something to do with floats in my head. I think we might want to call out these names as simd_funnel_shl and simd_funnel_shr? I looked around and we get names like __funnelshift_lc and __funnelshift_rc for similar intrinsics (e.g. CUDA), and I know that some languages actually do support shifts on floats, with... sometimes interesting semantics.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, simd_funnel_sh{l,r} sounds nice enough (with the scalar functions as funnel_sh{l,r}). I will change the name

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the names. Btw @workingjubilee as you seem to be interested, should I r? you?

@sayantn sayantn changed the titleAdd simd_fsh{l,r} and simd_round_ties_even Add SIMD funnel shift and round-to-even intrinsics

Jun 6, 2025

@bors

@bors

@sayantn

@sayantn

@sayantn

@workingjubilee does this need any more changes, or is it ready to be merged now?

@workingjubilee

I think we're good.

@bors r+ rollup

@bors

📌 Commit 2ffa1dd has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Jun 16, 2025

@workingjubilee

@bors bors added S-waiting-on-author

Status: This is awaiting some action (such as code changes or more information) from the author.

and removed S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

labels

Jun 16, 2025

workingjubilee

Comment on lines 2306 to +2315

// Unary integer intrinsics
if matches!(
name,
sym::simd_bswap | sym::simd_bitreverse sym::simd_ctlz sym::simd_ctpop sym::simd_cttz
sym::simd_bswap
| sym::simd_bitreverse
| sym::simd_ctlz
| sym::simd_ctpop
| sym::simd_cttz
| sym::simd_funnel_shl
| sym::simd_funnel_shr

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

funnel shift isn't exactly unary, now is it? probably should update the comment and admit this is our junk drawer of "everything else" right now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though on line 2388 there is an actual binary operation on SIMD integers

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I just thought it would be nice to bunch them with other integer intrinsics (like an integer equivalent of simd_simple_float_intrinsic). If you want I can also add simd_saturating_{add,sub} to this list, but the annoying thing is that they use InvalidMonomorphization::ExpectedVectorElementType (which is badly named imo) for some reason, where all others use InvalidMonomorphization::UnsupportedOperation, so it would change some ui test messages if I do it. Should I do it, or just change the comment for now?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these intrinsic error types are poorly named -- feel free to adjust the names (and messages) if it makes sense. :) Just please don't spend too much complexity on this: stable code cannot even "see" these checks, so it's not worth having a lot of code to make them extra nice.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't want to do that in this PR. I will send another PR that will refactor intrinsic.rs. For now, @workingjubilee I have changed the comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. Hi!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't see the comment change but I don't want to delay this further because yes it should land before a refactor. ^^;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it seems like I just forgot to push it 🤦‍♂️

@workingjubilee

@bors

📌 Commit 2ffa1dd has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-author

Status: This is awaiting some action (such as code changes or more information) from the author.

labels

Jun 27, 2025

bors added a commit that referenced this pull request

Jun 29, 2025

@bors

Rollup of 4 pull requests

Successful merges:

r? @ghost @rustbot modify labels: rollup

rust-timer added a commit that referenced this pull request

Jun 29, 2025

@rust-timer

Rollup merge of #142078 - sayantn:more-intrinsics, r=workingjubilee

Add SIMD funnel shift and round-to-even intrinsics

This PR adds 3 new SIMD intrinsics

TODO (future PR): implement simd_fsh{l,r} in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the -n&31 trick used by cg_gcc to implement rotate doesn't work with this because then fshl(a, b, 0) will be a | b)

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics r? @workingjubilee

RalfJung pushed a commit to RalfJung/miri that referenced this pull request

Jun 29, 2025

@bors

github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request

Jul 4, 2025

@GuillaumeGomez

…jubilee

Add SIMD funnel shift and round-to-even intrinsics

This PR adds 3 new SIMD intrinsics

TODO (future PR): implement simd_fsh{l,r} in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the -n&31 trick used by cg_gcc to implement rotate doesn't work with this because then fshl(a, b, 0) will be a | b)

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics r? @workingjubilee

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request

Jul 18, 2025

@GuillaumeGomez

…jubilee

Add SIMD funnel shift and round-to-even intrinsics

This PR adds 3 new SIMD intrinsics

TODO (future PR): implement simd_fsh{l,r} in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the -n&31 trick used by cg_gcc to implement rotate doesn't work with this because then fshl(a, b, 0) will be a | b)

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics r? @workingjubilee

bjorn3 pushed a commit to bjorn3/rust that referenced this pull request

Nov 8, 2025

@GuillaumeGomez

…jubilee

Add SIMD funnel shift and round-to-even intrinsics

This PR adds 3 new SIMD intrinsics

TODO (future PR): implement simd_fsh{l,r} in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the -n&31 trick used by cg_gcc to implement rotate doesn't work with this because then fshl(a, b, 0) will be a | b)

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics r? @workingjubilee

Labels

A-intrinsics

Area: Intrinsics

A-LLVM

Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.

F-core_intrinsics

Issue in the "core intrinsics" for internal usage only.

S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

T-compiler

Relevant to the compiler team, which will review and decide on the PR/issue.

T-libs

Relevant to the library team, which will review and decide on the PR/issue.