`_mm512_shrdv*` intrinsics have incorrect argument order
I tried this code:
```rust
#![feature(stdarch_x86_avx512)]
use std::arch::x86_64::*;

fn main() {
    unsafe {
        let a = _mm512_set1_epi32(0xffff);
        let b = _mm512_setzero_epi32();
        let c = _mm512_set1_epi32(1);
        let dst = _mm512_shrdv_epi32(a, b, c);
        println!("{}", _mm512_cvtsi512_si32(dst));
    }
}
```
I expected to see this happen:
The code produces the same output as the equivalent C program:
```c
#include <immintrin.h>
#include <stdio.h>

int main() {
    __m512i a = _mm512_set1_epi32(0xffff);
    __m512i b = _mm512_setzero_epi32();
    __m512i c = _mm512_set1_epi32(1);
    __m512i dst = _mm512_shrdv_epi32(a, b, c);
    printf("%u\n", _mm512_cvtsi512_si32(dst));
}
```
The program outputs 32767.
Instead, the Rust program outputs -2147483648.
Intel's documentation for `_mm512_shrdv_epi32` (as linked from the function's rustdoc) states:
> Concatenate packed 32-bit integers in b and a producing an intermediate 64-bit result. Shift the result right by the amount specified in the corresponding element of c, and store the lower 32-bits in dst.
```
FOR j := 0 to 15
    i := j*32
    dst[i+31:i] := ((b[i+31:i] << 32)[63:0] | a[i+31:i]) >> (c[i+31:i] & 31)
ENDFOR
dst[MAX:512] := 0
```
meaning argument b supplies the upper 32 bits and a the lower 32 bits. However, `llvm.fshr.*` takes its operands in the opposite order: the first operand provides the most significant bits of the concatenated value. It appears Rust passes the arguments to `llvm.fshr` as a, b, c in that order, which swaps the two halves.
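As a cross-check, one lane of `_mm512_shrdv_epi32` can be modeled in scalar Rust straight from Intel's pseudocode; swapping the first two arguments then reproduces the value the Rust program printed. (`shrdv32_lane` is a hypothetical helper name, not part of any library.)

```rust
// Scalar model of one 32-bit lane of _mm512_shrdv_epi32, following
// Intel's pseudocode: b supplies the upper 32 bits, a the lower 32 bits.
fn shrdv32_lane(a: u32, b: u32, c: u32) -> u32 {
    let concat = ((b as u64) << 32) | (a as u64);
    (concat >> (c & 31)) as u32
}

fn main() {
    // With the documented order this matches the C program's output.
    let correct = shrdv32_lane(0xffff, 0, 1);
    assert_eq!(correct, 32767);

    // Swapping a and b models the wrong operand order handed to
    // llvm.fshr, and reproduces the Rust program's output.
    let swapped = shrdv32_lane(0, 0xffff, 1) as i32;
    assert_eq!(swapped, -2147483648);

    println!("correct = {correct}, swapped = {swapped}");
}
```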
This likely also applies to all similar intrinsics that call llvm.fshr.
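Since the whole family follows the same pattern with only the lane width changing, a single generic scalar model can serve as a reference for checking the 16-, 32-, and 64-bit variants against Intel's pseudocode. This is a sketch with a hypothetical name, not library code:

```rust
// Generic scalar model of one shrdv lane, per Intel's pseudocode:
// b supplies the upper BITS bits, a the lower BITS bits, and the
// shift count is masked to the lane width (c & (BITS - 1)).
fn shrdv_lane<const BITS: u32>(a: u128, b: u128, c: u32) -> u128 {
    let mask = (1u128 << BITS) - 1;
    let concat = ((b & mask) << BITS) | (a & mask);
    (concat >> (c & (BITS - 1))) & mask
}

fn main() {
    // 32-bit lane: matches the C output from the example above.
    assert_eq!(shrdv_lane::<32>(0xffff, 0, 1), 32767);
    // 64-bit lane: the low bit of b is shifted into the top of the lane.
    assert_eq!(shrdv_lane::<64>(0, 1, 1), 1u128 << 63);
    // 16-bit lane.
    assert_eq!(shrdv_lane::<16>(0x00ff, 0x0001, 4), 0x100f);
    println!("lane models agree with Intel's pseudocode");
}
```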
Meta
`rustc --version --verbose`:

```
rustc 1.83.0-nightly (0609062a9 2024-09-13)
binary: rustc
commit-hash: 0609062a91c8f445c3e9a0de57e402f9b1b8b0a7
commit-date: 2024-09-13
host: x86_64-unknown-linux-gnu
release: 1.83.0-nightly
LLVM version: 19.1.0
```