[OMP][CodeGen] OMP Scan CodeGen is broken
Hello,
It seems that CodeGen for OMP Scan directives has been broken since it was first introduced. There is an open issue on GitHub: [clang][OpenMP] omp scan code gen does not initialize allocas (Issue #87466, llvm/llvm-project).
The problem (or one of the problems) shows up when #pragma omp parallel is separated from #pragma omp for reduction; the generated code is fine for the combined #pragma omp parallel for reduction.
Below is the failing test source; the same code works with GCC.
#pragma omp parallel
{
  #pragma omp for reduction(inscan, +: sum)
  for (int i = 0; i < n; ++i) {
    output[i] = sum;
    #pragma omp scan exclusive(sum)
    sum += input[i];
  }
}
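For reference, this is a sequential sketch of the result the loop above is expected to produce: an exclusive prefix sum, where output[i] holds the sum of input[0..i-1]. The function name exclusive_scan_reference is made up for illustration and does not appear in the test or in clang.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sequential equivalent of the scan loop in the test.
// output[i] receives the sum of all inputs before index i (exclusive scan).
std::vector<int> exclusive_scan_reference(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    int sum = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = sum;   // statement before the scan directive
        sum += input[i];   // statement after the scan directive
    }
    return output;
}
```

Any correct lowering of the directive should match this result regardless of the number of threads.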
IMO, the problem is in how declarations are generated for the OMP scan temporary variables. The temporary buffer declaration is tied to the worksharing for region, but it should be tied to the enclosing omp parallel region. As a result the buffer is private to each thread, when it needs to be shared.
Example of what the generated code looks like: Compiler Explorer - C++
Example of what the generated code should look like: Compiler Explorer - C++
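To illustrate why the buffer has to be shared, here is a hand-written sketch of a two-pass lowering of the same exclusive scan. It is an assumption about the general shape of such a lowering, not clang's actual generated code, and the names exclusive_scan_shared_buffer and buffer are made up. The key point is that buffer is declared before the parallel region, so every thread sees the same storage.

```cpp
#include <vector>

// Hypothetical hand-lowering of the exclusive scan from the test.
std::vector<int> exclusive_scan_shared_buffer(const std::vector<int>& input) {
    const int n = static_cast<int>(input.size());
    std::vector<int> output(n);
    std::vector<int> buffer(n);  // declared outside the parallel region: shared

    #pragma omp parallel
    {
        // Pass 1 (input phase): each iteration stores its contribution.
        #pragma omp for
        for (int i = 0; i < n; ++i)
            buffer[i] = input[i];
        // Implicit barrier: all contributions are now visible to all threads.

        // Turn contributions into exclusive prefix sums. One thread does it
        // here for simplicity; a real lowering could parallelize this step.
        #pragma omp single
        {
            int running = 0;
            for (int i = 0; i < n; ++i) {
                int value = buffer[i];
                buffer[i] = running;
                running += value;
            }
        }
        // Implicit barrier at the end of the single construct.

        // Pass 2 (scan phase): each iteration reads its prefix from the buffer.
        #pragma omp for
        for (int i = 0; i < n; ++i)
            output[i] = buffer[i];
    }
    return output;
}
```

If buffer were instead declared inside the parallel region, each thread would get its own copy, and pass 2 would read entries that other threads never wrote into that copy. Compiled without OpenMP support, the pragmas are ignored and the function runs sequentially with the same result.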
Additionally, clang crashes on an assertion in most of the OMP Scan tests I've tried. The assertion seems to be related to the variable declaration, but I am not sure whether it is the same bug.
Does anyone have an idea how to fix this?
Thanks,
Nikola
There are two ways to fix it.
- Declare the buffer in Sema and capture it in all parent task-based regions.
- Allocate a global threadprivate buffer and work with it.