OpenMP Support — Clang 22.0.0git documentation

Clang fully supports OpenMP 4.5, almost all of 5.0, and most of 5.1 and 5.2. Clang supports offloading to X86_64, AArch64, PPC64[LE], NVIDIA GPUs (all models) and AMD GPUs (all models).

In addition, the LLVM OpenMP runtime libomp supports the OpenMP Tools Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and macOS. OMPT is also supported for NVIDIA and AMD GPUs.

For the list of supported features from OpenMP 5.0 and 5.1, see OpenMP 5.0 Implementation Details and OpenMP 5.1 Implementation Details.

General improvements

GPU devices support

Data-sharing modes

Clang supports two data-sharing models for Cuda devices: Generic and Cuda modes. The default mode is Generic. Cuda mode can give additional performance and can be activated using the -fopenmp-cuda-mode flag. In Generic mode, all local variables that can be shared in the parallel regions are stored in global memory. In Cuda mode, local variables are not shared between threads, and it is the user's responsibility to share the required data between the threads in the parallel regions. Often, the optimizer is able to reduce the cost of Generic mode to the level of Cuda mode, but the flag, as well as other assumption flags, can be used for tuning.
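As a rough illustration of the kind of variable the two modes treat differently, the sketch below declares a local variable inside a target region and reads it from a nested parallel loop. The function name scaled_sum and the variable scale are hypothetical and chosen only for this example. It is a minimal sketch, not a statement about how Clang lowers any particular program.

```c
/* Hypothetical helper for illustration: returns 2 * (0 + 1 + ... + n-1).
   Built without -fopenmp the pragmas are ignored and the loop runs
   serially on the host, producing the same result. */
long scaled_sum(int n) {
    long total = 0;
    #pragma omp target map(tofrom: total)
    {
        /* Local to the target region, declared outside the parallel
           region: this is a variable that "can be shared in the
           parallel regions" in the sense of the paragraph above. */
        int scale = 2;

        #pragma omp parallel for reduction(+: total)
        for (int i = 0; i < n; ++i)
            total += (long)i * scale;  /* every thread reads `scale` */
    }
    return total;
}
```

Going by the description above, Generic mode is the one that makes scale visible to all threads of the parallel region, while under -fopenmp-cuda-mode the programmer would be responsible for arranging that sharing. Calling scaled_sum(10) from a small main should yield 90 in either case where the sharing is handled correctly.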

Features not supported or with limited support for Cuda devices

OpenMP 5.0 Implementation Details

The following table provides a quick overview of various OpenMP 5.0 features and their implementation status. Please post on the Discourse forums (Runtimes - OpenMP category) for more information or if you want to help with the implementation.

Category Feature Status Reviews
loop support != in the canonical loop form done D54441
loop #pragma omp loop (directive) partial D145823 (combined forms)
loop #pragma omp loop bind worked on D144634 (needs review)
loop collapse imperfectly nested loop done
loop collapse non-rectangular nested loop done
loop C++ range-based for loop done
loop clause: if for SIMD directives done
loop inclusive scan (matching C++17 PSTL) done
memory management memory allocators done r341687,r357929
memory management allocate directive and allocate clause done r355614,r335952
OMPD OMPD interfaces done https://reviews.llvm.org/D99914 (supports only host (CPU) and Linux)
OMPT OMPT interfaces (callback support) done
thread affinity thread affinity done
task taskloop reduction done
task task affinity not upstream https://github.com/jklinkenberg/openmp/tree/task-affinity
task clause: depend on the taskwait construct done D113540 (regular codegen only)
task depend objects and detachable tasks done
task mutexinoutset dependence-type for tasks done D53380,D57576
task combined taskloop constructs done
task master taskloop done
task parallel master taskloop done
task master taskloop simd done
task parallel master taskloop simd done
SIMD atomic and simd constructs inside SIMD code done
SIMD SIMD nontemporal done
device infer target functions from initializers worked on
device infer target variables from initializers done D146418
device OMP_TARGET_OFFLOAD environment variable done D50522
device support full ‘defaultmap’ functionality done D69204
device device specific functions done
device clause: device_type done
device clause: extended device done
device clause: uses_allocators clause done https://github.com/llvm/llvm-project/pull/157025
device clause: in_reduction worked on r308768
device omp_get_device_num() done D54342,D128347
device structure mapping of references unclaimed
device nested target declare done D51378
device implicitly map ‘this’ (this[:1]) done D55982
device allow access to the reference count (omp_target_is_present) done
device requires directive done
device clause: unified_shared_memory done D52625,D52359
device clause: unified_address partial
device clause: reverse_offload partial D52780,D155003
device clause: atomic_default_mem_order done D53513
device clause: dynamic_allocators unclaimed parts D53079
device user-defined mappers done D56326,D58638,D58523,D58074,D60972,D59474
device map array-section with implicit mapper done https://github.com/llvm/llvm-project/pull/101101
device mapping lambda expression done D51107
device clause: use_device_addr for target data done
device support close modifier on map clause done D55719,D55892
device teams construct on the host device done r371553
device support non-contiguous array sections for target update done https://github.com/llvm/llvm-project/pull/144635
device pointer attachment being repaired @abhinavgaba (https://github.com/llvm/llvm-project/pull/153683)
atomic hints for the atomic construct done D51233
base language C11 support done
base language C++11/14/17 support done
base language lambda support done
misc array shaping done D74144
misc library shutdown (omp_pause_resource[_all]) done D55078
misc metadirectives mostly done D91944, https://github.com/llvm/llvm-project/pull/128640
misc conditional modifier for lastprivate clause done
misc iterator and multidependences done
misc depobj directive and depobj dependency kind done
misc user-defined function variants done D67294, D64095, D71847, D71830, D109635
misc pointer/reference to pointer based array reductions done
misc prevent new type definitions in clauses done
memory model memory model update (seq_cst, acq_rel, release, acquire,…) done
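Two of the loop rows marked done above can be sketched in a few lines: collapsing a non-rectangular loop nest and using != in the canonical loop form. The function names count_triangle and sum_ne are hypothetical, and the example assumes a Clang build with -fopenmp and a default OpenMP version of at least 5.0. Without -fopenmp the pragmas are ignored and the loops run serially with the same results.

```c
/* Counts the pairs (i, j) with 0 <= j <= i < n.
   The inner bound depends on the outer index, so the nest is
   non-rectangular; collapse(2) asks for both loops to be combined. */
long count_triangle(int n) {
    long count = 0;
    #pragma omp parallel for collapse(2) reduction(+: count)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j <= i; ++j)
            count += 1;
    return count;
}

/* Sums a[0..n-1] with a loop whose test uses `!=`.
   The increment is exactly 1, as the canonical form requires
   when the comparison is `!=`. */
long sum_ne(const int *a, int n) {
    long sum = 0;
    #pragma omp parallel for reduction(+: sum)
    for (int i = 0; i != n; ++i)
        sum += a[i];
    return sum;
}
```

For n = 4, count_triangle visits 1 + 2 + 3 + 4 = 10 iterations, which makes it easy to check that the collapsed nest covers the same iteration space as the serial one.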

OpenMP 5.1 Implementation Details

The following table provides a quick overview of various OpenMP 5.1 features and their implementation status. Please post on the Discourse forums (Runtimes - OpenMP category) for more information or if you want to help with the implementation.

Category Feature Status Reviews
atomic ‘compare’ clause on atomic construct done D120290, D120007, D118632, D120200, D116261, D118547, D116637
atomic ‘fail’ clause on atomic construct worked on D123235 (in progress)
base language C++ attribute specifier syntax done D105648
device ‘present’ map type modifier done D83061, D83062, D84422
device ‘present’ motion modifier done D84711, D84712
device ‘present’ in defaultmap clause done D92427
device map clause reordering based on ‘present’ modifier unclaimed
device device-specific environment variables unclaimed
device omp_target_is_accessible routine done https://github.com/llvm/llvm-project/pull/138294
device omp_get_mapped_ptr routine done D141545
device new async target memory copy routines done D136103
device thread_limit clause on target construct partial D141540 (offload), D152054 (host, in progress)
device has_device_addr clause on target construct unclaimed
device iterators in map clause or motion clauses done https://github.com/llvm/llvm-project/pull/159112
device indirect clause on declare target directive In Progress
device allow virtual functions calls for mapped object on device partial
device interop construct partial parsing/sema done: D98558, D98834, D98815
device assorted routines for querying interoperable properties partial D106674
loop Loop tiling transformation done D76342
loop Loop unrolling transformation done D99459
loop ‘reproducible’/’unconstrained’ modifiers in ‘order’ clause partial D127855
memory management alignment for allocate directive and clause done D115683
memory management ‘allocator’ modifier for allocate clause done https://github.com/llvm/llvm-project/pull/114883
memory management ‘align’ modifier for allocate clause done https://github.com/llvm/llvm-project/pull/121814
memory management new memory management routines unclaimed
memory management changes to omp_alloctrait_key enum unclaimed
memory model seq_cst clause on flush construct done https://github.com/llvm/llvm-project/pull/114072
misc ‘omp_all_memory’ keyword and use in ‘depend’ clause done D125828, D126321
misc error directive done D139166
misc scope construct done D157933, https://github.com/llvm/llvm-project/pull/109197
misc routines for controlling and querying team regions partial D95003 (libomp only)
misc changes to ompt_scope_endpoint_t enum unclaimed
misc omp_display_env routine done D74956
misc extended OMP_PLACES syntax unclaimed
misc OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT env vars done D138769
misc ‘target_device’ selector in context specifier worked on
misc begin/end declare variant done D71179
misc dispatch construct and function variant argument adjustment worked on D99537, D99679
misc assumes directives worked on
misc assume directive done
misc nothing directive done D123286
misc masked construct and related combined constructs done D99995, D100514, PR-121741 (parallel_masked_taskloop), PR-121746 (parallel_masked_task_loop_simd), PR-121914 (masked_taskloop), PR-121916 (masked_taskloop_simd)
misc default(firstprivate) & default(private) done D75591 (firstprivate), D125912 (private)
other deprecating master construct unclaimed
OMPT new barrier types added to ompt_sync_region_t enum unclaimed
OMPT async data transfers added to ompt_target_data_op_t enum unclaimed
OMPT new barrier state values added to ompt_state_t enum unclaimed
OMPT new ‘emi’ callbacks for external monitoring interfaces done
OMPT device tracing interface in progress jplehr
task ‘strict’ modifier for taskloop construct unclaimed
task inoutset in depend clause done D97085, D118383
task nowait clause on taskwait partial parsing/sema done: D131830, D141531
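The following sketch exercises two of the 5.1 rows marked done above: the compare clause on the atomic construct and the loop tiling transformation. The function names max_of and transpose_tiled, and the 4x4 tile size, are assumptions made for this example. Depending on the Clang release, -fopenmp-version=51 may be needed in addition to -fopenmp. Without -fopenmp the pragmas are ignored and the code runs serially.

```c
/* Returns the largest element of a[0..n-1]; requires n >= 1.
   Each update of `best` is a conditional atomic update of the
   form `if (x < expr) { x = expr; }`. */
int max_of(const int *a, int n) {
    int best = a[0];
    #pragma omp parallel for
    for (int i = 1; i < n; ++i) {
        #pragma omp atomic compare
        if (best < a[i]) { best = a[i]; }
    }
    return best;
}

/* Writes the transpose of the rows x cols matrix `in` (row-major)
   into `out` (cols x rows, row-major). The tile directive only
   changes the iteration order, not the result. */
void transpose_tiled(const double *in, double *out, int rows, int cols) {
    #pragma omp tile sizes(4, 4)
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            out[j * rows + i] = in[i * cols + j];
}
```

Both functions are written so that the serial and the OpenMP builds agree, which is a convenient way to compare a transformed loop against its untransformed form.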

OpenMP 5.2 Implementation Details

The following table provides a quick overview of various OpenMP 5.2 features and their implementation status. Please post on the Discourse forums (Runtimes - OpenMP category) for more information or if you want to help with the implementation.

Feature C/C++ Status Fortran Status Reviews
omp_in_explicit_task() unclaimed unclaimed
semantics of explicit_task_var and implicit_task_var unclaimed unclaimed
ompx sentinel for C/C++ directive extensions unclaimed unclaimed
ompx prefix for clause extensions unclaimed unclaimed
if clause on teams construct unclaimed unclaimed
step modifier added unclaimed unclaimed
declare mapper: Add iterator modifier on map clause unclaimed unclaimed
memspace and traits modifiers to uses allocator i unclaimed unclaimed
Add otherwise clause to metadirectives unclaimed unclaimed
doacross clause with support for omp_cur_iteration unclaimed unclaimed
position of interop_type in init clause on interop unclaimed unclaimed
implicit map type for target enter/exit data unclaimed unclaimed
work OMPT type for work-sharing loop constructs unclaimed unclaimed
allocate and firstprivate on scope directive unclaimed unclaimed
Change loop consistency for order clause unclaimed unclaimed
Add memspace and traits modifiers to uses_allocators unclaimed unclaimed
Keep original base pointer on map w/o matched candidate unclaimed unclaimed
Pure procedure support for certain directives N/A unclaimed
ALLOCATE statement support for allocators N/A unclaimed
dispatch construct extension to support end directive N/A unclaimed

OpenMP 5.2 Deprecations

Feature C/C++ Status Fortran Status Reviews
Linear clause syntax unclaimed unclaimed
The minus operator unclaimed unclaimed
Map clause modifiers without commas unclaimed unclaimed
The use of allocate directives with ALLOCATE statement N/A unclaimed
uses_allocators list syntax unclaimed unclaimed
The default clause on metadirectives unclaimed unclaimed
The delimited form of the declare target directive unclaimed N/A
The use of the to clause on the declare target directive unclaimed unclaimed
The syntax of the destroy clause on the depobj construct unclaimed unclaimed
keyword source and sink as task-dependence modifiers unclaimed unclaimed
interop types in any position on init clause of interop unclaimed unclaimed
ompd prefix usage for some ICVs unclaimed unclaimed

OpenMP 6.0 Implementation Details

The following table provides a quick overview of various OpenMP 6.0 features and their implementation status. Please post on the Discourse forums (Runtimes - OpenMP category) for more information or if you want to help with the implementation.

Feature C/C++ Status Fortran Status Reviews
free-agent threads unclaimed unclaimed
threadset clause partial unclaimed Parse/Sema/Codegen : https://github.com/llvm/llvm-project/pull/13580
Recording of task graphs in progress in progress clang: jtb20, flang: kparzysz
Parallel inductions unclaimed unclaimed
init_complete for scan directive unclaimed unclaimed
loop interchange transformation done unclaimed Clang (interchange): https://github.com/llvm/llvm-project/pull/93022, Clang (permutation): https://github.com/llvm/llvm-project/pull/92030
loop reverse transformation done unclaimed https://github.com/llvm/llvm-project/pull/92916
loop stripe transformation done unclaimed https://github.com/llvm/llvm-project/pull/119891
loop fusion transformation in progress unclaimed https://github.com/llvm/llvm-project/pull/139293
loop index set splitting transformation unclaimed unclaimed
loop transformation apply clause unclaimed unclaimed
loop fuse transformation done unclaimed
workdistribute construct in progress @skc7, @mjklemm
task_iteration unclaimed unclaimed
memscope clause for atomic and flush unclaimed unclaimed
transparent clause (hull tasks) unclaimed unclaimed
rule-based compound directives In Progress In Progress kparzysz Testing for Fortran missing
C23, C++23 unclaimed
Fortran 2023 unclaimed
decl attribute for declarative directives unclaimed unclaimed
C attribute syntax unclaimed
pure directives in DO CONCURRENT unclaimed
Optional argument for all clauses partial In Progress Parse/Sema (nowait): https://github.com/llvm/llvm-project/pull/159628
Function references for locator list items unclaimed unclaimed
All clauses accept directive name modifier unclaimed unclaimed
Extensions to depobj construct unclaimed unclaimed
Extensions to atomic construct unclaimed unclaimed
Private reductions mostly unclaimed Parse/Sema: https://github.com/llvm/llvm-project/pull/129938 Codegen: https://github.com/llvm/llvm-project/pull/134709
Self maps partial unclaimed parsing/sema done: https://github.com/llvm/llvm-project/pull/129888
Release map type for declare mapper unclaimed unclaimed
Extensions to interop construct unclaimed unclaimed
no_openmp_constructs done unclaimed https://github.com/llvm/llvm-project/pull/125933
safe_sync and progress with identifier and API unclaimed unclaimed
OpenMP directives in concurrent loop regions done unclaimed https://github.com/llvm/llvm-project/pull/125621
atomics constructs on concurrent loop regions done unclaimed https://github.com/llvm/llvm-project/pull/125621
Loop construct with DO CONCURRENT In Progress
device_type clause for target construct unclaimed unclaimed
nowait for ancestor target directives unclaimed unclaimed
New API for devices’ num_teams/thread_limit unclaimed unclaimed
Host and device environment variables unclaimed unclaimed
num_threads ICV and clause accepts list unclaimed unclaimed
Numeric names for environment variables unclaimed unclaimed
Increment between places for OMP_PLACES unclaimed unclaimed
OMP_AVAILABLE_DEVICES environment variable unclaimed unclaimed (should wait for “Traits for default device environment variable” being done)
Traits for default device environment variable in progress unclaimed ro-i
Optionally omit array length expression done unclaimed (Parse) https://github.com/llvm/llvm-project/pull/148048, (Sema) https://github.com/llvm/llvm-project/pull/152786
Canonical loop sequences in progress in progress Clang: https://github.com/llvm/llvm-project/pull/139293
Clarifications to Fortran map semantics unclaimed unclaimed
default clause at target construct done unclaimed https://github.com/llvm/llvm-project/pull/162910
ref count update use_device_{ptr, addr} unclaimed unclaimed
Clarifications to implicit reductions unclaimed unclaimed
ref modifier for map clauses In Progress unclaimed
map-type modifiers in arbitrary position done unclaimed https://github.com/llvm/llvm-project/pull/90499
Lift nesting restriction on concurrent loop done unclaimed https://github.com/llvm/llvm-project/pull/125621
priority clause for target constructs unclaimed unclaimed
changes to target_data construct unclaimed unclaimed
Non-const do_not_sync for nowait/nogroup unclaimed unclaimed
need_device_addr modifier for adjust_args clause partial unclaimed Parsing/Sema: https://github.com/llvm/llvm-project/pull/143442, https://github.com/llvm/llvm-project/pull/149586
need_device_ptr modifier for adjust_args clause unclaimed unclaimed
Prescriptive num_threads done unclaimed https://github.com/llvm/llvm-project/pull/160659 https://github.com/llvm/llvm-project/pull/146403 https://github.com/llvm/llvm-project/pull/146404 https://github.com/llvm/llvm-project/pull/146405
Message and severity clauses done unclaimed https://github.com/llvm/llvm-project/pull/146093
Local clause on declare target In Progress unclaimed
groupprivate directive In Progress partial Flang: kparzysz, mjklemm; Flang parser: https://github.com/llvm/llvm-project/pull/153807; Flang sema: https://github.com/llvm/llvm-project/pull/154779; Clang parse/sema: https://github.com/llvm/llvm-project/pull/158134
variable-category on default clause done unclaimed
Changes to omp_target_is_accessible In Progress In Progress
defaultmap implicit-behavior ‘storage’ done unclaimed https://github.com/llvm/llvm-project/pull/158336
defaultmap implicit-behavior ‘private’ done unclaimed https://github.com/llvm/llvm-project/pull/158712
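The loop reverse and loop interchange rows marked done for C/C++ above can be sketched as follows. The function names record_order and weighted_sum are hypothetical, and the example assumes the directives are enabled with something like -fopenmp -fopenmp-version=60. Without those flags the pragmas are ignored and the loops run in their written order.

```c
/* Records the order in which the iterations execute.
   With the reverse transformation applied, the expectation is that
   order[] ends up as n-1, ..., 1, 0; with the pragma ignored it is
   0, 1, ..., n-1. */
void record_order(int *order, int n) {
    int next = 0;
    #pragma omp reverse
    for (int i = 0; i < n; ++i)
        order[next++] = i;
}

/* Sums m[i][j] * (i + 1) over a rows x cols row-major matrix.
   interchange swaps the two loops of the perfectly nested pair;
   the sum does not depend on the iteration order. */
long weighted_sum(const int *m, int rows, int cols) {
    long sum = 0;
    #pragma omp interchange
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            sum += (long)m[i * cols + j] * (i + 1);
    return sum;
}
```

record_order is deliberately order-sensitive so the effect of the directive is observable, while weighted_sum is order-independent and can be used to confirm that the transformed nest still visits every element once.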

OpenMP 6.1 Implementation Details (Experimental)

The following table provides a quick overview of various OpenMP 6.1 features and their implementation status. Since OpenMP 6.1 has not yet been released, the following features are experimental and are subject to change at any time. Please post on the Discourse forums (Runtimes - OpenMP category) for more information or if you want to help with the implementation.

Feature C/C++ Status Fortran Status Reviews
dyn_groupprivate clause In Progress In Progress C/C++: kevinsala (https://github.com/llvm/llvm-project/pull/152651 https://github.com/llvm/llvm-project/pull/152830 https://github.com/llvm/llvm-project/pull/152831)
loop flatten transformation unclaimed unclaimed
loop grid/tile modifiers for sizes clause unclaimed unclaimed
attach map-type modifier In Progress unclaimed C/C++: @abhinavgaba; RT: @abhinavgaba (https://github.com/llvm/llvm-project/pull/149036, https://github.com/llvm/llvm-project/pull/158370)
need_device_ptr modifier for adjust_args clause partial unclaimed Clang Parsing/Sema: https://github.com/llvm/llvm-project/pull/168905 https://github.com/llvm/llvm-project/pull/169558

OpenMP Extensions

The following table provides a quick overview of various OpenMP extensions and their implementation status. These extensions are not currently defined by any standard, so links to associated LLVM documentation are provided. As these extensions mature, they will be considered for standardization. Please post on the Discourse forums (Runtimes - OpenMP category) to provide feedback.

Category Feature Status Reviews
atomic extension ‘atomic’ strictly nested within ‘teams’ prototyped D126323
device extension ‘ompx_hold’ map type modifier prototyped D106509, D106510
device extension ‘ompx_bare’ clause on ‘target teams’ construct prototyped #66844, #70612
device extension Multi-dim ‘num_teams’ and ‘thread_limit’ clause on ‘target teams ompx_bare’ construct partial #99732, #101407, #102715