SYCL Compiler and Runtime architecture design — Clang 21.0.0git documentation

Introduction

This document describes the architecture of the SYCL compiler and runtime library. More details are provided in an external document, which is going to be added to the Clang documentation in the future.

Address space handling

The SYCL specification represents pointers to disjoint memory regions on an accelerator using C++ wrapper classes, to enable compilation with both a standard C++ toolchain and a SYCL compiler toolchain. Section 3.8.2 of the SYCL 2020 specification defines the memory model, section 4.7.7 defines the address space classes, and section 5.9 covers address space deduction. The SYCL specification allows two modes of address space deduction: “generic as default address space” (see section 5.9.3) and “inferred address space” (see section 5.9.4). The current implementation supports only the “generic as default address space” mode.

SYCL borrows its memory model from OpenCL; however, SYCL doesn’t perform the address space qualifier inference detailed in OpenCL C v3.0 6.7.8.

The default address space is “generic-memory”, which is a virtual address space that overlaps the global, local, and private address spaces. SYCL mode enables the following conversions:

All named address spaces are disjoint and are subsets of the default address space.
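The subset relationship suggests which pointer conversions are safe: a pointer into a named space can always be viewed as a generic pointer, while going from generic back to a named space, or between two named spaces, cannot be assumed safe. The following is a minimal model of that idea in standard C++, not Clang's implementation. The names `space`, `tagged_ptr`, and `space_cast` are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical names for illustration only; not Clang or SYCL API.
enum class space { generic, global, local, priv };

// A raw pointer tagged with the address space it points into.
template <typename T, space AS>
struct tagged_ptr {
  T *raw = nullptr;
  tagged_ptr() = default;
  explicit tagged_ptr(T *p) : raw(p) {}
};

// The generic (default) space overlaps every named space, so it accepts
// a pointer from any named space implicitly.
template <typename T>
struct tagged_ptr<T, space::generic> {
  T *raw = nullptr;
  tagged_ptr() = default;
  explicit tagged_ptr(T *p) : raw(p) {}

  // Implicit widening: named space -> generic space.
  template <space AS>
  tagged_ptr(const tagged_ptr<T, AS> &p) : raw(p.raw) {}
};

// Narrowing from generic to a named space must be spelled out, because the
// caller asserts knowledge that the type system does not have.
template <space To, typename T>
tagged_ptr<T, To> space_cast(tagged_ptr<T, space::generic> p) {
  return tagged_ptr<T, To>(p.raw);
}

using generic_int = tagged_ptr<int, space::generic>;
using global_int = tagged_ptr<int, space::global>;
using local_int = tagged_ptr<int, space::local>;

static_assert(std::is_convertible<global_int, generic_int>::value,
              "named -> generic is implicit");
static_assert(!std::is_convertible<generic_int, global_int>::value,
              "generic -> named needs an explicit cast");
static_assert(!std::is_convertible<global_int, local_int>::value,
              "named spaces are disjoint");

// Round trip: widen implicitly, then narrow explicitly.
inline int roundtrip_demo() {
  int value = 7;
  global_int g(&value);
  generic_int any = g;
  assert(any.raw == &value);
  global_int back = space_cast<space::global>(any);
  return *back.raw;
}
```

In this model the compiler rejects `local_int l = g;` at compile time, which mirrors the disjointness of the named spaces.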

The SPIR target allocates SYCL namespace scope variables in the global address space.

Pointers to the default address space should be lowered into pointers to a generic address space (or flat, to reuse more general terminology). However, depending on the allocation context, the default address space of a non-pointer type is assigned to a specific address space. This is described in the common address space deduction rules section.

This is also in line with the behaviour of CUDA (small example).

multi_ptr class implementation example:

    // check that SYCL mode is ON and we can use non-standard decorations
    #if defined(__SYCL_DEVICE_ONLY__)
    // GPU/accelerator implementation
    template <typename T, address_space AS> class multi_ptr {
      // DecoratedType applies the corresponding address space attribute to the type T
      // DecoratedType<T, global_space>::type == "__attribute__((opencl_global)) T"
      // See sycl/include/CL/sycl/access/access.hpp for more details
      using pointer_t = typename DecoratedType<T, AS>::type *;

      pointer_t m_Pointer;

    public:
      pointer_t get() { return m_Pointer; }
      T &operator*() { return *reinterpret_cast<T *>(m_Pointer); }
    };
    #else
    // CPU/host implementation
    template <typename T, address_space AS> class multi_ptr {
      T *m_Pointer; // regular undecorated pointer

    public:
      T *get() { return m_Pointer; }
      T &operator*() { return *m_Pointer; }
    };
    #endif

Depending on the compiler mode, multi_ptr will either decorate its internal data with the address space attribute or not.
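To make the host-side behaviour concrete, here is a minimal, self-contained sketch of the undecorated variant together with a small usage function. It is an illustration under stated assumptions, not the actual SYCL header: the `address_space` enumeration is a stand-in defined locally, and the constructor and `host_multi_ptr_demo` are hypothetical additions made so the sketch can be exercised.

```cpp
#include <cassert>

// Stand-in for the SYCL address_space enumeration (assumption for this sketch).
enum class address_space {
  global_space,
  local_space,
  private_space,
  constant_space
};

// Host-side variant only: the address space is a compile-time tag and the
// stored pointer is a regular undecorated pointer.
template <typename T, address_space AS>
class multi_ptr {
  T *m_Pointer;

public:
  // Hypothetical constructor, added so the sketch can be used directly.
  explicit multi_ptr(T *p) : m_Pointer(p) {}

  T *get() const { return m_Pointer; }
  T &operator*() const { return *m_Pointer; }
};

// Writes through the wrapper and reads the value back.
inline int host_multi_ptr_demo() {
  int value = 41;
  multi_ptr<int, address_space::global_space> p(&value);
  assert(p.get() == &value); // the wrapper stores the pointer unchanged
  *p += 1;                   // dereference yields a plain int&
  return *p.get();
}
```

On the host the address space parameter changes only the type of the wrapper, so `multi_ptr<int, address_space::global_space>` and `multi_ptr<int, address_space::local_space>` are distinct types that both hold an ordinary `int *`.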

To utilize clang’s existing functionality, we reuse the following OpenCL address space attributes for pointers:

Address space attribute                SYCL address_space enumeration
-------------------------------------  ------------------------------
__attribute__((opencl_global))         global_space, constant_space
__attribute__((opencl_global_device))  global_space
__attribute__((opencl_global_host))    global_space
__attribute__((opencl_local))          local_space
__attribute__((opencl_private))        private_space

//TODO: add support for __attribute__((opencl_global_host)) and __attribute__((opencl_global_device)).
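One way the attribute mapping could be expressed is as a set of trait specializations that apply the attribute only in device compilation. This is a hedged sketch rather than the contents of the real header: the `SKETCH_*` macros and the locally defined `address_space` enumeration are assumptions made for illustration, and the `opencl_global_device` and `opencl_global_host` attributes are left out.

```cpp
#include <type_traits>

// Stand-in for the SYCL address_space enumeration (assumption for this sketch).
enum class address_space {
  global_space,
  local_space,
  private_space,
  constant_space
};

// Hypothetical helper macros: attributes on the device, nothing on the host.
#if defined(__SYCL_DEVICE_ONLY__)
#define SKETCH_GLOBAL __attribute__((opencl_global))
#define SKETCH_LOCAL __attribute__((opencl_local))
#define SKETCH_PRIVATE __attribute__((opencl_private))
#else
#define SKETCH_GLOBAL
#define SKETCH_LOCAL
#define SKETCH_PRIVATE
#endif

// Primary template is left undefined so an unmapped space fails to compile.
template <typename T, address_space AS> struct DecoratedType;

template <typename T>
struct DecoratedType<T, address_space::global_space> {
  using type = SKETCH_GLOBAL T;
};

// constant_space reuses the global attribute, as in the mapping above.
template <typename T>
struct DecoratedType<T, address_space::constant_space> {
  using type = SKETCH_GLOBAL T;
};

template <typename T>
struct DecoratedType<T, address_space::local_space> {
  using type = SKETCH_LOCAL T;
};

template <typename T>
struct DecoratedType<T, address_space::private_space> {
  using type = SKETCH_PRIVATE T;
};

#if !defined(__SYCL_DEVICE_ONLY__)
// On the host every decorated type collapses to the plain type.
static_assert(
    std::is_same<DecoratedType<int, address_space::global_space>::type,
                 int>::value,
    "host decoration is a no-op");
#endif
```

Keeping the attribute behind a macro means the same trait can be used by both branches of a `multi_ptr`-style wrapper, with only device compilation seeing the address space qualifiers.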