coder.gpu.constantMemory
Pragma that maps a variable to the constant memory on GPU
Syntax
coder.gpu.constantMemory(v)
Description
coder.gpu.constantMemory(v) maps the variable v to the constant memory space on the GPU device. Place this pragma within a parallelizable loop. If GPU Coder™ generates a kernel for the loop, it loads v into a device constant memory variable and replaces every access to v within the kernel with an access to that constant memory variable. Within the kernel, v must be read-only; otherwise, GPU Coder ignores this pragma. Use this pragma when every thread accesses every element of the parameter array or matrix.
This function is a code generation function. It has no effect in MATLAB®.
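As a minimal sketch of the access pattern described above, the hypothetical function below forms one output per loop iteration, and every iteration reads all three elements of the read-only weight vector w. The names weightedSum3, x, and w are illustrative assumptions, not part of the GPU Coder documentation. Assuming GPU Coder creates a kernel for the loop, w is the candidate for constant memory; in MATLAB the pragmas have no effect and the loop runs as ordinary code.

```matlab
function y = weightedSum3(x, w)
% Hypothetical example: x is 3-by-N, w is 1-by-3 and is only read.
n = size(x, 2);
y = coder.nullcopy(zeros(1, n));
coder.gpu.kernel();                 % request a kernel for the loop below
for c = 1:n
    coder.gpu.constantMemory(w);    % w is read-only inside the loop
    % Every iteration (thread) reads all elements of w.
    y(c) = w(1)*x(1,c) + w(2)*x(2,c) + w(3)*x(3,c);
end
end
```

For example, weightedSum3([1 2; 3 4; 5 6], [1 10 100]) evaluates each column against the same three weights.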
Examples
Map Read-Only Input to GPU Constant Memory
This example shows how to map an input to the constant memory space on the GPU by using the coder.gpu.constantMemory pragma.
Write an entry-point function myFun that accepts two inputs: a of size 256x256 and a constant k of size 1x3. The function has nested for-loops that add the elements of k to each element of a. To create a kernel, place the coder.gpu.kernel() pragma outside the nested for-loops. The coder.gpu.constantMemory(k) pragma places the read-only input k into the constant memory of the GPU.
function b = myFun(a,k)
    b = coder.nullcopy(zeros(size(a)));
    coder.gpu.kernel();
    for j = 1:256
        for i = 1:256
            coder.gpu.constantMemory(k);
            b(i,j) = a(i,j) + k(1) + k(2) + k(3);
        end
    end
end
Create a configuration object for MEX code generation.
cfg = coder.gpuConfig('mex');
Define a cell array input that declares the size and data type of the inputs a and k to the function myFun.
input = {ones(256),ones(1,3)}
Generate a MEX function myFun_mex by using the -config, -args, and -report options to specify the configuration, provide the input arguments, and generate a code generation report.
codegen -config cfg -args input -report myFun
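Once code generation succeeds, a quick sanity check is to compare the MEX output against a plain MATLAB reference. The sketch below assumes myFun_mex was produced by the codegen command above and that a CUDA-capable GPU is available; the variable names and the tolerance are illustrative choices, and the comparison is skipped if no MEX file is found.

```matlab
a = rand(256);
k = [1 2 3];
bRef = a + k(1) + k(2) + k(3);   % same order of additions as in myFun

% exist returns 3 when the name resolves to a MEX file on the path.
if exist('myFun_mex', 'file') == 3
    bMex = myFun_mex(a, k);      % runs the generated GPU code
    maxErr = max(abs(bMex(:) - bRef(:)));
    assert(maxErr < 1e-12, 'MEX output differs from the MATLAB reference');
end
```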
In the report, on the C code tab, click myFun.cu.
The read-only variable k is declared as const_k by using the __constant__ qualifier, as shown in the code snippet.
/* Variable Definitions */
__constant__ real_T const_k[3];
The cudaMemcpyToSymbol call copies the value of k from the host to the device constant memory const_k.
cudaMemcpyToSymbol(const_k, k, 24U, 0U, cudaMemcpyHostToDevice);
cudaMemcpy(gpu_a, a, 524288U, cudaMemcpyHostToDevice);
myFun_kernel1<<<dim3(128U, 1U, 1U), dim3(512U, 1U, 1U)>>>(gpu_a, gpu_b);
cudaMemcpy(b, gpu_b, 524288U, cudaMemcpyDeviceToHost);
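The byte counts and launch dimensions in these calls can be checked by hand. The sketch below assumes real_T is an 8-byte double, which is what the double inputs in this example would be expected to map to; the variable names are illustrative.

```matlab
bytesPerDouble = 8;                      % assumption: real_T is an 8-byte double
kBytes = 3 * bytesPerDouble;             % k is 1-by-3      -> 24
aBytes = 256 * 256 * bytesPerDouble;     % a is 256-by-256  -> 524288
numThreads = 128 * 512;                  % blocks * threads per block -> 65536

assert(kBytes == 24);
assert(aBytes == 524288);
assert(numThreads == 256 * 256);         % one thread per element of a
```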
The kernel body accesses the constant const_k and adds its elements to each element of a.
static __global__ __launch_bounds__(512, 1) void myFun_kernel1(const real_T *a, real_T *b)
{
  int32_T i;
  int32_T j;
  int32_T threadIdX;
  threadIdX = (int32_T)(blockDim.x * blockIdx.x + threadIdx.x);
  i = threadIdX / 256;
  j = threadIdX - i * 256;
  if ((!(j >= 256)) && (!(i >= 256))) {
    b[i + (j << 8)] = ((a[i + (j << 8)] + const_k[0]) + const_k[1]) + const_k[2];
  }
}
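To see how the kernel's index arithmetic assigns threads to elements, the arithmetic can be emulated on the host. This is only a sketch of the integer math shown in the kernel, using the launch size of 128 blocks of 512 threads from the call above; it does not run any GPU code, and the variable names are illustrative.

```matlab
t = 0:(128*512 - 1);        % all global thread indices that are launched
i = floor(t / 256);         % integer division, as in the kernel
j = t - i * 256;            % remainder of that division
idx = i + j * 256;          % same value as i + (j << 8)

assert(all(i < 256 & j < 256));         % the bounds check passes for every thread
assert(isequal(sort(idx), 0:65535));    % each element of b is written exactly once
```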
Input Arguments
v - Variable name
scalar | vector | matrix | multidimensional array
Name of the variable to map to the constant memory space on the GPU device.
Version History
Introduced in R2017b
See Also
Apps
Functions
- codegen | coder.gpu.kernel | coder.gpu.kernelfun | gpucoder.stencilKernel | coder.gpu.nokernel | coder.gpu.persistentMemory