Integrate External/Custom Code - MATLAB & Simulink

This example shows how to integrate external or custom code to enhance the performance of generated code. Although MATLAB® Coder™ generates optimized code for most applications, you might have custom code that is optimized for your specific requirements.

In such cases, you can integrate your custom code with the code generated by MATLAB Coder.
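
As a rough illustration of the general pattern, the sketch below calls a C function from MATLAB code with coder.ceval and falls back to MATLAB code when it runs outside code generation. The function name abs_via_c is hypothetical, and the sketch assumes the standard C library function fabs from math.h.

```matlab
function y = abs_via_c(x) %#codegen
% Hypothetical wrapper: uses C fabs in generated code, MATLAB abs otherwise.
if coder.target('MATLAB')
    % Running in MATLAB: no external code is available, so use the built-in
    y = abs(x);
else
    % Running in generated code: include the header and call the C function
    coder.cinclude('<math.h>');
    y = 0;                        % declare the output as a double scalar
    y = coder.ceval('fabs', x);   % call the external C function
end
end
```

The same split between a MATLAB branch and a coder.ceval branch appears in the GPU_MatrixMultiply method later in this example.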

This example illustrates how to integrate the function cublasSgemm from the NVIDIA® CUDA® Basic Linear Algebra Subroutines (CUBLAS) library into generated code. This function performs single-precision matrix multiplication on a graphics processing unit (GPU).
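
For orientation, cublasSgemm computes C = alpha*op(A)*op(B) + beta*C on single-precision matrices stored in column-major order. The sketch below reproduces that arithmetic on the host in plain MATLAB for the no-transpose case; the sample matrices are illustrative values chosen here, not data from the example.

```matlab
% Reference arithmetic for SGEMM with op(A) = A and op(B) = B.
% A is m-by-k, B is k-by-n, C is m-by-n, all of type single.
A = single([1 2 3; 4 5 6]);      % m = 2, k = 3
B = single([1 0; 0 1; 1 1]);     % k = 3, n = 2
alpha = single(1);
beta  = single(0);
C = zeros(size(A,1), size(B,2), 'single');

% The same update that the GPU call performs, evaluated on the host
C = alpha*(A*B) + beta*C;

% MATLAB stores matrices column-major, so for a densely stored matrix the
% leading dimension (lda, ldb, ldc) equals its number of rows.
lda = size(A,1);   % m
ldb = size(B,1);   % k
ldc = size(C,1);   % m
disp(C)
```

With alpha = 1 and beta = 0, the update reduces to C = A*B, which is the case this example uses.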

  1. Define a class ExternalLib_API that derives from the class coder.ExternalDependency. ExternalLib_API defines an interface to the CUBLAS library through the following methods:

    • getDescriptiveName: Returns a descriptive name for ExternalLib_API to be used for error messages.
    • isSupportedContext: Determines if the build context supports the CUBLAS library.
    • updateBuildInfo: Adds header file paths and link files to the build information.
    • GPU_MatrixMultiply: Defines the interface to the CUBLAS library function cublasSgemm.

      ExternalLib_API.m

      classdef ExternalLib_API < coder.ExternalDependency
      %#codegen

    methods (Static)

     function bName = getDescriptiveName(~)  
         bName = 'ExternalLib_API';  
     end  
       
     function tf = isSupportedContext(ctx)  
         if  ctx.isMatlabHostTarget()  
             tf = true;  
         else  
             error('CUBLAS library not available for this target');  
         end  
     end  
       
     function updateBuildInfo(buildInfo, ctx)  
         [~, linkLibExt, ~, ~] = ctx.getStdLibInfo();  
           
         % Include header file path  
         % Include header files later using coder.cinclude  
         hdrFilePath = 'C:\My_Includes';  
         buildInfo.addIncludePaths(hdrFilePath);  
           
         % Include link files  
         linkFiles = strcat('libcublas', linkLibExt);  
         linkPath = 'C:\My_Libs';  
         linkPriority = '';  
         linkPrecompiled = true;  
         linkLinkOnly = true;  
         group = '';  
         buildInfo.addLinkObjects(linkFiles, linkPath, ...  
             linkPriority, linkPrecompiled, linkLinkOnly, group);  
           
         linkFiles = strcat('libcudart', linkLibExt);  
         buildInfo.addLinkObjects(linkFiles, linkPath, ...  
             linkPriority, linkPrecompiled, linkLinkOnly, group);  
           
     end  
       
     % API for the CUBLAS library function 'cublasSgemm'
     function C = GPU_MatrixMultiply(A, B)  
         assert(isa(A,'single'), 'A must be single.');  
         assert(isa(B,'single'), 'B must be single.');  
           
         if(coder.target('MATLAB'))  
             C=A*B;  
         else  
               
             % Include header files  
             %     for external functions and typedefs  
             % Header path included earlier using updateBuildInfo  
             coder.cinclude('"cuda_runtime.h"');  
             coder.cinclude('"cublas_v2.h"');  
               
             % Compute dimensions of input matrices  
             m = int32(size(A, 1));  
             k = int32(size(A, 2));  
             n = int32(size(B, 2));  
               
             % Declare pointers to matrices on destination GPU  
             d_A = coder.opaque('float*');  
             d_B = coder.opaque('float*');  
             d_C = coder.opaque('float*');  
               
             % Compute memory to be allocated for matrices  
             % Single = 4 bytes  
             size_A = m*k*4;  
             size_B = k*n*4;  
             size_C = m*n*4;  
               
             % Define error variables  
             error = coder.opaque('cudaError_t');  
             cudaSuccessV = coder.opaque('cudaError_t', ...  
                 'cudaSuccess');  
               
             % Assign memory on destination GPU  
             error = coder.ceval('cudaMalloc', ...  
                 coder.wref(d_A), size_A);  
             assert(error == cudaSuccessV, ...  
                 'cudaMalloc(A) failed');  
             error = coder.ceval('cudaMalloc', ...  
                 coder.wref(d_B), size_B);  
             assert(error == cudaSuccessV, ...  
                 'cudaMalloc(B) failed');  
             error = coder.ceval('cudaMalloc', ...  
                 coder.wref(d_C), size_C);  
             assert(error == cudaSuccessV, ...  
                 'cudaMalloc(C) failed');  
               
             % Define direction of copying  
             hostToDevice = coder.opaque('cudaMemcpyKind', ...  
                 'cudaMemcpyHostToDevice');  
               
             % Copy matrices to destination GPU  
             error = coder.ceval('cudaMemcpy',  ...  
                 d_A, coder.rref(A), size_A, hostToDevice);  
             assert(error == cudaSuccessV, 'cudaMemcpy(A) failed');  
               
             error = coder.ceval('cudaMemcpy',  ...  
                 d_B, coder.rref(B), size_B, hostToDevice);  
             assert(error == cudaSuccessV, 'cudaMemcpy(B) failed');  
               
             % Define type and size for result  
             C = zeros(m, n, 'single');  
               
             error = coder.ceval('cudaMemcpy', ...  
                 d_C, coder.rref(C), size_C, hostToDevice);  
             assert(error == cudaSuccessV, 'cudaMemcpy(C) failed');  
               
             % Define handle variables for external library  
             handle = coder.opaque('cublasHandle_t');  
             blasSuccess = coder.opaque('cublasStatus_t', ...  
                 'CUBLAS_STATUS_SUCCESS');  
               
             % Initialize external library  
             ret = coder.opaque('cublasStatus_t');  
             ret = coder.ceval('cublasCreate', coder.wref(handle));  
             assert(ret == blasSuccess, 'cublasCreate failed');  
               
              
             TRANSA = coder.opaque('cublasOperation_t', ...  
                 'CUBLAS_OP_N');  
             alpha = single(1);  
             beta = single(0);  
               
             % Multiply matrices on GPU  
             ret = coder.ceval('cublasSgemm', handle, ...  
                 TRANSA,TRANSA,m,n,k, ...  
                 coder.rref(alpha),d_A,m, ...  
                 d_B,k, ...  
                 coder.rref(beta),d_C,m);  % ldc = m, the row count of C
               
             assert(ret == blasSuccess, 'cublasSgemm failed');  
               
             % Copy result back to local host  
             deviceToHost = coder.opaque('cudaMemcpyKind', ...  
                 'cudaMemcpyDeviceToHost');  
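             error = coder.ceval('cudaMemcpy', coder.wref(C), ...
                 d_C, size_C, deviceToHost);
             assert(error == cudaSuccessV, 'cudaMemcpy(C) failed');

             % Release the GPU buffers and the CUBLAS handle so that
             % repeated calls do not leak device resources
             coder.ceval('cudaFree', d_A);
             coder.ceval('cudaFree', d_B);
             coder.ceval('cudaFree', d_C);
             coder.ceval('cublasDestroy', handle);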
               
         end  
     end  

    end

end

  2. To perform the matrix multiplication using the interface defined in the method GPU_MatrixMultiply and the build information in ExternalLib_API, include the following line in your MATLAB code:
      C = ExternalLib_API.GPU_MatrixMultiply(A,B);
     For instance, you can define a MATLAB function Matrix_Multiply that performs only this matrix multiplication.
      function C = Matrix_Multiply(A, B) %#codegen
      C = ExternalLib_API.GPU_MatrixMultiply(A,B);

  3. Define a MEX configuration object using coder.config. To use the CUBLAS libraries, set the target language for code generation to C++.
      cfg = coder.config('mex');
      cfg.TargetLang = 'C++';

  4. Generate code for Matrix_Multiply using cfg as the configuration object and two 2-by-2 matrices of type single as arguments. Because cublasSgemm supports matrix multiplication for data type float, the corresponding MATLAB matrices must have type single.
      codegen -config cfg Matrix_Multiply ...
          -args {ones(2,'single'),ones(2,'single')}

  5. Test the generated MEX function Matrix_Multiply_mex using two 2-by-2 identity matrices of type single.
      Matrix_Multiply_mex(eye(2,'single'),eye(2,'single'))
     The output is also a 2-by-2 identity matrix.
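
As a further check, one could compare the MEX output against MATLAB's built-in multiplication for less trivial inputs. The sketch below assumes that Matrix_Multiply_mex was generated with the 2-by-2 single arguments shown in step 4, so it accepts only inputs of that size, and that it is on the MATLAB path. The tolerance is an illustrative choice, not a documented bound. If the MEX file is absent, the script falls back to the host result so that it still runs.

```matlab
% Compare the generated MEX function with MATLAB's own multiplication.
A = rand(2, 'single');
B = rand(2, 'single');
expected = A*B;                        % host reference result

if exist('Matrix_Multiply_mex', 'file') == 3   % 3 means a MEX file was found
    actual = Matrix_Multiply_mex(A, B);
else
    actual = expected;                 % stand-in when no MEX file is available
end

% Allow a small single-precision rounding difference between CPU and GPU
tol = 10 * eps('single') * max(abs(expected(:)));
maxDiff = max(abs(actual(:) - expected(:)));
assert(maxDiff <= tol, 'MEX result differs from the MATLAB result');
```

Random inputs exercise every element of the product, whereas identity matrices would hide an error in operand order or in a leading dimension.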

See Also

coder.ceval | coder.opaque | coder.rref | coder.wref | assert | coder.ExternalDependency | coder.BuildConfig