parpool - Create parallel pool on cluster - MATLAB
Create parallel pool on cluster
Syntax
Description
parpool starts a parallel pool of workers using the default profile. With default settings, MATLAB® starts a pool on the local machine with one worker per physical CPU core up to the limit set in the default profile. For more information on parallel settings, see Specify Your Parallel Settings.
In general, the pool size is specified by the PreferredPoolNumWorkers property of the default profile. For all factors that can affect your pool size, see Factors That Affect Pool Size.
parpool enables the full functionality of the parallel language features in MATLAB by creating a special job on a pool of workers, and connecting the MATLAB client to the parallel pool. Parallel language features include parfor, parfeval, parfevalOnAll, spmd, and distributed. If possible, the working folder on the workers is set to match that of the MATLAB client session.
parpool(poolsize) creates and returns a pool with the specified number of workers. poolsize can be a positive integer or a range specified as a 2-element vector of integers. If poolsize is a range, the resulting pool is as large as possible within the requested range.
Specifying the poolsize overrides any value specified in the PreferredPoolNumWorkers property, and starts a pool of exactly that number of workers, even if it has to wait for them to be available. Most clusters have a maximum number of workers they can start. If the profile specifies a MATLAB Job Scheduler cluster, parpool reserves its workers from among those already running and available under that MATLAB Job Scheduler. If the profile specifies a local or third-party scheduler, parpool instructs the scheduler to start the workers for the pool.
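As a minimal sketch of the two forms of poolsize described above, the following requests first a fixed size and then a range. It assumes that no pool is currently open (only one pool can be open at a time) and that the default profile allows at least two workers; the variable names are illustrative only.

```matlab
% Request exactly two workers from the default profile.
pool = parpool(2);
fixedSize = pool.NumWorkers;      % number of workers actually started
delete(pool)

% Request between 2 and 4 workers; the pool is as large as possible
% within that range.
pool = parpool([2 4]);
rangeSize = pool.NumWorkers;
delete(pool)
```

Deleting the first pool before requesting the second avoids the error that parpool raises when a pool already exists.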
parpool(resources) or parpool(resources,poolsize) starts a worker pool on the resources specified by resources.
parpool(___,Name=Value) applies the specified values for certain properties when starting the pool.
poolobj = parpool(___) returns a parallel.Pool object to the client workspace representing the pool on the cluster. You can use the pool object to programmatically delete the pool or to access its properties. Use delete(poolobj) to shut down the parallel pool.
Examples
Start a parallel pool using the default profile to define the number of workers. With default settings, the default pool is on the local machine.
You can create pools on different types of parallel environments on your local machine.
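The following sketch illustrates the two local environments named later on this page, "Processes" and "Threads". It assumes no pool is open when it starts, and it deletes each pool before creating the next because only one pool can exist at a time; the variable names are illustrative.

```matlab
% Start a pool of process workers on the local machine.
pool = parpool("Processes");
procWorkers = pool.NumWorkers;
delete(pool)

% Start a pool of thread workers on the local machine.
pool = parpool("Threads");
threadWorkers = pool.NumWorkers;
delete(pool)
```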
Start a parallel pool of 16 workers using a profile called myProf.
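A sketch of this call might look as follows. It assumes that a profile named myProf has already been created or imported and that its cluster can supply 16 workers. The profile check uses parallel.listProfiles, which is assumed to be available in your release; the guard is an addition for illustration, not part of the original example.

```matlab
profileName = "myProf";   % profile name from the text; must exist on your system
hasProfile = any(strcmp(profileName,parallel.listProfiles));
if hasProfile
    % Start a 16-worker pool on the cluster described by the profile.
    pool = parpool(profileName,16);
    disp(pool.NumWorkers)
else
    fprintf("Profile %s not found; create or import it first.\n",profileName);
end
```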
Create an object representing the cluster identified by the default profile, and use that cluster object to start a parallel pool. The pool size is determined by the default profile.
c = parcluster
parpool(c)
Start a parallel pool with the default profile, and pass two code files to the workers.
parpool(AttachedFiles=["mod1.m","mod2.m"])
If you have access to several GPUs, you can perform your calculations on multiple GPUs in parallel using a parallel pool.
To determine the number of GPUs that are available for use in MATLAB, use the gpuDeviceCount function.
availableGPUs = gpuDeviceCount("available")
Start a parallel pool with as many workers as available GPUs. For best performance, MATLAB assigns a different GPU to each worker by default.
parpool("Processes",availableGPUs);
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 3).
To identify which GPU each worker is using, call gpuDevice inside an spmd block. The spmd block runs gpuDevice on every worker.
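One way to write this step is sketched below. It reuses the current pool through gcp and skips the query when no GPU is available; the guard and the formatted printout are illustrative additions rather than part of the original example.

```matlab
numGPUs = gpuDeviceCount("available");
if numGPUs > 0
    pool = gcp;   % reuse the running pool, or start one per your parallel settings
    spmd
        dev = gpuDevice;   % the device currently selected on this worker
        fprintf("Worker %d uses GPU %d (%s)\n",spmdIndex,dev.Index,dev.Name);
    end
end
```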
Use parallel language features, such as parfor or parfeval, to distribute your computations to workers in the parallel pool. If you use gpuArray enabled functions in your computations, these functions run on the GPU of the worker. For more information, see Run MATLAB Functions on a GPU. For an example, see Run MATLAB Functions on Multiple GPUs.
When you are done with your computations, shut down the parallel pool. You can use the gcp function to obtain the current parallel pool.
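A short sketch of this shutdown step, using gcp with the "nocreate" option so that no new pool is started just to be deleted:

```matlab
pool = gcp("nocreate");   % empty if no pool is currently running
if ~isempty(pool)
    delete(pool)          % shut down the pool and release the workers
end
```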
If you want to use a different choice of GPUs, then you can use gpuDevice to select a particular GPU on each worker, using the GPU device index. You can obtain the index of each GPU device in your system using the gpuDeviceCount function.
Suppose you have three GPUs available in your system, but you want to use only two for a computation. Obtain the indices of the devices.
[availableGPUs,gpuIndx] = gpuDeviceCount("available")
Define the indices of the devices you want to use.
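The listing for this step is not shown here. As one possible sketch, you could take the first two available indices; which devices to pick is an assumption made only for illustration, and you can substitute any indices returned by gpuDeviceCount.

```matlab
% Query the available devices again so that this snippet stands alone.
[~,gpuIndx] = gpuDeviceCount("available");

% Illustrative choice: keep at most the first two available device indices.
useGPUs = gpuIndx(1:min(2,numel(gpuIndx)));
```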
Start your parallel pool. Use an spmd block and gpuDevice to associate each worker with one of the GPUs you want to use, using the device index. The spmdIndex function identifies the index of each worker.
parpool("Processes",numel(useGPUs));
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 2).
spmd
    gpuDevice(useGPUs(spmdIndex));
end
As a best practice, and for best performance, assign a different GPU to each worker.
When you are done with your computations, shut down the parallel pool.
Create a parallel pool with the default profile, and later delete the pool.
poolobj = parpool;
delete(poolobj)
Find the number of workers in the current parallel pool.
poolobj = gcp("nocreate"); % If no pool, do not create new one.
if isempty(poolobj)
    poolsize = 0;
else
    poolsize = poolobj.NumWorkers
end
Input Arguments
poolsize - Size of the parallel pool, specified as a positive integer or a range specified as a 2-element vector of integers. If poolsize is a range, the resulting pool is as large as possible within the requested range. Set the default preferred number of workers in the cluster profile.
parpool supports pools with up to 2000 workers. (since R2024a)
Example: parpool("Processes",2)
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
resources - Resources to start the pool on, specified as "Processes", "Threads", a cluster profile name, or a cluster object.
- "Processes" – Starts a pool of process workers on the local machine. For more information on process-based environments, see Choose Between Thread-Based and Process-Based Environments.
- "Threads" – Starts a pool of thread workers on the local machine. For more information on thread-based environments, see Choose Between Thread-Based and Process-Based Environments.
- Profile name – Starts a pool on the cluster specified by the profile. For more information on cluster profiles, see Discover Clusters and Use Cluster Profiles.
- Cluster object – Starts a pool on the cluster specified by the cluster object. Use parcluster to get a cluster object.
Example: parpool("Processes")
Example: parpool("Threads")
Example: parpool("myClusterProfile",16)
Example: c = parcluster; parpool(c)
Data Types: char | string | parallel.Cluster
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: AttachedFiles="myFun.m"
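To make the two syntaxes concrete, here is a sketch that sets the IdleTimeout argument both ways. It assumes no pool is open beforehand, and the 60-minute value is arbitrary.

```matlab
% R2021a and later: Name=Value syntax.
pool = parpool(IdleTimeout=60);
delete(pool)

% Before R2021a: quoted name and value separated by a comma.
% Later releases accept this form as well.
pool = parpool("IdleTimeout",60);
timeoutMinutes = pool.IdleTimeout;   % read the setting back from the pool object
delete(pool)
```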
AdditionalPaths (since R2025a) - Paths to add to the MATLAB search path of the workers at the time of pool creation, specified as a character vector, string or string array, or cell array of character vectors.
The default search path might not be the same on the workers as it is on the client; the path difference could be the result of different current working folders (cwd), platforms, or network file system access. Specifying the AdditionalPaths name-value argument helps ensure that workers look for files, such as code files, data files, or model files, in the correct locations.
You can use AdditionalPaths to access files in a shared file system. Note that path representations can vary depending on the target machines. AdditionalPaths must be the paths as seen by the machines in the cluster. For example, if Z:\data on your local Windows® machine is /network/data to your Linux® cluster, then add the latter to AdditionalPaths. If you use a datastore, use 'AlternateFileSystemRoots' instead to deal with other representations. For more information, see Set Up Datastore for Processing on Different Machines or Clusters.
Note that AdditionalPaths only helps to find files when you refer to them using a relative path or file name, and not an absolute path.
Example: "/network/data/"
Data Types: char | string | cell
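The following sketch shows how AdditionalPaths might be used with a Linux cluster whose shared folder is /network/data/. The profile name myClusterProfile and the folder are placeholders borrowed from the examples on this page, not real resources, and the profile check with parallel.listProfiles is an illustrative guard.

```matlab
profileName = "myClusterProfile";   % hypothetical cluster profile
sharedFolder = "/network/data/";    % path as seen by the cluster machines
hasProfile = any(strcmp(profileName,parallel.listProfiles));
if hasProfile
    pool = parpool(profileName,AdditionalPaths=sharedFolder);
    % Ask one worker for its search path to see whether the folder was added.
    f = parfeval(pool,@path,1);
    workerPath = fetchOutputs(f);
    disp(contains(workerPath,"/network/data"))
    delete(pool)
end
```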
AttachedFiles - Files to attach to the pool, specified as a character vector, string or string array, or cell array of character vectors.
parpool starts a parallel pool and passes the identified files to the workers in the pool. The files specified here are appended to the AttachedFiles property specified in the applicable parallel profile to form the complete list of attached files. The AttachedFiles property name is case sensitive, and must appear as shown.
Example: ["myFun.m","myFun2.m"]
Data Types: char | cell
AutoAddClientPath - Flag to specify if user-added entries on the client path are added to the path of each worker at startup, specified as a logical value.
Data Types: logical
EnvironmentVariables - Names of environment variables to copy from the client session to the workers, specified as a character vector, string or string array, or cell array of character vectors. The names specified here are appended to the EnvironmentVariables property specified in the applicable parallel profile to form the complete list of environment variables. Any listed variables that are not set on the client are not copied to the workers. These environment variables are set on the workers for the duration of the parallel pool.
Data Types: char | cell
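As an illustration of the behavior described above, the sketch below sets a variable on the client, names it when creating the pool, and reads it back on a worker. The variable name MY_DATA_DIR and its value are made up for this example, and the sketch assumes no pool is open beforehand.

```matlab
% Set a (hypothetical) environment variable in the client session.
setenv("MY_DATA_DIR","/tmp/data");

% Copy the variable to the workers when the pool starts.
pool = parpool("Processes",2,EnvironmentVariables="MY_DATA_DIR");

% Read the variable back on one of the workers.
f = parfeval(pool,@getenv,1,'MY_DATA_DIR');
workerValue = fetchOutputs(f);
disp(workerValue)

delete(pool)
```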
SpmdEnabled - Flag to specify if spmd support is enabled on the pool, specified as a logical value. You can disable support only on a local or MATLAB Job Scheduler cluster. parfor iterations do not involve communication between workers. Therefore, if SpmdEnabled is false, a parfor-loop continues even if one or more workers abort during loop execution.
Data Types: logical
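A minimal sketch of a pool with spmd support disabled, used only for a parfor-loop. The pool size and the summation are arbitrary choices for illustration, and the sketch assumes no pool is open beforehand.

```matlab
% Create a local pool of process workers without spmd support.
pool = parpool("Processes",2,SpmdEnabled=false);

% parfor needs no communication between workers, so it still runs.
total = 0;
parfor k = 1:100
    total = total + k;   % reduction variable accumulated across iterations
end
disp(total)

delete(pool)
```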
IdleTimeout - Time in minutes after which the pool shuts down if idle, specified as an integer greater than zero. A pool is idle if it is not running code on the workers. By default, the IdleTimeout property value is the same as the value in your parallel settings. For more information on parallel settings, see Specify Your Parallel Settings.
Example: pool = parpool(IdleTimeout=120)
Tips
- The pool status indicator in the lower-left corner of the desktop shows the client session connection to the pool and the pool status. Click the icon for a menu of supported pool actions.
- If you set your parallel settings to automatically create a parallel pool when necessary, you do not need to explicitly call the parpool command. You might explicitly create a pool to control when you incur the overhead time of setting it up, so the pool is ready for subsequent parallel language constructs.
- delete(poolobj) shuts down the parallel pool. Without a parallel pool, spmd and parfor run as a single thread in the client, unless your parallel settings are set to automatically start a parallel pool for them.
- When you use the MATLAB editor to update files on the client that are attached to a parallel pool, those updates automatically propagate to the workers in the pool. (This automatic updating does not apply to Simulink® model files. To propagate updated model files to the workers, use the updateAttachedFiles function.)
- If possible, MATLAB initially sets the working folder on the workers to match that of the MATLAB client session. Subsequently, if you run the following commands on the client, MATLAB also executes the command on all the workers in the pool:
- cd
- addpath
- rmpath
This behavior allows you to set the working folder and the command search path on all the workers, so that subsequent pool activities such as parfor-loops execute in the proper context.
When changing folders or adding a path with cd or addpath on clients with Windows operating systems, the value sent to the workers is the UNC path for the folder if possible. For clients with Linux operating systems, it is the absolute folder location.
If any of these commands does not work on the client, it is not executed on the workers either. For example, if addpath specifies a folder that the client cannot access, the addpath command is not executed on the workers. However, if the working folder can be set on the client, but cannot be set as specified on any of the workers, you do not get an error message returned to the client Command Window.
Be careful of this slight difference in behavior in a mixed-platform environment where the client is not the same platform as the workers, where folders local to or mapped from the client are not available in the same way to the workers, or where folders are in a nonshared file system. For example, if you have a MATLAB client running on a Microsoft® Windows operating system while the MATLAB workers are all running on Linux operating systems, the same argument to addpath cannot work on both. In this situation, you can use the function pctRunOnAll to ensure that a command runs on all the workers.
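A sketch of the pctRunOnAll approach for the mixed-platform case might look like this. The folder /network/share/code is a placeholder for a path that exists on the Linux workers. Because pctRunOnAll also runs the command on the client, a Windows client is expected to issue a warning for the unknown folder rather than add it.

```matlab
% Make sure a pool is running before broadcasting the command.
pool = gcp("nocreate");
if isempty(pool)
    pool = parpool;
end

% Run addpath on every worker (and on the client) with the workers' view
% of the shared folder. The folder name is hypothetical.
pctRunOnAll('addpath("/network/share/code")')
```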
Another difference between client and workers is that any addpath arguments that are part of the matlabroot folder are not set on the workers. The assumption is that the MATLAB install base is already included in the workers' paths. The rules for addpath regarding workers in the pool are:
- Subfolders of the matlabroot folder are not sent to the workers.
- Any folders that appear before the first occurrence of a matlabroot folder are added to the top of the path on the workers.
- Any folders that appear after the first occurrence of a matlabroot folder are added after the matlabroot group of folders on the workers' paths.
For example, suppose that matlabroot on the client is C:\Applications\matlab\. With an open parallel pool, execute the following to set the path on the client and all workers:
addpath("P1", ...
    "P2", ...
    "C:\Applications\matlab\T3", ...
    "C:\Applications\matlab\T4", ...
    "P5", ...
    "C:\Applications\matlab\T6", ...
    "P7", ...
    "P8");
Because T3, T4, and T6 are subfolders of matlabroot, they are not set on the workers' paths. So on the workers, the pertinent part of the path resulting from this command is:
P1
P2
<worker original matlabroot folders...>
P5
P7
P8
- If you are using Macintosh or Linux, and see problems during large parallel pool creation, see Recommended System Limits for Macintosh and Linux.
Version History
Introduced in R2013b
Add folders to the MATLAB search path of workers in the parallel pool using the AdditionalPaths name-value argument to ensure that workers look for files in the correct locations.
Starting in R2024a, parpool supports pools with up to 2000 workers. Before R2024a, parpool supported pools with up to 1000 workers.
Starting in R2022b, the local profile has been renamed to Processes. There are no plans to remove local. To start a parallel pool of process workers on the local machine, use Processes instead.
Starting in R2022b, you can specify the pool size of a thread-based parallel pool using the parpool(poolsize) syntax.