batch
Run MATLAB script or function on worker
Syntax
Description
j = batch(script)
runs the script file script on a worker in the cluster specified by the default cluster profile. (Note: Do not include the .m file extension with the script name.) The function returns j, a handle to the job object that runs the script. The script file script is copied to the worker.
By default, workspace variables are copied from the client to workers when you run batch(script). Job and task objects are not copied to workers.
j = batch(expression)
runs expression as an expression on a worker in the cluster specified by the default cluster profile. The function returns j, a handle to the job object that runs the expression.
By default, workspace variables are copied from the client to workers when you run batch(expression). Job and task objects are not copied to workers.
j = batch(myCluster,script)
is identical to batch(script) except that the script runs on a worker in the cluster specified by the cluster object myCluster.
j = batch(myCluster,expression)
is identical to batch(expression) except that the expression runs on a worker in the cluster specified by the cluster object myCluster.
j = batch(fcn,N,{x1,...,xn})
runs the function fcn on a worker in the cluster specified by the default cluster profile. The function returns j, a handle to the job object that runs the function. The function is evaluated with the given arguments, x1,...,xn, and returns N output arguments. The function file for fcn is copied to the worker. (Note: Do not include the .m file extension with the function name argument.)
j = batch(myCluster,fcn,N,{x1,...,xn})
is identical to batch(fcn,N,{x1,...,xn}) except that the function runs on a worker in the cluster specified by the cluster object myCluster.
j = batch(___,Name,Value)
specifies options that modify the behavior of a job using one or more name-value arguments. These options support batch for functions and scripts, unless otherwise indicated. Use this syntax in addition to any of the input argument combinations in previous syntaxes.
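As a rough illustration of the syntaxes described above, the following sketch submits an expression with the default cluster profile and then a function on an explicit cluster object. It assumes Parallel Computing Toolbox and a working default cluster profile; the built-in function magic stands in for your own code, and the variable names are arbitrary.

```matlab
% Sketch only: assumes a usable default cluster profile.

% Expression syntax, default cluster profile
jExpr = batch('y = magic(3)');
wait(jExpr);                     % block until the job finishes
load(jExpr,'y');                 % copy y from the job into this workspace
delete(jExpr);                   % clean up job data

% Function syntax on an explicit cluster object:
% one output, inputs passed as a cell array
c = parcluster();
jFcn = batch(c,@magic,1,{4});
wait(jFcn);
out = fetchOutputs(jFcn);        % cell array with one element per output
M = out{1};
delete(jFcn);
```

The function form does not copy the client workspace, which is why it is usually the cheaper choice when the workspace is large.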
Examples
Run Script as Batch Job
This example shows how to use batch to offload work to a MATLAB® worker session that runs in the background.
You can continue using MATLAB while computations take place.
Run a script as a batch job by using the batch function. By default, batch uses your default cluster profile. Check your default cluster profile on the MATLAB Home tab, in the Environment section, in Parallel > Select Parallel Environment. Alternatively, you can specify a cluster profile with the 'Profile' name-value pair argument.
batch does not block MATLAB and you can continue working while computations take place.
If you want to block MATLAB until the job finishes, use the wait function on the job object.
By default, MATLAB saves the Command Window output from the batch job to the diary of the job. To retrieve it, use the diary function.
--- Start Diary ---
n = 100
--- End Diary ---
After the job finishes, fetch the results by using the load function.
If you want to load all the variables in the batch job, use load(job) instead.
When you have loaded all the required variables, delete the job object to clean up its data and avoid consuming resources unnecessarily.
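The steps in this example can be sketched end to end as follows. This is a minimal sketch, not the original example code: the script name myBatchScript and its contents are hypothetical and are written to the current folder only so that the snippet is self-contained. It assumes a working default cluster profile.

```matlab
% Hypothetical script, created here so the sketch is self-contained.
scriptName = 'myBatchScript';
fid = fopen([scriptName '.m'],'w');
fprintf(fid,'n = 100\nx = sum(1:n);\n');
fclose(fid);

job = batch(scriptName);   % submit with the default cluster profile (no .m extension)
wait(job);                 % block the client until the job finishes
diary(job)                 % display the Command Window output captured on the worker
load(job,'x');             % load only the variable x from the job
delete(job);               % remove the job and its data from the cluster
clear job
```

Loading a single named variable with load(job,'x') keeps the transfer small; load(job) would bring back every variable in the job workspace.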
Note that if you send a script file using batch, MATLAB transfers all the workspace variables to the cluster, even if your script does not use them. The data transfer time for a large workspace can be substantial. As a best practice, convert your script to a function file to avoid this communication overhead. For an example that uses a function, see Run Batch Job and Access Files from Workers.
For more advanced options with batch, see Run Batch Job and Access Files from Workers.
Run Batch Job and Access Files from Workers
You can offload your computations to run in the background by using batch.
If your code needs access to files, you can use additional options, such as 'AttachedFiles' or 'AdditionalPaths', to make the data accessible. You can continue working in MATLAB while the computations take place. If you submit your computations to a remote cluster, you can close MATLAB and recover the results later.
Prepare Example
Use the supporting function prepareSupportingFiles to copy the required data for this example to your current working folder.
Your current working folder now contains 4 files: A.dat, B1.dat, B2.dat, and B3.dat.
Run Batch Job
Create a cluster object using parcluster. By default, parcluster uses your default cluster profile. Check your default cluster profile on the MATLAB Home tab, in the Environment section, in Parallel > Select a Default Cluster.
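A minimal sketch of this step, assuming a default cluster profile is configured. The variable name c matches the batch calls later in this example; the two disp calls are only there to show which profile was picked up.

```matlab
c = parcluster();      % cluster object built from the default cluster profile
disp(c.Profile)        % name of the profile in use
disp(c.NumWorkers)     % number of workers the cluster can provide
```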
Place your code inside a function and submit it as a batch job by using batch. For an example of a custom function, see the supporting function divideData. Specify the expected number of output arguments and a cell array with inputs to the function.
Note that if you send a script file using batch, MATLAB transfers all the workspace variables to the cluster, even if your script does not use them. A large workspace increases the data transfer time. As a best practice, convert your script to a function file to avoid this communication overhead. You can do this by simply adding a function line at the beginning of your script. To reduce overhead in this example, divideData is defined in a file outside of this live script.
If your code uses a parallel pool, use the 'Pool' name-value pair argument to create a parallel pool with the number of workers that you specify. batch uses an additional worker to run the function itself.
By default, batch changes the initial working folder of the workers to the current folder of the MATLAB client. It can be useful to control the initial working folder in the workers. For example, you might want to control it if your cluster uses a different filesystem, and therefore the paths are different, such as when you submit from a Windows client machine to a Linux cluster.
- To keep the initial working folder of the workers and use their default, set 'CurrentFolder' to '.'.
- To change the initial working folder, set 'CurrentFolder' to a folder of your choice.
This example uses a parallel pool with three workers and chooses a temporary location for the initial working folder. Use batch to offload the computations in divideData.
job = batch(c,@divideData,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir);
batch runs divideData on a parallel worker, so you can continue working in MATLAB while computations take place.
If you want to block MATLAB until the job completes, use the wait function on the job object.
To retrieve the results, use fetchOutputs on the job object. Because divideData depends on a file that the workers cannot find, fetchOutputs throws an error. You can access error information by using getReport on the Error property of the Task objects in the job.
getReport(job.Tasks(1).Error)
ans =
    'Error using divideData (line 4)
     Unable to read file 'B2.dat'. No such file or directory.'
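The same inspection pattern can be tried without the supporting files. The sketch below deliberately submits a task that fails (a call to the built-in error function with a made-up identifier and message) and then reads the task error. It assumes a working default cluster profile and is not part of the original example.

```matlab
% Submit a task that is guaranteed to fail (illustrative identifier and message).
failingJob = batch(@error,0,{'demo:fail','Intentional failure for illustration.'});
wait(failingJob);                          % returns once the job has finished

taskError = failingJob.Tasks(1).Error;     % MException; empty if the task succeeded
if ~isempty(taskError)
    disp(getReport(taskError));            % formatted message and stack
end
delete(failingJob);
```

Checking the Error property before calling fetchOutputs lets you report the failure without a try/catch block around the fetch.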
Access Files from Workers
By default, batch automatically analyzes your code and transfers required files to the workers. In some cases, you must explicitly transfer those files, for example, when you determine the name of a file at runtime.
In this example, divideData accesses the supporting file A.dat, which batch automatically detects and transfers. The function also accesses B1.dat, B2.dat, and B3.dat, but it resolves the names of these files at runtime, so the automatic dependency analysis does not detect them.
function X = divideData()
    A = load("A.dat");
    X = zeros(flip(size(A)));
    parfor i = 1:3
        B = load("B" + i + ".dat");
        X = X + A\B;
    end
end
If the data is in a location that the workers can access, you can use the name-value pair argument 'AdditionalPaths' to specify the location. 'AdditionalPaths' adds this path to the MATLAB search path of the workers and makes the data visible to them.
pathToData = pwd;
job(2) = batch(c,@divideData,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AdditionalPaths',pathToData);
wait(job(2));
If the data is in a location that the workers cannot access, you can transfer files to the workers by using the 'AttachedFiles' name-value pair argument. You need to transfer files if the client and workers do not share the same file system, or if your cluster uses the generic scheduler interface in nonshared mode. For more information, see Configure Using the Generic Scheduler Interface (MATLAB Parallel Server).
filenames = "B" + string(1:3) + ".dat";
job(3) = batch(c,@divideData,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AttachedFiles',filenames);
Find Existing Job
If you submit the job to a remote cluster, you can close MATLAB after job submission and retrieve the results later. Before you close MATLAB, make a note of the job ID.
When you open MATLAB again, you can find the job by using the findJob function.
job(3) = findJob(c,'ID',job3ID);
wait(job(3));
Alternatively, you can use the Job Monitor to track your job. You can open it from the MATLAB Home tab, in the Environment section, in Parallel > Monitor Jobs.
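The note-the-ID-and-find-it-later pattern can be sketched in a single session as follows. The built-in function rand stands in for real work and the variable names are arbitrary; in practice the ID would be written down or saved before closing MATLAB. The sketch assumes a working default cluster profile.

```matlab
c = parcluster();
job = batch(c,@rand,1,{2});      % placeholder computation: a 2-by-2 random matrix
jobID = job.ID;                  % note the ID before discarding the handle
clear job                        % simulate losing the handle (e.g. a new session)

job = findJob(c,'ID',jobID);     % recover the job object from the cluster
wait(job);
out = fetchOutputs(job);
delete(job);
```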
Retrieve Results and Clean Up Data
To retrieve the results of a batch job, use the fetchOutputs function. fetchOutputs returns a cell array with the outputs of the function run with batch.
X = 1×1 cell array
    {40×207 double}
When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.
Input Arguments
script — MATLAB® script
character vector | string scalar
MATLAB script, specified as a character vector or string scalar.
By default, workspace variables are copied from the client to workers when you specify this argument. Job and task objects are not copied to workers.
Example: batch('aScript');
Data Types: char | string
expression — Expression to evaluate
character vector | string scalar
Expression to evaluate, specified as a character vector or string scalar.
By default, workspace variables are copied from the client to workers when you specify this argument. Job and task objects are not copied to workers.
Example: batch('y = magic(3)');
Data Types: char | string
myCluster — Cluster
parallel.Cluster object
Cluster, specified as a parallel.Cluster object that represents cluster compute resources. To create the object, use the parcluster function.
Example: cluster = parcluster; batch(cluster,'aScript');
Data Types: parallel.Cluster
fcn — Function to be evaluated by the worker
function handle | character vector
Function to be evaluated by the worker, specified as a function handle or function name.
Example: batch(@myFunction,1,{x,y});
Data Types: char | string | function_handle
N — Number of outputs
nonnegative integer
Number of outputs expected from the evaluated function fcn, specified as a nonnegative integer.
Example: batch(@myFunction,1,{x,y});
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
{x1,...,xn} — Input arguments
cell array
Input arguments to the function fcn, specified as a cell array.
Example: batch(@myFunction,1,{x,y});
Data Types: cell
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: j = batch(@myFunction,1,{x,y},'Pool',3);
Workspace — Variables to copy to workers
structure scalar
Variables to copy to workers, specified as the comma-separated pair consisting of 'Workspace' and a structure scalar.
The default value is a structure scalar with fields corresponding to variables in the client workspace. Specify variables as fields in the structure scalar.
Workspace variables are only copied from the client to workers if you specify script or expression. Job and task objects are not copied to workers.
Example: workspace.myVar = 5; j = batch('aScript','Workspace',workspace);
Data Types: struct
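As a small sketch of how 'Workspace' limits what is sent to the worker, the following passes only two variables to an expression instead of the whole client workspace. The names ws, a, b, and p are illustrative, and a working default cluster profile is assumed.

```matlab
ws = struct('a',2,'b',5);                 % only these variables reach the worker
j = batch('p = a*b','Workspace',ws);      % expression evaluated against ws
wait(j);
load(j,'p');                              % p is 10
delete(j);
```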
Profile — Cluster profile
character vector | string
Cluster profile used to identify the cluster, specified as the comma-separated pair consisting of 'Profile' and a character vector or string. If this option is omitted, the default profile is used to identify the cluster and is applied to the job and task properties.
Example: j = batch('aScript','Profile','Processes');
Data Types: char | string
AdditionalPaths — Paths to add to workers
character vector | string array | cell array of character vectors
Paths to add to the MATLAB search path of the workers before the script or function executes, specified as the comma-separated pair consisting of 'AdditionalPaths' and a character vector, string array, or cell array of character vectors.
The default search path might not be the same on the workers as it is on the client; the path difference could be the result of different current working folders (cwd), platforms, or network file system access. Specifying the 'AdditionalPaths' name-value argument helps ensure that workers look for files, such as code files, data files, or model files, in the correct locations.
You can use 'AdditionalPaths' to access files in a shared file system. Note that path representations can vary depending on the target machines. 'AdditionalPaths' must be the paths as seen by the machines in the cluster. For example, if Z:\data on your local Windows® machine is /network/data to your Linux® cluster, then add the latter to 'AdditionalPaths'. If you use a datastore, use 'AlternateFileSystemRoots' instead to deal with other representations. For more information, see Set Up Datastore for Processing on Different Machines or Clusters.
Note that AdditionalPaths only helps to find files when you refer to them using a relative path or file name, and not an absolute path.
Example: j = batch(@myFunction,1,{x,y},'AdditionalPaths','/network/data/');
Data Types: char | string | cell
AttachedFiles — Files or folders to transfer
character vector | string array | cell array of character vectors
Files or folders to transfer to the workers, specified as the comma-separated pair consisting of 'AttachedFiles' and a character vector, string array, or cell array of character vectors.
Example: j = batch(@myFunction,1,{x,y},'AttachedFiles','myData.dat');
Data Types: char | string | cell
AutoAddClientPath — Flag to add user-added entries on client path to worker path
true (default) | false
Flag to add user-added entries on the client path to worker paths, specified as the comma-separated pair consisting of 'AutoAddClientPath' and a logical value.
Example: j = batch(@myFunction,1,{x,y},'AutoAddClientPath',false);
Data Types: logical
AutoAttachFiles — Flag to enable dependency analysis
true (default) | false
Flag to enable dependency analysis and automatically attach code files to the job, specified as the comma-separated pair consisting of 'AutoAttachFiles' and a logical value. If you set the value to true, the batch script or function is analyzed and the code files that it depends on are automatically transferred to the workers.
Example: j = batch(@myFunction,1,{x,y},'AutoAttachFiles',true);
Data Types: logical
CurrentFolder — Folder in which the script or function executes
character vector | string
Folder in which the script or function executes, specified as the comma-separated pair consisting of 'CurrentFolder' and a character vector or string. There is no guarantee that this folder exists on the worker. The default value for this property is the current directory of MATLAB when the batch command is executed. If the argument is '.', there is no change in folder before batch execution.
Example: j = batch(@myFunction,1,{x,y},'CurrentFolder','.');
Data Types: char | string
CaptureDiary — Flag to collect the diary
true (default) | false
Flag to collect the diary from the function call, specified as the comma-separated pair consisting of 'CaptureDiary' and a logical value. For information on the collected data, see diary.
Example: j = batch('aScript','CaptureDiary',false);
Data Types: logical
EnvironmentVariables — Environment variables to copy
character vector | string array | cell array of character vectors
Environment variables to copy from the client session to the workers, specified as the comma-separated pair consisting of 'EnvironmentVariables' and a character vector, string array, or cell array of character vectors. The names specified here are appended to the EnvironmentVariables property specified in the applicable parallel profile to form the complete list of environment variables. Listed variables that are not set are not copied to the workers. These environment variables are set on the workers for the duration of the batch job.
Example: j = batch('aScript','EnvironmentVariables',"MY_ENV_VAR");
Data Types: char | string | cell
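One way to check that a variable reaches the worker is to have the worker read it back. This sketch sets a hypothetical variable MY_ENV_VAR on the client, forwards it with 'EnvironmentVariables', and runs the built-in getenv on the worker. It assumes a working default cluster profile.

```matlab
setenv('MY_ENV_VAR','hello');     % hypothetical variable, set on the client
j = batch(@getenv,1,{'MY_ENV_VAR'}, ...
    'EnvironmentVariables','MY_ENV_VAR');
wait(j);
out = fetchOutputs(j);
value = out{1};                   % value of MY_ENV_VAR as seen by the worker
delete(j);
```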
Pool — Number of workers to make into a parallel pool
0 (default) | nonnegative integer | 2-element vector of nonnegative integers
Number of workers to make into a parallel pool, specified as the comma-separated pair consisting of 'Pool' and either:
- A nonnegative integer.
- A 2-element vector of nonnegative integers, which is interpreted as a range. The size of the resulting parallel pool is as large as possible in the range requested.
In addition, note that batch uses another worker to run the batch job itself.
The script or function uses this pool to execute statements such as parfor and spmd that are inside the batch code. Because the pool requires N workers in addition to the worker running the batch, the cluster must have at least N+1 workers available. You do not need a parallel pool already running to execute batch, and the new pool that batch creates is not related to a pool you might already have open. For more information, see Run a Batch Job with a Parallel Pool.
If you use the default value, 0, the script or function runs on only a single worker and not on a parallel pool.
batch supports pools with up to 2000 workers. (since R2024a)
Example: j = batch(@myFunction,1,{x,y},'Pool',4);
Example: j = batch(@myFunction,1,{x,y},'Pool',[2,6]);
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
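To make the N+1 worker requirement concrete, the sketch below runs a small parfor reduction on a pool of two workers, so the cluster needs three workers in total. The function name sumSquaresDemo is hypothetical and its file is written to the current folder only so that the snippet is self-contained. A default cluster profile with at least three available workers is assumed.

```matlab
% Hypothetical function file, created here so the sketch is self-contained.
fid = fopen('sumSquaresDemo.m','w');
fprintf(fid,['function s = sumSquaresDemo(n)\n' ...
             's = 0;\n' ...
             'parfor i = 1:n\n' ...
             '    s = s + i^2;\n' ...
             'end\n' ...
             'end\n']);
fclose(fid);

% Pool of 2 workers plus 1 worker that runs sumSquaresDemo itself.
j = batch(@sumSquaresDemo,1,{10},'Pool',2);
wait(j);
out = fetchOutputs(j);
total = out{1};                   % sum of squares from 1 to 10
delete(j);
```

Passing a range such as 'Pool',[1,2] instead would let the job start with a smaller pool when fewer workers are free.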
SpmdEnabled — Flag to specify if spmd is supported for pool batch job
true (default) | false
Since R2024a
Flag to specify if spmd support is enabled on the parallel pool of a batch job, specified as a logical value. You can disable support only on a local or MATLAB Job Scheduler cluster.
parfor iterations do not involve communication between workers. Therefore, if 'SpmdEnabled' is false, a parfor-loop continues even if one or more workers abort during loop execution.
Data Types: logical
Output Arguments
j — Job
parallel.Job object
Job, returned as a parallel.Job object.
Example: j = batch('aScript');
Data Types: parallel.Job
Tips
- To view the status or track the progress of a batch job, use the Job Monitor, as described in Job Monitor. You can also use the Job Monitor to retrieve a job object for a batch job that was created in a different session, or for a batch job that was created without returning a job object from the batch call.
- Delete any batch jobs you no longer need to avoid consuming cluster storage resources unnecessarily.
- To develop and test your code, you can run batch jobs on a local cluster on your client machine. If you close your MATLAB session, any batch jobs using the local cluster also stop immediately.
- When you offload work to a remote cluster, you can close the MATLAB client session while the job is processing and retrieve information from a batch job later or in a new client session.
Version History
Introduced in R2008a
R2024a: Support for batch job pools with up to 2000 workers
Starting in R2024a, batch supports pools with up to 2000 workers. Before R2024a, batch supported pools with up to 1000 workers.
R2024a: Disable spmd communication between workers for batch job pools
When you use the 'Pool' name-value argument to create a parallel pool, the software creates a pool with spmd communication enabled by default. To use pools without spmd communication enabled, use the 'SpmdEnabled' name-value argument to disable spmd support.
R2021a: batch now evaluates cell array input arguments {C1,...,Cn} as C1,...,Cn
Starting in R2021a, a function fcn offloaded with batch evaluates cell array input arguments {C1,...,Cn} as fcn(C1,...,Cn). In previous releases, {C1,...,Cn} threw an error and {{C1,...,Cn}} was evaluated as fcn(C1,...,Cn).
Starting in R2021a, use the following code to offload fcn({a,b},{c,d}) on the cluster myCluster with one output.
batch(myCluster,@fcn,1,{{a,b},{c,d}});
In previous releases, you used the following code instead.
batch(myCluster,@fcn,1,{{{a,b},{c,d}}});