Create and Use Distributed Arrays - MATLAB & Simulink (original) (raw)

Main Content

If your data is currently in the memory of your local machine, you can use thedistributed function to distribute an existing array from the client workspace to the workers of a parallel pool. Distributed arrays use the combined memory of multiple workers in a parallel pool to store the elements of an array. For alternative ways of partitioning data, see Distributing Arrays to Parallel Workers. You operate on the entire array as a single entity, however, workers operate only on their part of the array, and automatically transfer data between themselves when necessary. You can use distributed arrays to scale up your big data computation. Consider distributed arrays when you have access to a cluster, as you can combine the memory of multiple machines in your cluster.

A distributed array is a single variable, split over multiple workers in your parallel pool. You can work with this variable as one single entity, without having to worry about its distributed nature. To explore the functionalities available for distributed arrays in the Parallel Computing Toolbox™, see Run MATLAB Functions with Distributed Arrays.

When you create a distributed array, you cannot control the details of the distribution. On the other hand, codistributed arrays allow you to control all aspects of distribution, including dimensions and partitions. In the following, you learn how to create both distributed andcodistributed arrays.

Creating Distributed Arrays

You can create a distributed array in different ways:

In this example, you create an array in the client workspace, then turn it into a distributed array.

Create a new pool if you do not have an existing one open.

Create a magic 4-by-4 matrix on the client and distribute the matrix to the workers. View the results on the client and display information about the variables.

A = magic(4); B = distributed(A); B whos

Name Size Bytes Class Attributes

A 4x4 128 double
B 4x4 128 distributed

You have created B as a distributed array, split over the workers in your parallel pool. This is shown in the figure below. The distributed array is ready for further computations.

Array split by column over four workers.

Close the pool after you have finished using the distributed array.

Creating Codistributed Arrays

Unlike distributed arrays, codistributed arrays allow you to control all aspects of distribution, including dimensions and partitions. You can create a codistributed array in different ways:

In this example, you create a codistributed array inside anspmd statement, using a nondefault distribution scheme.

Create a parallel pool with two workers.

In an spmd statement, define a 1-D distribution along the third dimension, with 4 parts on worker 1, and 12 parts on worker 2. Then create a 3-by-3-by-16 array of zeros. View the codistributed array on the client and display information about the variables.

spmd codist = codistributor1d(3,[4,12]); Z = zeros(3,3,16,codist); Z = Z + spmdIndex; end Z whos

Name Size Bytes Class Attributes

Z 3x3x16 1152 distributed
codist 1x2 265 Composite

Close the pool after you have finished using the codistributed array.

For more details on codistributed arrays, see Working with Codistributed Arrays.

See Also

Topics