numpy.random.multivariate_normal — NumPy v1.16 Manual (original) (raw)

numpy.random. multivariate_normal(mean, _cov_[, size, check_valid, _tol_])¶

Draw random samples from a multivariate normal distribution.

The multivariate normal, multinormal or Gaussian distribution is a generalization of the one-dimensional normal distribution to higher dimensions. Such a distribution is specified by its mean and covariance matrix. These parameters are analogous to the mean (average or “center”) and variance (standard deviation, or “width,” squared) of the one-dimensional normal distribution.

Parameters:	mean : 1-D array_like, of length N Mean of the N-dimensional distribution. cov : 2-D array_like, of shape (N, N) Covariance matrix of the distribution. It must be symmetric and positive-semidefinite for proper sampling. size : int or tuple of ints, optional Given a shape of, for example, (m,n,k), mnk samples are generated, and packed in an _m_-by-_n_-by-k arrangement. Because each sample is _N_-dimensional, the output shape is (m,n,k,N). If no shape is specified, a single (_N_-D) sample is returned. check_valid : { ‘warn’, ‘raise’, ‘ignore’ }, optional Behavior when the covariance matrix is not positive semidefinite. tol : float, optional Tolerance when checking the singular values in covariance matrix.
Returns:	out : ndarray The drawn samples, of shape size, if that was provided. If not, the shape is (N,). In other words, each entry out[i,j,...,:] is an N-dimensional value drawn from the distribution.

Parameters:

mean : 1-D array_like, of length N Mean of the N-dimensional distribution. cov : 2-D array_like, of shape (N, N) Covariance matrix of the distribution. It must be symmetric and positive-semidefinite for proper sampling. size : int or tuple of ints, optional Given a shape of, for example, (m,n,k), m*n*k samples are generated, and packed in an _m_-by-_n_-by-k arrangement. Because each sample is _N_-dimensional, the output shape is (m,n,k,N). If no shape is specified, a single (_N_-D) sample is returned. check_valid : { ‘warn’, ‘raise’, ‘ignore’ }, optional Behavior when the covariance matrix is not positive semidefinite. tol : float, optional Tolerance when checking the singular values in covariance matrix.

Returns:

out : ndarray The drawn samples, of shape size, if that was provided. If not, the shape is (N,). In other words, each entry out[i,j,...,:] is an N-dimensional value drawn from the distribution.

Notes

The mean is a coordinate in N-dimensional space, which represents the location where samples are most likely to be generated. This is analogous to the peak of the bell curve for the one-dimensional or univariate normal distribution.

Covariance indicates the level to which two variables vary together. From the multivariate normal distribution, we draw N-dimensional samples, $X = [x_1, x_2, ... x_N]$ . The covariance matrix element $C_{ij}$ is the covariance of $x_i$ and $x_j$ . The element $C_{ii}$ is the variance of $x_i$ (i.e. its “spread”).

Instead of specifying the full covariance matrix, popular approximations include:

Spherical covariance (cov is a multiple of the identity matrix)

Diagonal covariance (cov has non-negative elements, and only on the diagonal)

This geometrical property can be seen in two dimensions by plotting generated data-points:

mean = [0, 0] cov = [[1, 0], [0, 100]] # diagonal covariance

Diagonal covariance means that points are oriented along x or y-axis:

import matplotlib.pyplot as plt x, y = np.random.multivariate_normal(mean, cov, 5000).T plt.plot(x, y, 'x') plt.axis('equal') plt.show()

Note that the covariance matrix must be positive semidefinite (a.k.a. nonnegative-definite). Otherwise, the behavior of this method is undefined and backwards compatibility is not guaranteed.

References

[1]	Papoulis, A., “Probability, Random Variables, and Stochastic Processes,” 3rd ed., New York: McGraw-Hill, 1991.

[2]	Duda, R. O., Hart, P. E., and Stork, D. G., “Pattern Classification,” 2nd ed., New York: Wiley, 2001.

Examples

mean = (1, 2) cov = [[1, 0], [0, 1]] x = np.random.multivariate_normal(mean, cov, (3, 3)) x.shape (3, 3, 2)

The following is probably true, given that 0.6 is roughly twice the standard deviation:

list((x[0,0,:] - mean) < 0.6) [True, True]