SciPy Spatial Distance Matrix

A distance matrix contains the pairwise distances between the row vectors of one or two matrices. The scipy.spatial package provides the distance_matrix() function to compute it. The inputs are usually 2-D arrays, and each row (a 1-D array) is treated as one vector.

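**Example:**

```python
from scipy.spatial import distance_matrix
import numpy as np

# Each row is one point in 2-D space
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Pairwise distances between the rows of A and the rows of B
res = distance_matrix(A, B)
print(res)
```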

**Output**

[[5.65685425 8.48528137]
 [2.82842712 5.65685425]]

**Explanation:** This computes the Euclidean distance (the default, p = 2) between each pair of points in A and B. For instance, the first element 5.65685425 is the distance between [1, 2] and [5, 6]: √((1−5)² + (2−6)²) = √32.
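The pairwise computation described above can be reproduced by hand with NumPy broadcasting. This is only a minimal sketch for checking the result, not a description of how SciPy implements the function; the names diff and manual are illustrative.

```python
import numpy as np
from scipy.spatial import distance_matrix

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Broadcast A (shape 2x1x2) against B (shape 1x2x2) so that
# diff[i, j] holds the vector A[i] - B[j].
diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]

# The Euclidean norm over the last axis gives the 2x2 distance matrix.
manual = np.sqrt((diff ** 2).sum(axis=-1))

print(manual)
print(np.allclose(manual, distance_matrix(A, B)))
```

The second print statement compares the hand-computed matrix with the one returned by distance_matrix().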

Syntax of scipy.spatial.distance_matrix()

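scipy.spatial.distance_matrix(x, y, p=2, threshold=1000000)

**Parameters:**

- x (array_like, M × K): matrix of M vectors in K dimensions.
- y (array_like, N × K): matrix of N vectors in K dimensions.
- p (float, 1 ≤ p ≤ ∞): which Minkowski p-norm to use. Default is 2.
- threshold (positive int): if M × N × K exceeds this value, the function uses a Python loop instead of large temporary arrays. Default is 1000000.

**Returns:** result (ndarray): An M × N matrix where each element [i, j] is the distance between x[i] and y[j].

**Raises:** ValueError: if x and y do not have the same number of columns.

Different values of p give different distance metrics:

- p = 1: Manhattan (cityblock) distance
- p = 2: Euclidean distance (default)
- p = ∞ (np.inf): Chebyshev distance

**Note:** The number of columns (dimensions) in both input matrices must be the same.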
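The requirement that both inputs have the same number of columns can be checked with a small experiment. The sketch below assumes that SciPy reports the mismatch as a ValueError, which is the behavior of current releases; the flag mismatch_detected is an illustrative name.

```python
import numpy as np
from scipy.spatial import distance_matrix

A = np.array([[1, 2], [3, 4]])   # 2 points in 2 dimensions
B = np.array([[5, 6, 7]])        # 1 point in 3 dimensions

try:
    distance_matrix(A, B)
    mismatch_detected = False
except ValueError as exc:
    # Raised because the column counts (2 and 3) differ
    mismatch_detected = True
    print("ValueError:", exc)
```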

Examples

**Example 1: Manhattan Distance (p = 1)**

```python
from scipy.spatial import distance_matrix
import numpy as np

A = np.array([[1, 2]])
B = np.array([[3, 4], [5, 6]])

# p=1 selects the Manhattan (sum of absolute differences) distance
res = distance_matrix(A, B, p=1)
print(res)
```

**Output**

[[4. 8.]]

**Explanation:** Here, the Manhattan distance is used. The first value 4 is computed as |1−3| + |2−4| = 2 + 2 = 4.

**Example 2: Chebyshev Distance (p = ∞)**

```python
from scipy.spatial import distance_matrix
import numpy as np

A = np.array([[1, 2]])
B = np.array([[4, 6], [7, 3]])

# p=np.inf selects the Chebyshev (maximum absolute difference) distance
res = distance_matrix(A, B, p=np.inf)
print(res)
```

**Output**

[[4. 6.]]

**Explanation:** The Chebyshev distance takes the maximum absolute difference across all dimensions. For [1, 2] and [4, 6], it is max(|1−4|, |2−6|) = max(3, 4) = 4.

**Example 3: Custom Distance (p = 3)**

```python
from scipy.spatial import distance_matrix
import numpy as np

A = np.array([[1, 2]])
B = np.array([[4, 6]])

# Any p >= 1 is allowed; p=3 gives the Minkowski distance of order 3
res = distance_matrix(A, B, p=3)
print(res)
```

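**Output**

[[4.49794145]]

**Explanation:** This uses the Minkowski distance with p = 3: (|1−4|³ + |2−6|³)^(1/3) = 91^(1/3) ≈ 4.498. Because the Minkowski distance does not increase as p grows, the result lies between the Euclidean distance (p = 2, here 5) and the Chebyshev distance (p = ∞, here 4).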
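To see how the choice of p affects the result, the same pair of points can be evaluated under several values of p. This is a small illustrative sketch; the names p_values and distances are not part of the SciPy API.

```python
import numpy as np
from scipy.spatial import distance_matrix

A = np.array([[1, 2]])
B = np.array([[4, 6]])

# Evaluate the same pair of points under increasing values of p.
p_values = [1, 2, 3, np.inf]
distances = [distance_matrix(A, B, p=p)[0, 0] for p in p_values]

for p, d in zip(p_values, distances):
    print(f"p = {p}: {d:.4f}")
# p = 1: 7.0000
# p = 2: 5.0000
# p = 3: 4.4979
# p = inf: 4.0000
```

The distance shrinks as p grows, from the Manhattan value (7) towards the Chebyshev value (4).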

Using scipy.spatial.distance.cdist()

While distance_matrix() computes pairwise distances using the Minkowski metric only, cdist() from scipy.spatial.distance is more flexible: it supports a wide range of distance metrics (e.g., cosine, correlation, cityblock).

Syntax

scipy.spatial.distance.cdist(XA, XB, metric='euclidean')

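**Parameters:**

- XA (array_like, mA × n): mA observations in n dimensions.
- XB (array_like, mB × n): mB observations in n dimensions.
- metric (str or callable): the distance metric to use, such as 'euclidean', 'cityblock' or 'cosine'. Default is 'euclidean'.

**Returns:** distances (ndarray): An mA × mB matrix where element [i, j] is the distance between XA[i] and XB[j].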
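Because cdist() also offers a 'minkowski' metric, the two functions can be cross-checked against each other. The following is a minimal sketch, assuming the p keyword is passed through to the metric as in current SciPy releases; the variable names are illustrative.

```python
import numpy as np
from scipy.spatial import distance_matrix
from scipy.spatial.distance import cdist

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# With metric='minkowski', cdist takes the same exponent p as distance_matrix.
via_cdist = cdist(A, B, metric='minkowski', p=3)
via_distance_matrix = distance_matrix(A, B, p=3)

print(np.allclose(via_cdist, via_distance_matrix))
```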

Examples

**Example 1: Cosine Distance**

```python
from scipy.spatial.distance import cdist
import numpy as np

A = np.array([[1, 0], [0, 1]])
B = np.array([[1, 1]])

# Cosine distance between every row of A and every row of B
res = cdist(A, B, metric='cosine')
print(res)
```

**Output**

[[0.29289322]
 [0.29289322]]

**Explanation:** Cosine distance is one minus the cosine of the angle between two vectors, so it depends only on their directions, not their lengths. Both vectors in A form a 45° angle with the vector in B, so both distances equal 1 − cos 45° ≈ 0.2929.
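The cosine distance can also be worked out by hand as one minus the cosine similarity. This is a sketch for verification only, with illustrative names (similarity, manual), not a statement of how SciPy computes it internally.

```python
import numpy as np
from scipy.spatial.distance import cdist

A = np.array([[1, 0], [0, 1]], dtype=float)
B = np.array([[1, 1]], dtype=float)

# Cosine similarity: dot products divided by the product of the vector norms.
norms_A = np.linalg.norm(A, axis=1)[:, np.newaxis]   # shape (2, 1)
norms_B = np.linalg.norm(B, axis=1)[np.newaxis, :]   # shape (1, 1)
similarity = (A @ B.T) / (norms_A * norms_B)

# Cosine distance is one minus the cosine similarity.
manual = 1.0 - similarity

print(manual)
print(np.allclose(manual, cdist(A, B, metric='cosine')))
```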

**Example 2: Cityblock (Manhattan) Distance**

```python
from scipy.spatial.distance import cdist
import numpy as np

A = np.array([[1, 2]])
B = np.array([[4, 6], [5, 1]])

# 'cityblock' is cdist's name for the Manhattan distance
res = cdist(A, B, metric='cityblock')
print(res)
```

**Output**

[[7. 5.]]

**Explanation:** This uses the cityblock (Manhattan) metric. For [1, 2] and [4, 6], the distance is |1−4| + |2−6| = 3 + 4 = 7. For [1, 2] and [5, 1], it is |1−5| + |2−1| = 4 + 1 = 5.