cosine_similarity

sklearn.metrics.pairwise.cosine_similarity(X, Y=None, dense_output=True)

Compute cosine similarity between samples in X and Y.

Cosine similarity, or the cosine kernel, computes similarity as the normalized dot product of X and Y:

K(X, Y) = <X, Y> / (||X||*||Y||)
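
As a rough illustration of the formula above, the normalized dot product can be written directly in NumPy and compared against the library function. The helper name manual_cosine is hypothetical and not part of scikit-learn; this sketch assumes dense input with no all-zero rows, since it would divide by zero for them.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def manual_cosine(X, Y):
    """Hypothetical helper: K(X, Y) = <X, Y> / (||X|| * ||Y||), row by row.

    Assumes every row has a nonzero L2 norm.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_norms = np.linalg.norm(X, axis=1, keepdims=True)  # shape (n_samples_X, 1)
    y_norms = np.linalg.norm(Y, axis=1, keepdims=True)  # shape (n_samples_Y, 1)
    # Pairwise dot products divided by the outer product of the row norms.
    return (X @ Y.T) / (x_norms * y_norms.T)


X = [[1, 1, 1], [2, 0, 1]]
Y = [[1, 0, 0], [1, 1, 0]]

manual = manual_cosine(X, Y)
reference = cosine_similarity(X, Y)
print(np.allclose(manual, reference))
```

The library function is the safer choice in practice: as the example at the end of this page shows, it returns 0 for an all-zero row, whereas this sketch does not handle that case.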

On L2-normalized data, this function is equivalent to linear_kernel.
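
The equivalence with linear_kernel can be checked numerically. The sketch below assumes sklearn.preprocessing.normalize is used to scale each row to unit L2 norm; the random data is purely illustrative.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(5, 3))

# Scale every row to unit L2 norm.
X_unit = normalize(X, norm="l2")
Y_unit = normalize(Y, norm="l2")

# On unit-norm rows the plain dot product already equals the cosine.
same = np.allclose(cosine_similarity(X, Y), linear_kernel(X_unit, Y_unit))
print(same)
```

If the data is already normalized, calling linear_kernel directly avoids repeating the normalization step.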

Read more in the User Guide.

Parameters:

X : {array-like, sparse matrix} of shape (n_samples_X, n_features)

Input data.

Y : {array-like, sparse matrix} of shape (n_samples_Y, n_features), default=None

Input data. If None, the output will be the pairwise similarities between all samples in X.

dense_output : bool, default=True

Whether to return dense output even when the input is sparse. If False, the output is sparse if both input arrays are sparse.

Added in version 0.17: parameter dense_output for dense output.
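
A small sketch of how dense_output affects the return type, assuming SciPy CSR matrices as the sparse input. The toy matrix is illustrative only.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

X_sparse = sparse.csr_matrix(
    np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 0.0]])
)

# Default: a dense ndarray is returned even though the input is sparse.
dense_result = cosine_similarity(X_sparse)

# With dense_output=False and sparse inputs, the result stays sparse.
sparse_result = cosine_similarity(X_sparse, dense_output=False)

print(type(dense_result).__name__, sparse.issparse(sparse_result))
```

Keeping the result sparse may save memory when most pairs of samples share no nonzero features, because those similarities are exactly zero.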

Returns:

similarities : ndarray or sparse matrix of shape (n_samples_X, n_samples_Y)

Returns the cosine similarity between samples in X and Y.
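
To make the output shape concrete, the sketch below compares a call with both X and Y against a call with Y left as None. The arrays are illustrative and contain no all-zero rows.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

X = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 5.0]])  # 3 samples
Y = np.array([[1.0, 0.0], [1.0, 1.0]])              # 2 samples

cross = cosine_similarity(X, Y)  # rows index X, columns index Y
self_sim = cosine_similarity(X)  # Y=None: all pairs within X

print(cross.shape)     # (3, 2)
print(self_sim.shape)  # (3, 3)
```

With Y=None the result is symmetric, and for rows with nonzero norm the diagonal is 1 up to floating-point error, since each sample is compared with itself.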

Examples

>>> from sklearn.metrics.pairwise import cosine_similarity
>>> X = [[0, 0, 0], [1, 1, 1]]
>>> Y = [[1, 0, 0], [1, 1, 0]]
>>> cosine_similarity(X, Y)
array([[0.   , 0.   ],
       [0.577, 0.816]])