BucketedRandomProjectionLSH (Spark 3.5.5 JavaDoc)


public class BucketedRandomProjectionLSH
extends Estimator
implements BucketedRandomProjectionLSHParams, HasSeed
BucketedRandomProjectionLSH implements Locality Sensitive Hashing (LSH) functions for the Euclidean distance metric.
The input is a set of dense or sparse vectors, each of which represents a point in Euclidean space. The output is a set of vectors of configurable dimension. Hash values in the same dimension are calculated by the same hash function.
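As a rough illustration of the description above, the following standalone Java sketch shows one common form of a random-projection hash for Euclidean distance: each output dimension projects the input onto its own random unit vector, divides by a bucket length, and takes the floor. The class name RandomProjectionHashSketch and its methods are hypothetical and are not part of Spark. The fields mirror the estimator's bucketLength and numHashTables parameters, but the sampling and normalization details here are assumptions and may differ from the library's implementation.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of a bucketed random-projection hash; not Spark's code. */
public class RandomProjectionHashSketch {
    // One random unit vector per output dimension (one per hash table).
    private final double[][] unitVectors;
    // Width of each bucket along a projection direction.
    private final double bucketLength;

    public RandomProjectionHashSketch(int inputDim, int numHashTables,
                                      double bucketLength, long seed) {
        if (inputDim <= 0 || numHashTables <= 0 || bucketLength <= 0.0) {
            throw new IllegalArgumentException("all sizes must be positive");
        }
        this.bucketLength = bucketLength;
        this.unitVectors = new double[numHashTables][inputDim];
        Random rng = new Random(seed);
        for (double[] v : unitVectors) {
            // Assumption: draw Gaussian components, then scale to unit length.
            double sumSq = 0.0;
            for (int j = 0; j < inputDim; j++) {
                v[j] = rng.nextGaussian();
                sumSq += v[j] * v[j];
            }
            double norm = Math.sqrt(sumSq);
            for (int j = 0; j < inputDim; j++) {
                v[j] /= norm;
            }
        }
    }

    /** Returns one hash value per output dimension: floor(x . v_i / bucketLength). */
    public double[] hash(double[] x) {
        double[] out = new double[unitVectors.length];
        for (int i = 0; i < unitVectors.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < x.length; j++) {
                dot += x[j] * unitVectors[i][j];
            }
            out[i] = Math.floor(dot / bucketLength);
        }
        return out;
    }

    public static void main(String[] args) {
        RandomProjectionHashSketch lsh =
            new RandomProjectionHashSketch(2, 3, 2.0, 42L);
        // Nearby points tend to share buckets; distant points tend not to.
        System.out.println(Arrays.toString(lsh.hash(new double[] {1.0, 1.0})));
        System.out.println(Arrays.toString(lsh.hash(new double[] {1.0, 1.1})));
        System.out.println(Arrays.toString(lsh.hash(new double[] {50.0, -40.0})));
    }
}
```

In this sketch a larger bucket length makes each bucket wider, so more points collide in each dimension, while more hash tables give more independent chances for two nearby points to share a bucket.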
References:
1. Wikipedia on Stable Distributions
2. Wang, Jingdong et al. "Hashing for similarity search: A survey." arXiv preprint arXiv:1408.2927 (2014).
See Also:
Serialized Form