MinHashLSHModel (Spark 3.5.5 JavaDoc)


public class MinHashLSHModel
extends Model
Model produced by MinHashLSH, where multiple hash functions are stored. Each hash function is picked from the following family of hash functions, where a_i and b_i are randomly chosen integers less than prime:
h_i(x) = ((x \cdot a_i + b_i) \mod prime)
This hash family is approximately min-wise independent according to the reference.
Reference: Tom Bohman, Colin Cooper, and Alan Frieze. "Min-wise independent linear permutations." Electronic Journal of Combinatorics 7 (2000): R26.
param: randCoefficients Pairs of random coefficients. Each pair is used by one hash function.
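The hash family and the role of the coefficient pairs can be illustrated with a minimal standalone sketch. It is not Spark's internal code: the class name MinHashSketch, its helper methods, and the choice of 2^31 - 1 as the prime are all assumptions made for illustration. It also assumes set elements are non-negative integers and that each hash function's output for a set is the minimum hash value over the set's elements, as in standard MinHash.

```java
import java.util.Random;

// Hypothetical illustration class; not part of the Spark API.
final class MinHashSketch {
    // Assumption: 2^31 - 1 (a Mersenne prime) stands in for "prime".
    static final long PRIME = 2147483647L;

    // One member of the family: h(x) = (x * a + b) mod prime.
    // With 0 <= x, a, b < 2^31 the intermediate value fits in a long.
    static long hash(long x, long a, long b) {
        return (x * a + b) % PRIME;
    }

    // MinHash of a set under one hash function: the smallest hash value
    // over all elements of the set.
    static long minHash(int[] elements, long a, long b) {
        long min = Long.MAX_VALUE;
        for (int x : elements) {
            min = Math.min(min, hash(x, a, b));
        }
        return min;
    }

    // Draw one (a, b) pair per hash function, each less than PRIME.
    static long[][] randomCoefficients(int numHashes, long seed) {
        Random rng = new Random(seed);
        int bound = (int) (PRIME - 1);
        long[][] coeffs = new long[numHashes][2];
        for (int i = 0; i < numHashes; i++) {
            coeffs[i][0] = 1L + rng.nextInt(bound); // a in [1, PRIME - 1]
            coeffs[i][1] = rng.nextInt(bound);      // b in [0, PRIME - 2]
        }
        return coeffs;
    }

    // Signature of a set: one MinHash value per coefficient pair.
    static long[] signature(int[] elements, long[][] coeffs) {
        long[] sig = new long[coeffs.length];
        for (int i = 0; i < coeffs.length; i++) {
            sig[i] = minHash(elements, coeffs[i][0], coeffs[i][1]);
        }
        return sig;
    }
}
```

Calling MinHashSketch.signature(elements, MinHashSketch.randomCoefficients(n, seed)) on two sets with the same coefficient pairs gives signatures whose fraction of matching positions estimates the Jaccard similarity of the sets, which is the property locality-sensitive hashing with MinHash relies on.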
See Also:
Serialized Form