Locality-sensitive hashing (original) (raw)

Locality sensitive hashing (LSH) est une méthode de recherche approximative dans des espaces de grande dimension. C'est une solution au problème de la malédiction de la dimension qui apparait lors d'une recherche des plus proches voisins en grande dimension. L'idée principale est d'utiliser une famille de fonction de hachage choisies telles que des points proches dans l'espace d'origine aient une forte probabilité d'avoir la même valeur de hachage. La méthode a de nombreuses applications en vision artificielle, traitement automatique de la langue, bio-informatique…[citation nécessaire]

Property	Value
dbo:abstract	In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. (The number of buckets is much smaller than the universe of possible input items.) Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items. Hashing-based approximate nearest neighbor search algorithms generally use one of two main categories of hashing methods: either data-independent methods, such as locality-sensitive hashing (LSH); or data-dependent methods, such as locality-preserving hashing (LPH). (en) Locality sensitive hashing (LSH) est une méthode de recherche approximative dans des espaces de grande dimension. C'est une solution au problème de la malédiction de la dimension qui apparait lors d'une recherche des plus proches voisins en grande dimension. L'idée principale est d'utiliser une famille de fonction de hachage choisies telles que des points proches dans l'espace d'origine aient une forte probabilité d'avoir la même valeur de hachage. La méthode a de nombreuses applications en vision artificielle, traitement automatique de la langue, bio-informatique…[citation nécessaire] (fr) 局所性鋭敏型ハッシュ（きょくしょせいえいびんがたハッシュ、英語: locality sensitive hashing）とは高次元のデータを確率的な処理によって次元圧縮するための手法である。ハッシュの基本的な考え方は類似したデータが高確率で同じバケットに入るようにデータを整理するというものである。多くの場合においてこのバケットの数は入力されるデータサンプルの数よりもずっと小さくなる。 (ja) Il locality-sensitive hashing (LSH) è un metodo per la riduzione della dimensionalità dello spazio vettoriale di un insieme di dati. (it) Locality-sensitive hashing (LSH) — вероятностный метод понижения размерности многомерных данных. Основная идея состоит в таком подборе хеш-функций для некоторых измерений, чтобы похожие объекты с высокой степенью вероятности попадали в одну корзину. Один из способов борьбы с «проклятием размерности» при поиске и анализе многомерных данных, которое заключается в том, что при росте размерности исходных данных поиск по индексу ведёт себя хуже, чем последовательный просмотр. Метод позволяет строить структуру для быстрого приближённого (вероятностного) поиска n-мерных векторов, «похожих» на искомый шаблон. LSH является одним из наиболее популярных на сегодняшний день приближённых алгоритмов поиска ближайших соседей (Approximate Nearest Neighbor, ANN). LSH в этом подходе отображает множество точек в высокоразмерном пространстве в множество ячеек, т. е. в хеш-таблицу. В отличие от традиционных хешей, LSH обладает свойством чувствительности к местоположению (locality-sensitive hash), благодаря чему способен помещать соседние точки в одну и ту же ячейку. Преимуществами LSH являются: 1) простота использования; 2) строгая теория, подтверждающая хорошую производительность алгоритма; 3) LSH совместим с любой нормой при . LSH можно использовать с евклидовой метрикой и с манхэттенским расстоянием. Существуют также варианты для расстояния Хэмминга и косинусного коэффициента. (ru)
dbo:thumbnail	wiki-commons:Special:FilePath/One_minus_theta_over_pi_vs_cos_of_theta.gif?width=300
dbo:wikiPageExternalLink	https://web.archive.org/web/20101203074412/http:/www.vision.caltech.edu/malaa/software/research/image-search/ http://lshkit.sourceforge.net/ http://web.mit.edu/andoni/www/LSH/index.html http://www.unclaw.com/chin/scholarship/hashfunctions.pdf%7C https://github.com/DBWangGroupUNSW/SRS https://github.com/RSIA-LIESMARS-WHU/LSHBOX https://github.com/idealista/tlsh https://github.com/idealista/tlsh-js https://github.com/salviati/slash https://github.com/simonemainardi/LSHash https://github.com/trendmicro/tlsh
dbo:wikiPageID	11634012 (xsd:integer)
dbo:wikiPageLength	29727 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1121220653 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Probability_distribution dbr:Nearest_neighbor_search dbr:Cosine_distance dbr:Bloom_filter dbr:Algorithmica dbc:Probabilistic_data_structures dbr:Resource_contention dbr:Curse_of_dimensionality dbr:Sparse_distributed_memory dbc:Classification_algorithms dbc:Search_algorithms dbr:Cryptography dbr:Gene_expression dbr:Genome-wide_association_study dbr:Geohash dbr:Random_indexing dbr:VisualRank dbr:Rolling_hash dbr:String_metric dbr:Anti-spam_techniques dbr:Singular_value_decomposition dbr:Stable_distribution dbr:Cluster_analysis dbr:Computer_science dbr:Feature_hashing dbr:Pipeline_(computing) dbr:Wavelet_compression dbr:Hash_collision dbr:Hash_function dbr:Locality-preserving_hashing dbr:Hash_functions dbr:File:One_minus_theta_over_pi_vs_cos_of_theta.gif dbr:Checksum dbr:Digital_video_fingerprinting dbr:Discrete_uniform_distribution dbr:Graphical_model dbr:Network_congestion dbr:Principal_component_analysis dbr:Dimension_reduction dbc:Hashing dbr:Hamming_distance dbr:Hyperplane dbr:Artificial_neural_network dbr:Hierarchical_clustering dbr:Jaccard_index dbr:Avalanche_effect dbc:Dimension_reduction dbr:Metric_space dbr:Open-source_software dbr:Space-filling_curve dbr:Symposium_on_Theory_of_Computing dbr:Semantic_similarity dbr:Symmetric_group dbr:Random_permutation dbr:Universal_hashing dbr:Moses_Charikar dbr:Multilinear_subspace_learning dbr:SimHash dbr:Parallel_RAM dbr:Fourier-related_transforms dbr:Audio_fingerprint dbr:Jaccard_similarity
dbp:wikiPageUsesTemplate	dbt:Citation_needed dbt:Cite_conference dbt:Cite_journal dbt:Clarify dbt:Cn dbt:How dbt:ISBN dbt:Main dbt:Mvar dbt:Reflist
dct:subject	dbc:Probabilistic_data_structures dbc:Classification_algorithms dbc:Search_algorithms dbc:Hashing dbc:Dimension_reduction
rdf:type	yago:WikicatClassificationAlgorithms yago:WikicatSearchAlgorithms yago:Abstraction100002137 yago:Act100030358 yago:Activity100407535 yago:Algorithm105847438 yago:Arrangement105726596 yago:Cognition100023271 yago:DataStructure105728493 yago:Event100029378 yago:Procedure101023820 yago:PsychologicalFeature100023100 yago:YagoPermanentlyLocatedEntity yago:Rule105846932 yago:Structure105726345 yago:WikicatAlgorithms yago:WikicatDataStructures yago:WikicatProbabilisticDataStructures
rdfs:comment	Locality sensitive hashing (LSH) est une méthode de recherche approximative dans des espaces de grande dimension. C'est une solution au problème de la malédiction de la dimension qui apparait lors d'une recherche des plus proches voisins en grande dimension. L'idée principale est d'utiliser une famille de fonction de hachage choisies telles que des points proches dans l'espace d'origine aient une forte probabilité d'avoir la même valeur de hachage. La méthode a de nombreuses applications en vision artificielle, traitement automatique de la langue, bio-informatique…[citation nécessaire] (fr) 局所性鋭敏型ハッシュ（きょくしょせいえいびんがたハッシュ、英語: locality sensitive hashing）とは高次元のデータを確率的な処理によって次元圧縮するための手法である。ハッシュの基本的な考え方は類似したデータが高確率で同じバケットに入るようにデータを整理するというものである。多くの場合においてこのバケットの数は入力されるデータサンプルの数よりもずっと小さくなる。 (ja) Il locality-sensitive hashing (LSH) è un metodo per la riduzione della dimensionalità dello spazio vettoriale di un insieme di dati. (it) In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. (The number of buckets is much smaller than the universe of possible input items.) Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items. (en) Locality-sensitive hashing (LSH) — вероятностный метод понижения размерности многомерных данных. Основная идея состоит в таком подборе хеш-функций для некоторых измерений, чтобы похожие объекты с высокой степенью вероятности попадали в одну корзину. Один из способов борьбы с «проклятием размерности» при поиске и анализе многомерных данных, которое заключается в том, что при росте размерности исходных данных поиск по индексу ведёт себя хуже, чем последовательный просмотр. Метод позволяет строить структуру для быстрого приближённого (вероятностного) поиска n-мерных векторов, «похожих» на искомый шаблон. (ru)
rdfs:label	Locality sensitive hashing (fr) Locality-sensitive hashing (it) 局所性鋭敏型ハッシュ (ja) Locality-sensitive hashing (en) Locality-sensitive hashing (ru)
owl:sameAs	freebase:Locality-sensitive hashing yago-res:Locality-sensitive hashing wikidata:Locality-sensitive hashing dbpedia-fr:Locality-sensitive hashing dbpedia-it:Locality-sensitive hashing dbpedia-ja:Locality-sensitive hashing dbpedia-ru:Locality-sensitive hashing https://global.dbpedia.org/id/cGRA
prov:wasDerivedFrom	wikipedia-en:Locality-sensitive_hashing?oldid=1121220653&ns=0
foaf:depiction	wiki-commons:Special:FilePath/One_minus_theta_over_pi_vs_cos_of_theta.gif
foaf:isPrimaryTopicOf	wikipedia-en:Locality-sensitive_hashing
is dbo:wikiPageDisambiguates of	dbr:LSH
is dbo:wikiPageRedirects of	dbr:Fuzzy_hashing dbr:Applications_of_locality-sensitive_hashing dbr:Locality-preserving_hashing dbr:Locality_Sensitive_Hashing dbr:Locality_preserving_hashing dbr:Locality-Sensitive_Hashing dbr:Locality-sensitive_hash dbr:Locality_sensitive_hashing
is dbo:wikiPageWikiLink of	dbr:List_of_algorithms dbr:Nearest_neighbor_search dbr:Andrei_Broder dbr:Approximate_string_matching dbr:Jubatus dbr:Duplicate_code dbr:Levenshtein_distance dbr:Piotr_Indyk dbr:Sparse_distributed_memory dbr:Count–min_sketch dbr:Random_indexing dbr:VisualRank dbr:Fuzzy_hashing dbr:Gensim dbr:Geocode dbr:Andrew_Tridgell dbr:Applications_of_locality-sensitive_hashing dbr:Singular_value_decomposition dbr:Feature_hashing dbr:Michael_Mitzenmacher dbr:Hadamard_transform dbr:Hash_buster dbr:Hash_collision dbr:Hash_filter dbr:Locality-preserving_hashing dbr:Nilsimsa_Hash dbr:Differentiable_neural_computer dbr:Dimensionality_reduction dbr:Farthest-first_traversal dbr:Hilbert_curve dbr:Bag-of-words_model_in_computer_vision dbr:Collaborative_filtering dbr:Hierarchical_clustering dbr:Transformer_(machine_learning_model) dbr:MinHash dbr:Rajeev_Motwani dbr:Random_projection dbr:Nicole_Immorlica dbr:Locality_Sensitive_Hashing dbr:Locality_preserving_hashing dbr:Siamese_neural_network dbr:LSH dbr:Moses_Charikar dbr:SimHash dbr:Outline_of_machine_learning dbr:Paris_Kanellakis_Award dbr:Locality-Sensitive_Hashing dbr:Locality-sensitive_hash dbr:Locality_sensitive_hashing
is foaf:primaryTopic of	wikipedia-en:Locality-sensitive_hashing