Feature hashing

In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly, rather than looking the indices up in an associative array. This trick is often attributed to Weinberger et al. (2009), but there exists a much earlier description of this method published by John Moody in 1989.
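
The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `hash_index` and `hashing_vectorizer`, the choice of MD5 as the hash, and the bucket count of 16 are all assumptions made for the example. A stable hash is used because Python's built-in `hash()` is randomized per process for strings.

```python
import hashlib

def hash_index(token: str, n_buckets: int) -> int:
    """Map a token to a bucket index using a stable hash (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % n_buckets

def hashing_vectorizer(tokens, n_buckets=16):
    """Return a count vector of length n_buckets; no vocabulary dictionary is built."""
    vec = [0] * n_buckets
    for tok in tokens:
        # The hash value is used directly as the index.
        vec[hash_index(tok, n_buckets)] += 1
    return vec

doc = "john likes to watch movies and john likes football".split()
vec = hashing_vectorizer(doc, n_buckets=16)
print(vec)
```

The vector length is fixed in advance and does not depend on how many distinct tokens appear. Distinct tokens may collide in the same bucket, which is the price paid for skipping the associative array.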

Property	Value
dbo:abstract	In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly, rather than looking the indices up in an associative array. This trick is often attributed to Weinberger et al. (2009), but there exists a much earlier description of this method published by John Moody in 1989. (en) In machine learning, feature hashing is a fast and memory-efficient method for converting features into vectors, turning arbitrary features into indices in a vector or matrix. It is also called the hashing trick, by analogy to the kernel trick. Rather than looking indices up in an associative array, it applies a hash function to the features and uses the resulting values directly as indices. (ja, translated)
dbo:wikiPageExternalLink	https://ml.dask.org/modules/generated/dask_ml.feature_extraction.text.HashingVectorizer.html%23dask_ml.feature_extraction.text.HashingVectorizer http://hunch.net/~jl/projects/hash_reps/index.html https://web.archive.org/web/20120609232923/http:/metaoptimize.com/qa/questions/6943/what-is-the-hashing-trick
dbo:wikiPageID	36126852 (xsd:integer)
dbo:wikiPageLength	19896 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1114513799 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Scikit-learn dbr:Bloom_filter dbr:Robert_Burton dbr:Vowpal_Wabbit dbr:Count–min_sketch dbr:One-hot_encoding dbr:Gensim dbr:Multi-task_learning dbr:The_Anatomy_of_Melancholy dbr:Apache_Mahout dbr:Apache_Spark dbr:Machine_learning dbr:Complete_metric_space dbr:Zipf's_law dbr:Feature_(machine_learning) dbr:Kernel_method dbc:Articles_with_example_pseudocode dbc:Machine_learning dbr:Actual_infinity dbr:Trie dbr:Document_classification dbr:Hash_function dbr:Heaps'_law dbr:Linear_model dbr:Locality-sensitive_hashing dbr:Finite_support dbr:Discrete_metric dbc:Hashing dbr:Hash_table dbr:Hilbert_space dbr:TensorFlow dbr:Associative_array dbr:Spam_filter dbr:Sparse_matrix dbr:Term-document_matrix dbr:Inner_product_space dbr:MinHash dbr:R_(programming_language) dbr:Yahoo!_Research dbr:Kleene_star dbr:Kernel_trick dbr:Type–token_distinction dbr:Polysemy dbr:Bag_of_words
dbp:author	dbr:Robert_Burton
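dbp:mathStatement	If the binary hash $\zeta$ is unbiased (it takes the values $-1$ and $+1$ with equal probability), then the hashed feature map $\phi$, defined by $\phi(x)_i = \sum_{t : h(t) = i} \zeta(t)\, x_t$, is an isometry in expectation: $\mathbb{E}[\langle \phi(x), \phi(x') \rangle] = \langle x, x' \rangle$ (en)
dbp:name	Theorem (en)
dbp:proof	By linearity of expectation, $\mathbb{E}[\langle \phi(x), \phi(x') \rangle] = \sum_{t, t'} \mathbb{E}[\zeta(t)\zeta(t')] \, x_t x'_{t'} \, \mathbf{1}[h(t) = h(t')]$. Now, $\mathbb{E}[\zeta(t)\zeta(t')] = 1$ if $t = t'$ and $0$ if $t \neq t'$, since we assumed $\zeta$ is unbiased. So we continue: $\mathbb{E}[\langle \phi(x), \phi(x') \rangle] = \sum_{t} x_t x'_t = \langle x, x' \rangle$ (en)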
dbp:source	Part 2, Sect. II, Mem. IV. (en)
dbp:text	By this art you may contemplate the variation of the 23 letters... (en)
dbp:title	dbr:The_Anatomy_of_Melancholy Proof (en)
dbp:wikiPageUsesTemplate	dbt:Blockquote dbt:Math dbt:Mono dbt:Mvar dbt:Reflist dbt:Proof dbt:Math_theorem
dct:subject	dbc:Articles_with_example_pseudocode dbc:Machine_learning dbc:Hashing
rdfs:comment	In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly, rather than looking the indices up in an associative array. This trick is often attributed to Weinberger et al. (2009), but there exists a much earlier description of this method published by John Moody in 1989. (en) In machine learning, feature hashing is a fast and memory-efficient method for converting features into vectors, turning arbitrary features into indices in a vector or matrix. It is also called the hashing trick, by analogy to the kernel trick. Rather than looking indices up in an associative array, it applies a hash function to the features and uses the resulting values directly as indices. (ja, translated)
rdfs:label	Feature hashing (en) Feature Hashing (ja)
owl:sameAs	freebase:Feature hashing wikidata:Feature hashing dbpedia-ja:Feature hashing https://global.dbpedia.org/id/4jLZN
prov:wasDerivedFrom	wikipedia-en:Feature_hashing?oldid=1114513799&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:Feature_hashing
is dbo:wikiPageRedirects of	dbr:Hash_kernel dbr:Hash_trick dbr:Hashing-Trick dbr:Hashing_trick dbr:Hashtrick
is dbo:wikiPageWikiLink of	dbr:Bloom_filter dbr:Vowpal_Wabbit dbr:Count–min_sketch dbr:Online_machine_learning dbr:Locality-sensitive_hashing dbr:Count_sketch dbr:Streaming_algorithm dbr:Outline_of_machine_learning dbr:Hash_kernel dbr:Hash_trick dbr:Hashing-Trick dbr:Hashing_trick dbr:Hashtrick
is foaf:primaryTopic of	wikipedia-en:Feature_hashing
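
The theorem recorded under dbp:mathStatement (an unbiased binary sign hash makes the hashed map preserve inner products in expectation) can be checked exactly on a toy case. The sketch below is an illustration under stated assumptions: the names `phi` and `dot`, the bucket assignment `h`, and the vectors `x` and `y` are hypothetical, and the expectation is computed by enumerating every sign assignment with equal weight, which treats the signs of different features as independent.

```python
from itertools import product

def phi(x, h, zeta, m):
    """Signed hashed vector: phi(x)[i] sums zeta[t] * x[t] over all t with h[t] == i."""
    out = [0] * m
    for t, value in enumerate(x):
        out[h[t]] += zeta[t] * value
    return out

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))

m = 2                 # number of buckets (hypothetical)
h = [0, 1, 0, 1]      # fixed bucket per feature; features 0/2 and 1/3 collide
x = [1, 2, 3, 4]
y = [2, 0, -1, 5]

# Exact expectation over all 2^4 equally likely sign assignments.
signs = list(product([-1, 1], repeat=len(x)))
expected = sum(dot(phi(x, h, z, m), phi(y, h, z, m)) for z in signs) / len(signs)

# Without signs (all +1), collisions bias the hashed inner product.
unsigned = dot(phi(x, h, [1] * len(x), m), phi(y, h, [1] * len(y), m))

print(dot(x, y), expected, unsigned)
```

Averaged over the signs, the cross terms created by collisions cancel, so the hashed inner product matches the original one; with no sign hash the collision terms remain.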