sklearn.feature_extraction

Feature extraction from raw data.

User guide: see the Feature extraction section for further details.

DictVectorizer: Transforms lists of feature-value mappings to vectors.
FeatureHasher: Implements feature hashing, aka the hashing trick.

From images

Utilities to extract features from images.

image.PatchExtractor: Extracts patches from a collection of images.
image.extract_patches_2d: Reshape a 2D image into a collection of patches.
image.grid_to_graph: Graph of the pixel-to-pixel connections.
image.img_to_graph: Graph of the pixel-to-pixel gradient connections.
image.reconstruct_from_patches_2d: Reconstruct the image from all of its patches.
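
A minimal sketch of how the image utilities listed above fit together, using a synthetic 4x4 array in place of a real image. The array contents and patch size are arbitrary choices made for this example.

```python
import numpy as np
from sklearn.feature_extraction import image

# Synthetic 4x4 grayscale "image" (an assumption for illustration).
img = np.arange(16, dtype=float).reshape(4, 4)

# All overlapping 2x2 patches: (4 - 2 + 1) ** 2 = 9 of them.
patches = image.extract_patches_2d(img, (2, 2))

# Averaging the overlapping patches back together recovers the image.
rebuilt = image.reconstruct_from_patches_2d(patches, (4, 4))

# PatchExtractor applies the same extraction to a batch of images.
batch = np.stack([img, img + 100.0])
batch_patches = image.PatchExtractor(patch_size=(2, 2)).transform(batch)

# Sparse connectivity graph over the 16 pixels of a 4x4 grid.
graph = image.grid_to_graph(4, 4)
```

Here `patches` has shape `(9, 2, 2)`, `batch_patches` has shape `(18, 2, 2)`, and `graph` is a 16x16 sparse matrix with one row and column per pixel.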

From text

Utilities to build feature vectors from text documents.

text.CountVectorizer: Convert a collection of text documents to a matrix of token counts.
text.HashingVectorizer: Convert a collection of text documents to a matrix of token occurrences.
text.TfidfTransformer: Transform a count matrix to a normalized tf or tf-idf representation.
text.TfidfVectorizer: Convert a collection of raw documents to a matrix of TF-IDF features.