Recursive feature elimination with cross-validation


A recursive feature elimination example with automatic tuning of the number of features selected with cross-validation.

[Figure: cross-validation score as a function of the number of features selected]

Out:

Optimal number of features : 3
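
For context, plain RFE eliminates the weakest features one step at a time until a fixed, user-chosen number remains; RFECV picks that number automatically by cross-validation. Below is a minimal sketch of the fixed-size variant for contrast (the toy dataset sizes are illustrative assumptions, not taken from this example):

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Toy data: 10 features, of which 3 carry signal (illustrative sizes)
X_demo, y_demo = make_classification(n_samples=200, n_features=10,
                                     n_informative=3, random_state=0)

# Keep exactly 3 features, dropping the single weakest one per iteration
rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=3, step=1)
rfe.fit(X_demo, y_demo)
print(rfe.support_)   # boolean mask over the 10 input features
print(rfe.ranking_)   # rank 1 marks the selected features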

print(__doc__)

import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold
from sklearn.feature_selection import RFECV
from sklearn.datasets import make_classification

Build a classification task using 3 informative features; of the 25 features, 2 are redundant and the remaining 20 are pure noise, so the selector should recover the 3 informative ones.

X, y = make_classification(n_samples=1000, n_features=25, n_informative=3,
                           n_redundant=2, n_repeated=0, n_classes=8,
                           n_clusters_per_class=1, random_state=0)

Create the RFE object and compute a cross-validated score.

svc = SVC(kernel="linear")

The "accuracy" scoring is proportional to the number of correct

classifications

rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2),
              scoring='accuracy')
rfecv.fit(X, y)

print("Optimal number of features : %d" % rfecv.n_features_)

Plot number of features vs. cross-validation scores

plt.figure()
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score (nb of correct classifications)")
plt.plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
plt.show()
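
Once fitted, the selector behaves as a transformer that drops the unselected columns; a brief sketch (not in the original example) using the standard transform method:

X_selected = rfecv.transform(X)  # keep only the selected features
print(X_selected.shape)          # (1000, rfecv.n_features_), i.e. (1000, 3) here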

Total running time of the script: ( 0 minutes 1.721 seconds)

Gallery generated by Sphinx-Gallery