partial_dependence

sklearn.inspection.partial_dependence(estimator, X, features, *, sample_weight=None, categorical_features=None, feature_names=None, response_method='auto', percentiles=(0.05, 0.95), grid_resolution=100, method='auto', kind='average')

Partial dependence of features.

Partial dependence of a feature (or a set of features) corresponds to the average response of an estimator for each possible value of the feature.

Read more in the User Guide.

Parameters:

estimator : BaseEstimator

A fitted estimator object implementing predict, predict_proba, or decision_function. Multioutput-multiclass classifiers are not supported.

X : {array-like, sparse matrix or dataframe} of shape (n_samples, n_features)

X is used to generate a grid of values for the target features (where the partial dependence will be evaluated), and also to generate values for the complement features when the method is 'brute'.

features : array-like of {int, str, bool} or int or str

The feature (e.g. [0]) or pair of interacting features (e.g. [(0, 1)]) for which the partial dependence should be computed.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights are used to calculate weighted means when averaging the model output. If None, then samples are equally weighted. If sample_weight is not None, then method will be set to 'brute'. Note that sample_weight is ignored for kind='individual'.

Added in version 1.3.

categorical_features : array-like of shape (n_features,) or shape (n_categorical_features,), dtype={bool, int, str}, default=None

Indicates the categorical features.

Added in version 1.2.

feature_names : array-like of shape (n_features,), dtype=str, default=None

Name of each feature; feature_names[i] holds the name of the feature with index i. By default, the name of each feature is its numerical index for a NumPy array and its column name for a pandas dataframe.

Added in version 1.2.

response_method : {'auto', 'predict_proba', 'decision_function'}, default='auto'

Specifies whether to use predict_proba or decision_function as the target response. For regressors this parameter is ignored and the response is always the output of predict. By default, predict_proba is tried first and we revert to decision_function if it doesn't exist. If method is 'recursion', the response is always the output of decision_function.

percentiles : tuple of float, default=(0.05, 0.95)

The lower and upper percentile used to create the extreme values for the grid. Must be in [0, 1].

grid_resolution : int, default=100

The number of equally spaced points on the grid, for each target feature.

method : {'auto', 'recursion', 'brute'}, default='auto'

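The method used to calculate the averaged predictions:

'recursion' is only supported for some tree-based estimators, and only when kind='average'. It is faster, but for classifiers the target response is always the decision function rather than the predicted probabilities.

'brute' is supported for any estimator, but is more computationally intensive.

'auto' uses 'recursion' for estimators that support it and 'brute' otherwise. If sample_weight is not None, 'brute' is used regardless of the estimator.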

Please see this note for differences between the 'brute' and 'recursion' method.

kind : {'average', 'individual', 'both'}, default='average'

Whether to return the partial dependence averaged across all the samples in the dataset or one value per sample or both. See Returns below.

Note that the fast method='recursion' option is only available for kind='average' and sample_weight=None. Computing individual dependencies and doing weighted averages requires using the slower method='brute'.

Added in version 0.24.
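A short sketch of kind='both' follows, on synthetic data with an arbitrarily chosen regressor. It shows the shapes of the two returned arrays and checks that, without sample weights, the averaged curve is the mean of the individual (ICE) curves over the samples.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

# Synthetic data, invented for this illustration only.
rng = np.random.RandomState(0)
X = rng.uniform(size=(30, 3))
y = X[:, 0] * X[:, 1] + X[:, 2]
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

both = partial_dependence(
    model, X, features=[0], kind="both", method="brute", grid_resolution=8
)

individual = both["individual"]  # one curve per sample
average = both["average"]        # a single averaged curve

print(individual.shape)
print(average.shape)

# Averaging the ICE curves over the sample axis recovers the average curve.
ice_mean_matches = np.allclose(individual.mean(axis=1), average)
print(ice_mean_matches)
```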

Returns:

predictions : Bunch

Dictionary-like object, with the following attributes.

individual : ndarray of shape (n_outputs, n_instances, len(values[0]), len(values[1]), …)

The predictions for all the points in the grid for all samples in X. This is also known as Individual Conditional Expectation (ICE). Only available when kind='individual' or kind='both'.

average : ndarray of shape (n_outputs, len(values[0]), len(values[1]), …)

The predictions for all the points in the grid, averaged over all samples in X (or over the training data if method is 'recursion'). Only available when kind='average' or kind='both'.

grid_values : seq of 1d ndarrays

The values with which the grid has been created. The generated grid is a cartesian product of the arrays in grid_values, where len(grid_values) == len(features). The size of each array grid_values[j] is either grid_resolution, or the number of unique values in X[:, j], whichever is smaller.

Added in version 1.3.

n_outputs corresponds to the number of classes in a multi-class setting, or to the number of tasks for multi-output regression. For classical regression and binary classification n_outputs == 1. n_values_feature_j corresponds to the size of grid_values[j].

Examples

>>> from sklearn.inspection import partial_dependence
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> X = [[0, 0, 2], [1, 0, 0]]
>>> y = [0, 1]
>>> gb = GradientBoostingClassifier(random_state=0).fit(X, y)
>>> partial_dependence(gb, features=[0], X=X, percentiles=(0, 1),
...                    grid_resolution=2)
(array([[-4.52...,  4.52...]]), [array([ 0.,  1.])])