lightgbm.LGBMModel — LightGBM 4.6.0.99 documentation


class lightgbm.LGBMModel(*, boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=100, subsample_for_bin=200000, objective=None, class_weight=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=0, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=None, importance_type='split', **kwargs)[source]

Bases: BaseEstimator

Implementation of the scikit-learn API for LightGBM.
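For orientation, a minimal usage sketch of the scikit-learn API (the toy data and parameter values are illustrative assumptions; in practice the task-specific subclasses LGBMClassifier, LGBMRegressor or LGBMRanker are usually used rather than LGBMModel directly):

import numpy as np
from lightgbm import LGBMRegressor  # LGBMModel subclass for regression

# Illustrative toy data; any (n_samples, n_features) array-like works.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=500)

model = LGBMRegressor(boosting_type="gbdt", num_leaves=31, learning_rate=0.1, n_estimators=100)
model.fit(X, y)
preds = model.predict(X[:5])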

__init__(*, boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=100, subsample_for_bin=200000, objective=None, class_weight=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=0, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=None, importance_type='split', **kwargs)[source]

Construct a gradient boosting model.

Parameters:

Note

A custom objective function can be provided for the objective parameter. In this case, it should have the signature objective(y_true, y_pred) -> grad, hess, objective(y_true, y_pred, weight) -> grad, hess or objective(y_true, y_pred, weight, group) -> grad, hess:

y_true : numpy 1-D array of shape = [n_samples]

The target values.

y_pred : numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)

The predicted values. Predicted values are returned before any transformation, e.g. they are raw margin instead of probability of positive class for binary task.

weight : numpy 1-D array of shape = [n_samples]

The weight of samples. Weights should be non-negative.

group : numpy 1-D array

Group/query data. Only used in the learning-to-rank task. sum(group) = n_samples. For example, if you have a 100-document dataset with group = [10, 20, 40, 10, 10, 10], that means that you have 6 groups, where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.

grad : numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)

The value of the first order derivative (gradient) of the loss with respect to the elements of y_pred for each sample point.

hess : numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)

The value of the second order derivative (Hessian) of the loss with respect to the elements of y_pred for each sample point.

For multi-class task, y_pred is a numpy 2-D array of shape = [n_samples, n_classes], and grad and hess should be returned in the same format.
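As an illustration, a minimal sketch of a custom objective using the first signature above (a plain squared-error objective is chosen here purely for demonstration; the gradient and Hessian are taken with respect to the raw predictions):

import numpy as np
from lightgbm import LGBMRegressor

def squared_error_objective(y_true, y_pred):
    # objective(y_true, y_pred) -> grad, hess for L = 0.5 * (y_pred - y_true) ** 2
    grad = y_pred - y_true           # first-order derivative w.r.t. y_pred
    hess = np.ones_like(y_pred)      # second-order derivative w.r.t. y_pred
    return grad, hess

# Training data X, y are assumed to be prepared elsewhere.
model = LGBMRegressor(objective=squared_error_objective, n_estimators=50)
model.fit(X, y)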

Methods

__init__(*[, boosting_type, num_leaves, ...]) Construct a gradient boosting model.
fit(X, y[, sample_weight, init_score, ...]) Build a gradient boosting model from the training set (X, y).
get_metadata_routing() Get metadata routing of this object.
get_params([deep]) Get parameters for this estimator.
predict(X[, raw_score, start_iteration, ...]) Return the predicted value for each sample.
set_fit_request(*[, callbacks, ...]) Request metadata passed to the fit method.
set_params(**params) Set the parameters of this estimator.
set_predict_request(*[, num_iteration, ...]) Request metadata passed to the predict method.

Attributes

best_iteration_ The best iteration of fitted model if early_stopping() callback has been specified.
best_score_ The best score of fitted model.
booster_ The underlying Booster of this model.
evals_result_ The evaluation results if validation sets have been specified.
feature_importances_ The feature importances (the higher, the more important).
feature_name_ The names of features.
feature_names_in_ scikit-learn compatible version of .feature_name_.
n_estimators_ True number of boosting iterations performed.
n_features_ The number of features of fitted model.
n_features_in_ The number of features of fitted model.
n_iter_ True number of boosting iterations performed.
objective_ The concrete objective used while fitting this model.

property best_iteration_

The best iteration of fitted model if early_stopping() callback has been specified.

Type:

int
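For example, a sketch of reading best_iteration_ after fitting with the early_stopping() callback (the validation split and stopping_rounds value are illustrative; X and y are assumed to be prepared elsewhere):

import lightgbm as lgb
from sklearn.model_selection import train_test_split

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=0)

model = lgb.LGBMRegressor(n_estimators=1000)
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    callbacks=[lgb.early_stopping(stopping_rounds=20)],
)
print(model.best_iteration_)  # iteration with the best validation score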

property best_score_

The best score of fitted model.

Type:

dict

property booster_

The underlying Booster of this model.

Type:

Booster

property evals_result_

The evaluation results if validation sets have been specified.

Type:

dict

property feature_importances_

The feature importances (the higher, the more important).

Note

importance_type attribute is passed to the function to configure the type of importance values to be extracted.

Type:

array of shape = [n_features]
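A short sketch of how importance_type changes what this property reports (X and y are assumed to be prepared elsewhere):

from lightgbm import LGBMClassifier

# importance_type='split' (default): number of times a feature is used in splits.
# importance_type='gain': total gain of the splits that use the feature.
model = LGBMClassifier(importance_type="gain")
model.fit(X, y)
importances = model.feature_importances_  # array of shape [n_features]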

property feature_name_

The names of features.

Note

If input does not contain feature names, they will be added during fitting in the format Column_0, Column_1, …, Column_N.

Type:

list of shape = [n_features]

property feature_names_in_

scikit-learn compatible version of .feature_name_.

Added in version 4.5.0.

Type:

array of shape = [n_features]

fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_names=None, eval_sample_weight=None, eval_class_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, feature_name='auto', categorical_feature='auto', callbacks=None, init_model=None)[source]

Build a gradient boosting model from the training set (X, y).

Parameters:

Returns:

self – Returns self.

Return type:

LGBMModel

Note

A custom eval function expects a callable with one of the following signatures: func(y_true, y_pred), func(y_true, y_pred, weight) or func(y_true, y_pred, weight, group), returning (eval_name, eval_result, is_higher_better) or a list of (eval_name, eval_result, is_higher_better):

y_true : numpy 1-D array of shape = [n_samples]

The target values.

y_pred : numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)

The predicted values. In case of custom objective, predicted values are returned before any transformation, e.g. they are raw margin instead of probability of positive class for binary task in this case.

weight : numpy 1-D array of shape = [n_samples]

The weight of samples. Weights should be non-negative.

group : numpy 1-D array

Group/query data. Only used in the learning-to-rank task. sum(group) = n_samples. For example, if you have a 100-document dataset with group = [10, 20, 40, 10, 10, 10], that means that you have 6 groups, where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.

eval_name : str

The name of evaluation function (without whitespace).

eval_result : float

The eval result.

is_higher_better : bool

Whether a higher eval result is better, e.g. AUC is is_higher_better.
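A minimal sketch of such a callable using the first signature (RMSE is an illustrative choice, not a metric this docstring prescribes; the training and validation arrays are assumed to exist):

import numpy as np
from lightgbm import LGBMRegressor

def rmse_eval(y_true, y_pred):
    # func(y_true, y_pred) -> (eval_name, eval_result, is_higher_better)
    rmse = float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
    return "rmse", rmse, False  # lower RMSE is better

model = LGBMRegressor(n_estimators=100)
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=rmse_eval,
)
print(model.evals_result_)  # per-iteration values of the custom metric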

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:

deep (bool , optional ( default=True )) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

property n_estimators_

True number of boosting iterations performed.

This might be less than parameter n_estimators if early stopping was enabled or if boosting stopped early due to limits on complexity like min_gain_to_split.

Added in version 4.0.0.

Type:

int

property n_features_

The number of features of fitted model.

Type:

int

property n_features_in_

The number of features of fitted model.

Type:

int

property n_iter_

True number of boosting iterations performed.

This might be less than parameter n_estimators if early stopping was enabled or if boosting stopped early due to limits on complexity like min_gain_to_split.

Added in version 4.0.0.

Type:

int

property objective_

The concrete objective used while fitting this model.

Type:

str or callable

predict(X, raw_score=False, start_iteration=0, num_iteration=None, pred_leaf=False, pred_contrib=False, validate_features=False, **kwargs)[source]

Return the predicted value for each sample.

Parameters:

Returns:
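A sketch of the main prediction modes (the fitted model and X_test are assumed; parameter names are those in the signature above):

preds = model.predict(X_test)                            # regular predictions
raw = model.predict(X_test, raw_score=True)              # raw margin scores instead of transformed output
leaves = model.predict(X_test, pred_leaf=True)           # predicted leaf index per tree
contribs = model.predict(X_test, pred_contrib=True)      # SHAP-style feature contributions; last column is the expected value
partial = model.predict(X_test, start_iteration=0, num_iteration=50)  # use only the first 50 iterations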

set_fit_request(*, callbacks='$UNCHANGED$', categorical_feature='$UNCHANGED$', eval_class_weight='$UNCHANGED$', eval_group='$UNCHANGED$', eval_init_score='$UNCHANGED$', eval_metric='$UNCHANGED$', eval_names='$UNCHANGED$', eval_sample_weight='$UNCHANGED$', eval_set='$UNCHANGED$', feature_name='$UNCHANGED$', group='$UNCHANGED$', init_model='$UNCHANGED$', init_score='$UNCHANGED$', sample_weight='$UNCHANGED$')

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

Returns:

self – The updated object.

Return type:

object
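A sketch of when this matters: routing sample_weight through a scikit-learn Pipeline with metadata routing enabled (the pipeline structure and the weight array w_train are illustrative assumptions):

import sklearn
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from lightgbm import LGBMClassifier

sklearn.set_config(enable_metadata_routing=True)

pipe = Pipeline([
    # Explicitly decline sample_weight for the scaler so it is routed only to the model.
    ("scale", StandardScaler().set_fit_request(sample_weight=False)),
    ("lgbm", LGBMClassifier().set_fit_request(sample_weight=True)),
])

# sample_weight is now routed to LGBMClassifier.fit by the pipeline.
pipe.fit(X_train, y_train, sample_weight=w_train)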

set_params(**params)[source]

Set the parameters of this estimator.

Parameters:

**params – Parameter names with their new values.

Returns:

self – Returns self.

Return type:

object
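For instance (parameter values are arbitrary):

from lightgbm import LGBMClassifier

model = LGBMClassifier(n_estimators=100)
model.set_params(learning_rate=0.05, n_estimators=500)
print(model.get_params()["n_estimators"])  # 500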

set_predict_request(*, num_iteration='$UNCHANGED$', pred_contrib='$UNCHANGED$', pred_leaf='$UNCHANGED$', raw_score='$UNCHANGED$', start_iteration='$UNCHANGED$', validate_features='$UNCHANGED$')

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

Returns:

self – The updated object.

Return type:

object