Slicing Models — xgboost 3.1.0-dev documentation (original) (raw)
Slice tree model
When booster
is set to gbtree
or dart
, XGBoost builds a tree model, which is a list of trees and can be sliced into multiple sub-models.
import xgboost as xgb from sklearn.datasets import make_classification num_classes = 3 X, y = make_classification(n_samples=1000, n_informative=5, n_classes=num_classes) dtrain = xgb.DMatrix(data=X, label=y) num_parallel_tree = 4 num_boost_round = 16
total number of built trees is num_parallel_tree * num_classes * num_boost_round
We build a boosted random forest for classification here.
booster = xgb.train({ 'num_parallel_tree': 4, 'subsample': 0.5, 'num_class': 3}, num_boost_round=num_boost_round, dtrain=dtrain)
This is the sliced model, containing [3, 7) forests
step is also supported with some limitations like negative step is invalid.
sliced: xgb.Booster = booster[3:7]
Access individual tree layer
trees = [_ for _ in booster] assert len(trees) == num_boost_round
The sliced model is a copy of selected trees, that means the model itself is immutable during slicing. This feature is the basis of save_best
option in early stopping callback. See Demo for prediction using individual trees and model slices for a worked example on how to combine prediction with sliced trees.