Boosted Trees Regression

The Gradient Boosted Regression Trees (GBRT) model (also called Gradient Boosting Machine or GBM) is one of the most effective machine learning models for predictive analytics, which has made it a workhorse in industrial applications.

Background

The Boosted Trees Model is a type of additive model that makes predictions by combining decisions from a sequence of base models. More formally, we can write this class of models as:

$$ g(x) = f_0(x) + f_1(x) + f_2(x) + \cdots $$

where the final model $g$ is the sum of simple base models $f_i$. For the boosted trees model, each base model is a simple decision tree. This broad technique of using multiple models to obtain better predictive performance is called model ensembling.
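As a minimal sketch of the additive form described above, the snippet below sums the outputs of three depth-1 trees ("stumps") on a single numeric input. The thresholds and leaf values are made up for illustration, and the helper names `stump` and `ensemble_predict` are hypothetical, not part of any library.

```python
def stump(threshold, left_value, right_value):
    """Return a depth-1 decision tree on a single numeric input."""
    def predict(x):
        return left_value if x < threshold else right_value
    return predict


# Three illustrative base models; the numbers are arbitrary.
base_models = [
    stump(2.0, 1.0, 3.0),
    stump(5.0, -0.5, 0.5),
    stump(3.5, -0.2, 0.2),
]


def ensemble_predict(x):
    """Prediction of the ensemble: the sum of the base model predictions."""
    return sum(f(x) for f in base_models)


print(ensemble_predict(1.0))
print(ensemble_predict(4.0))
```

Each stump on its own is a very coarse predictor; the ensemble is more expressive only because the individual outputs are added together.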

Unlike Random Forest, which constructs all of its base models independently, each on a subsample of the data, GBRT uses a particular model ensembling technique called gradient boosting.

The name Gradient Boosting comes from its connection to gradient descent in numerical optimization. Suppose you want to minimize a function $f(x)$. Assuming $f$ is differentiable, gradient descent works by iteratively computing

$$ x_{t+1} = x_t - \eta \left. \frac{\partial f}{\partial x} \right|_{x = x_t} $$

where $\eta$ is called the step size.
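To make the gradient descent update concrete, here is a small sketch that minimizes a one-dimensional quadratic. The objective, the starting point, the step size of 0.1 and the number of steps are all arbitrary choices for illustration.

```python
def gradient_descent(grad, x0, step_size=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        x = x - step_size * grad(x)
    return x


# Illustrative objective: (x - 3)^2, whose derivative is 2 * (x - 3)
# and whose minimum is at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With this step size the distance to the minimum shrinks by a constant factor at every step, so the iterate ends up very close to 3.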

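Similarly, if we let $g_t$ be the model trained up to iteration $t$, and $L(y_i, g(x_i))$ be the empirical loss on training example $(x_i, y_i)$, at each iteration we move $g_t$ in the negative gradient direction $-\left. \frac{\partial L}{\partial g} \right|_{g = g_t}$ by a step of size $\eta$. Hence, over the $N$ training examples, the new base model $f_t$ is chosen to be

$$ f_t = \arg\min_{f} \sum_{i=1}^{N} \left[ -\left. \frac{\partial L(y_i, g(x_i))}{\partial g(x_i)} \right|_{g = g_t} - f(x_i) \right]^2 $$

and the algorithm sets $g_{t+1} = g_t + \eta f_t$.

For regression problems with the squared loss function $L(y_i, g(x_i)) = \frac{1}{2} (y_i - g(x_i))^2$, the negative gradient is simply the residual $y_i - g_t(x_i)$. The algorithm therefore fits a new decision tree to the residuals at each iteration.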
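The residual-fitting loop for squared loss can be sketched in plain Python. This is an illustrative toy, not the Turi Create implementation: it assumes a single numeric feature, uses depth-1 trees as base models, starts from a model that predicts zero, and uses an arbitrary step size of 0.3. The names `fit_stump`, `boost` and `mean_squared_error` are hypothetical helpers.

```python
def fit_stump(xs, targets):
    """Fit a depth-1 regression tree (a single split) by least squares."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        if not left or not right:
            continue  # skip splits that leave one side empty
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all inputs identical): predict the mean target.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def boost(xs, ys, n_iterations=50, step_size=0.3):
    """Gradient boosting for squared loss: each tree fits the residuals."""
    trees = []
    predictions = [0.0] * len(xs)  # assumed initial model: predict zero
    for _ in range(n_iterations):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + step_size * tree(x)
                       for p, x in zip(predictions, xs)]
    return lambda x: sum(step_size * tree(x) for tree in trees)


def mean_squared_error(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy non-linear data: y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

for n in (1, 10, 50):
    print(n, mean_squared_error(boost(xs, ys, n_iterations=n), xs, ys))
```

The training error shrinks as trees are added, because each new tree is fit to whatever the current model still gets wrong.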

Introductory Example

In this example, we will use the Mushrooms dataset [1].

import turicreate as tc

# Load the data
data = tc.SFrame.read_csv('https://raw.githubusercontent.com/apple/turicreate/master/src/python/turicreate/test/mushroom.csv')

# Label 'p' is poisonous; encode it as 1 and every other label as 0
data['label'] = data['label'] == 'p'

# Make a train-test split
train_data, test_data = data.random_split(0.8)

# Create a model.
model = tc.boosted_trees_regression.create(train_data, target='label',
                                           max_iterations=2,
                                           max_depth=3)

# Save predictions to an SArray
predictions = model.predict(test_data)

# Evaluate the model and save the results into a dictionary
results = model.evaluate(test_data)

Tuning hyperparameters

The Gradient Boosted Trees model has many tuning parameters. Here we provide a simple guideline for tuning the model.

In general, choose max_iterations as large as your computation budget allows. Then set min_child_weight to a reasonable value, around (#instances/1000), and tune max_depth. With more training instances, you can set max_depth to a higher value. A large gap between the training loss and the validation loss is a sign of overfitting; in that case, reduce max_depth and increase min_child_weight.
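One way to apply this guideline is to train one model per candidate depth and compare training and validation error. The sketch below is hedged: the candidate depths, the default of 100 iterations and the helper names `initial_min_child_weight` and `compare_depths` are illustrative assumptions, and it assumes that `evaluate` returns a dictionary with an `'rmse'` entry.

```python
def initial_min_child_weight(num_instances):
    """Starting value from the guideline: roughly #instances / 1000."""
    return num_instances / 1000.0


def compare_depths(train_data, validation_data, target,
                   depths=(2, 4, 6, 8), max_iterations=100):
    """Train one model per max_depth; return (depth, train RMSE, validation RMSE)."""
    # Imported here so the helper above can be used without Turi Create.
    import turicreate as tc

    min_child_weight = initial_min_child_weight(train_data.num_rows())
    report = []
    for depth in depths:
        model = tc.boosted_trees_regression.create(
            train_data, target=target,
            max_iterations=max_iterations,
            max_depth=depth,
            min_child_weight=min_child_weight,
            validation_set=None,  # we evaluate on validation_data ourselves
            verbose=False)
        train_rmse = model.evaluate(train_data)['rmse']
        valid_rmse = model.evaluate(validation_data)['rmse']
        report.append((depth, train_rmse, valid_rmse))
    return report
```

A depth at which the validation RMSE stops improving while the training RMSE keeps falling shows the overfitting gap described above. At that point, a smaller max_depth or a larger min_child_weight is worth trying.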

When to use a boosted trees model?

Different kinds of models have different advantages. The boosted trees model is very good at handling tabular data with numerical features, or categorical features with fewer than a few hundred categories. Unlike linear models, the boosted trees model is able to capture non-linear interactions between the features and the target.

One important note is that tree-based models are not designed to work with very sparse features. When dealing with sparse input data (e.g. categorical features with a very large number of categories), we can either pre-process the sparse features to generate numerical statistics, or switch to a linear model, which is better suited for such scenarios.
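As one possible reading of "generate numerical statistics", the sketch below replaces a high-cardinality categorical value with two numbers computed on the training data: how often the category occurs and its mean target. The feature (a zip code), the data values and the helper names `category_statistics` and `encode` are all hypothetical.

```python
from collections import defaultdict


def category_statistics(categories, targets):
    """Map each category to (count, mean target) over the training rows."""
    counts = defaultdict(int)
    sums = defaultdict(float)
    for category, target in zip(categories, targets):
        counts[category] += 1
        sums[category] += target
    return {c: (counts[c], sums[c] / counts[c]) for c in counts}


def encode(categories, stats, default=(0, 0.0)):
    """Replace each category with its statistics; unseen ones get `default`."""
    return [stats.get(c, default) for c in categories]


# Made-up training data: a zip-code feature and a numeric target.
train_zip = ["94103", "10001", "94103", "60601", "10001", "94103"]
train_y = [3.0, 1.0, 5.0, 2.0, 3.0, 4.0]

stats = category_statistics(train_zip, train_y)
print(encode(["94103", "10001", "99999"], stats))
```

The statistics should be computed on training rows only and then reused for validation and test data, so that target information from held-out rows does not leak into the features.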
