Feature Selection | Embedded methods

Last Updated : 23 Jul, 2025

In machine learning, having too many features (also called variables or columns) can lead to complex models that are hard to understand and may not perform well. Feature selection helps us choose only the most important features, making models faster, simpler, and often more accurate.

There are three main types of feature selection methods:

  1. Filter methods
  2. Wrapper methods
  3. Embedded methods

What Are Embedded Methods?

Embedded methods combine the best parts of filter and wrapper methods. They choose important features as the model is being trained. This makes them faster than wrapper methods and often more accurate than filter methods.

These methods are usually part of the learning algorithm itself. Examples include decision trees, regularization methods like Lasso, and some types of linear models.

Why Use Embedded Methods?

Common Embedded Methods

Let’s look at the most popular embedded methods used in machine learning.

1. Lasso Regression (L1 Regularization)

Lasso stands for Least Absolute Shrinkage and Selection Operator. It is a type of linear regression that uses L1 regularization, which can shrink some feature weights to zero. When a feature’s weight becomes zero, the model ignores it.

Formula:

\text{Loss} = \text{MSE} + \lambda \sum_{j=1}^{n} |w_j|

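Where:

\text{MSE} is the mean squared error, w_j is the weight of feature j, n is the number of features, and \lambda controls the strength of the penalty.

When \lambda is high, more weights become zero.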

Python Code Example:

```python
from sklearn.linear_model import Lasso
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
import pandas as pd

# Load the California housing dataset
california_housing = fetch_california_housing()
X = pd.DataFrame(california_housing.data, columns=california_housing.feature_names)
y = california_housing.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit Lasso model
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

# Check selected features (those with non-zero coefficients)
selected_features = X.columns[model.coef_ != 0]
print("Selected Features:", selected_features.tolist())
```

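To see how the penalty strength affects selection, the sketch below fits Lasso at several values of `alpha` (scikit-learn's name for \lambda) and counts the weights that survive. It uses a synthetic dataset from `make_regression` instead of the housing data, and the `alpha` values are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data (an assumption for illustration): 10 features, only 3 informative
X_syn, y_syn = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

alphas = [0.1, 1.0, 10.0, 50.0, 10000.0]  # illustrative penalty strengths
nonzero_counts = []
for alpha in alphas:
    lasso = Lasso(alpha=alpha, max_iter=10000)
    lasso.fit(X_syn, y_syn)
    # Count the features whose weight was not shrunk to exactly zero
    n_nonzero = int(np.sum(lasso.coef_ != 0))
    nonzero_counts.append(n_nonzero)
    print(f"alpha={alpha:>8}: {n_nonzero} non-zero weights")
```

In practice the penalty strength is usually tuned with cross-validation rather than picked by hand.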

2. Ridge vs Lasso vs ElasticNet

ElasticNet Formula:

\text{Loss} = \text{MSE} + \lambda_1 \sum |w_j| + \lambda_2 \sum w_j^2

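ElasticNet combines the L1 penalty of Lasso with the L2 penalty of Ridge. Ridge on its own shrinks weights but does not set them to zero, so it does not select features. ElasticNet is useful when there are many correlated features.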
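The behavior with correlated features can be sketched as follows. Note that scikit-learn's `ElasticNet` exposes the two penalties through `alpha` (overall strength) and `l1_ratio` (share given to the L1 term) rather than separate \lambda_1 and \lambda_2. The data here is made up for illustration: two nearly identical features plus three irrelevant ones.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)    # nearly a copy of x1 (highly correlated)
irrelevant = rng.normal(size=(n, 3))   # three features unrelated to the target
X_corr = np.column_stack([x1, x2, irrelevant])
y_corr = 3 * x1 + 3 * x2 + rng.normal(scale=0.5, size=n)

# Same overall penalty strength for both models (illustrative values)
lasso = Lasso(alpha=0.1).fit(X_corr, y_corr)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_corr, y_corr)

print("Lasso coefficients:     ", np.round(lasso.coef_, 2))
print("ElasticNet coefficients:", np.round(enet.coef_, 2))
```

Lasso often concentrates the weight on one feature of a correlated pair, while the L2 term in ElasticNet tends to spread it across both, which is why ElasticNet is commonly preferred for correlated inputs.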

3. Decision Trees and Tree-Based Models

Tree-based models like Decision Trees, Random Forests, and Gradient Boosting automatically rank features by importance.

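How It Works:

Each split in a tree is chosen to reduce impurity (or prediction error) as much as possible. A feature's importance is the total reduction it contributes across the splits that use it, averaged over all trees in an ensemble.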
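A minimal sketch of ranking features with a tree ensemble, assuming scikit-learn's `RandomForestRegressor` and its `feature_importances_` attribute. The dataset is synthetic and the feature names `x0` to `x5` are hypothetical; only `x0` and `x1` actually influence the target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_tree = rng.normal(size=(500, 6))
feature_names = [f"x{i}" for i in range(6)]  # hypothetical names
# Assumed ground truth: only x0 and x1 drive the target
y_tree = 5 * X_tree[:, 0] + 2 * X_tree[:, 1] + rng.normal(scale=0.5, size=500)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_tree, y_tree)

# Impurity-based importances; they are normalized to sum to 1
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important first
for idx in ranking:
    print(f"{feature_names[idx]}: {importances[idx]:.3f}")
```

Features with very low importance are candidates for removal. The cutoff is a judgment call and is not fixed by the model.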

4. Regularized Logistic Regression

The L1 penalty that Lasso applies to linear regression can also be applied to logistic regression, which brings the same feature selection effect to classification.

Formula:

\text{Loss} = -\text{log-likelihood} + \lambda \sum |w_j|

Used for binary classification with automatic feature selection.
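As a rough illustration, the sketch below fits an L1-penalized logistic regression on synthetic data at two penalty strengths. In scikit-learn the parameter `C` is the inverse of \lambda, so a smaller `C` means a stronger penalty. The `penalty="l1"` and `solver="liblinear"` settings are one way to request the L1 penalty, and the exact spelling may differ between library versions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (an assumption): 20 features, 4 informative
X_clf, y_clf = make_classification(
    n_samples=300, n_features=20, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)

nonzero_by_C = {}
for C in (0.05, 10.0):  # small C = strong penalty, large C = weak penalty
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C, random_state=0)
    clf.fit(X_clf, y_clf)
    nonzero_by_C[C] = int(np.sum(clf.coef_ != 0))
    print(f"C={C}: {nonzero_by_C[C]} of {X_clf.shape[1]} weights are non-zero")
```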

5. Support Vector Machine (SVM) with L1 Penalty

A linear SVM can also be trained with L1 regularization to remove irrelevant features. This is called L1-SVM. It is helpful when there are many features and the irrelevant ones need to be removed.
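One possible way to try this is scikit-learn's `LinearSVC`, which accepts `penalty="l1"` when `dual=False`. The sketch below is an illustration on synthetic data. The value of `C` and the `1e-5` cutoff for treating a weight as zero are assumptions, not recommended settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary classification data (an assumption): 20 features, 4 informative
X_svm, y_svm = make_classification(
    n_samples=300, n_features=20, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)

# The L1 penalty in LinearSVC requires the primal formulation (dual=False)
svm = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                C=0.05, max_iter=5000, random_state=0)
svm.fit(X_svm, y_svm)

# Keep only the features whose weight is not (numerically) zero
keep = np.abs(svm.coef_).ravel() > 1e-5
X_selected = X_svm[:, keep]
print(f"Kept {keep.sum()} of {X_svm.shape[1]} features:", np.flatnonzero(keep).tolist())
```

The reduced matrix `X_selected` can then be passed to any other model, so the L1-SVM acts as the selection step.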

Advantages of Embedded Methods

Limitations of Embedded Methods