Implementation of Lasso Regression From Scratch using Python (original) (raw)

Last Updated : 13 Apr, 2025

Lasso Regression (Least Absolute Shrinkage and Selection Operator) is a linear regression technique that combines prediction with feature selection. It does this by adding a penalty term to the cost function shrinking less relevant feature’s coefficients to zero. This makes it effective for high-dimensional datasets, reducing overfitting and performing automatic feature selection. In this article we will implement it in python. But before that lets have understand lasso regression.

Understanding Lasso Regression

Lasso Regression is another linear model derived from Linear Regression, sharing the same hypothetical function for prediction. The cost function of Linear Regression is represented by:

[Tex]J = \sum_{i=1}^{m} \left( y^{(i)} – h(x^{(i)}) \right)^2[/Tex]

Here

or Lasso Regression, the cost function is modified as follows by adding the L1 penalty term:

[Tex]J = \sum_{i=1}^{m} \left( y^{(i)} – h(x^{(i)}) \right)^2 + \lambda \sum_{j=1}^{n} |w_j|[/Tex]

Where:

Lasso Regression performs both variable selection and regularization by applying an L1 penalty to the coefficients. This encourages sparsity and reducing the number of features that contribute to the final model. Regularization is controlled by the hyperparameter λ.

The model aims to minimize both the sum of the squared errors and the sum of the absolute values of the coefficients. This dual optimization encourages sparsity in the model, leaving only the most important features.

Here’s how Lasso Regression operates:

  1. Set the intercept and coefficients to zero.
  2. Use gradient descent to iteratively update the coefficients.
  3. Enforce sparsity with the L1 penalty term, causing some coefficients to become exactly zero.
  4. The model simplifies as it focuses only on the most significant features, making it more interpretable.

Now we will implement it.

Implementation of Lasso Regression in Python

We will use a dataset containing “Years of Experience” and “Salary” for 2000 employees in a company. We will train a Lasso Regression model to learn the correlation between the number of years of experience of each employee and their respective salary. Once the model is trained we will be able to predict the salary of an employee based on their years of experience. You can download dataset from here.

1. Importing Libraries

We will be using numpy, pandas, scikit learn and matplotlib.

Python `

import numpy as np import pandas as pd from sklearn.model_selection import train_test_split from sklearn.preprocessing import StandardScaler import matplotlib.pyplot as plt

`

2. Defining Lasso Regression Class

In this dataset Lasso Regression performs both **feature selection and **regularization. This means that Lasso will encourage sparsity by shrinking less important feature coefficients towards zero and effectively “pruning” irrelevant features. In the case of this dataset where the only feature is “years of experience” Lasso ensures that this feature is the most significant predictor of salary while any unnecessary noise is eliminated.

class LassoRegression(): def init(self, learning_rate, iterations, l1_penalty): self.learning_rate = learning_rate self.iterations = iterations self.l1_penalty = l1_penalty

def fit(self, X, Y):
    self.m, self.n = X.shape
    self.W = np.zeros(self.n)
    self.b = 0
    self.X = X
    self.Y = Y

    for i in range(self.iterations):
        self.update_weights()
    return self

def update_weights(self):
    Y_pred = self.predict(self.X)
  
    dW = np.zeros(self.n)
    for j in range(self.n):
        if self.W[j] > 0:
            dW[j] = (-2 * (self.X[:, j]).dot(self.Y - Y_pred) +
                     self.l1_penalty) / self.m
        else:
            dW[j] = (-2 * (self.X[:, j]).dot(self.Y - Y_pred) -
                     self.l1_penalty) / self.m

    db = -2 * np.sum(self.Y - Y_pred) / self.m

    self.W = self.W - self.learning_rate * dW
    self.b = self.b - self.learning_rate * db
    return self

def predict(self, X):
    return X.dot(self.W) + self.b

`

3. Training the model

def main(): df = pd.read_csv("Experience-Salary.csv") X = df.iloc[:, :-1].values Y = df.iloc[:, 1].values

scaler = StandardScaler()
X = scaler.fit_transform(X)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=1/3, random_state=0)

model = LassoRegression(
    iterations=1000, learning_rate=0.01, l1_penalty=500)
model.fit(X_train, Y_train)

Y_pred = model.predict(X_test)

print("Predicted values: ", np.round(Y_pred[:3], 2))
print("Real values:      ", Y_test[:3])
print("Trained W:        ", round(model.W[0], 2))
print("Trained b:        ", round(model.b, 2))

plt.scatter(X_test, Y_test, color='blue', label='Actual Data')
plt.plot(X_test, Y_pred, color='orange', label='Lasso Regression Line')
plt.title('Salary vs Experience (Lasso Regression)')
plt.xlabel('Years of Experience (Standardized)')
plt.ylabel('Salary')
plt.legend()
plt.show()

if name == "main": main()

`

**Output:

Screenshot-2025-03-26-161948

Trained Model

This output shows that the Lasso Regression model is successfully fitting the data with a clear linear relationship and it is capable of predicting salaries based on years of experience. The visualization and trained coefficients give insights into how well the model learned from the data. The close match between predicted and real values also shows the model’s ability to capture the underlying salary patterns effectively.

You can download source code from here.