Residual Sum of Squares

Last Updated : 7 Nov, 2024

Residual Sum of Squares is essentially the sum of the squared differences between the actual values of the dependent variable and the values predicted by the model. This metric provides a numerical representation of how well the model fits the data, with smaller values indicating a better fit and larger values suggesting a poorer fit.

For example, suppose we predict retail store sales from advertising spend using a linear regression model. To assess the model's fit, we calculate the Residual Sum of Squares (RSS) by summing the squared differences between actual and predicted sales.

Figure: Residual Sum of Squares

The scatter plot on the right displays the residuals, which are the differences between actual sales and predicted sales, plotted against advertising spend.

Ideally, we want these residuals to be randomly scattered around the horizontal zero line. If they are, it indicates that our model fits the data well.

However, in this case, we can see some patterns in the residuals, which suggests that our model may not be capturing all the underlying relationships in the data. This could mean we need a more complex model to better understand the relationship between advertising and sales.

Types of Sum of Squares

In regression analysis, RSS is one of the three main types of sum of squares, alongside the Total Sum of Squares (TSS) and the Sum of Squares due to Regression (SSR) or Explained Sum of Squares (ESS).

How to Calculate the Residual Sum of Squares?

Residual Sum of Squares (RSS) can be calculated using the following formula:

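RSS = \sum_{i=1}^{n} (y_i - f(x_i))^2

Where y_i is the actual (observed) value of the dependent variable, f(x_i) is the value predicted by the model, and n is the number of observations.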
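As a minimal sketch of this formula, the snippet below computes RSS from a list of actual values and a list of model predictions. The function name `residual_sum_of_squares` and the sample numbers are illustrative assumptions; they are not taken from the retail example above.

```python
def residual_sum_of_squares(actual, predicted):
    """Return the sum of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))


# Hypothetical observed values and model predictions (made up for illustration).
actual = [10.0, 12.0, 15.0, 19.0]
predicted = [11.0, 12.0, 14.0, 20.0]

rss = residual_sum_of_squares(actual, predicted)
print(rss)  # 3.0
```

The residuals here are -1, 0, 1 and -1, so the squared terms add up to 3. A model whose predictions matched every observation would give an RSS of 0.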

Regression Sum of Squares (SSR)

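The regression sum of squares measures how much of the variation in the dependent variable is explained by the model, that is, how far the predicted values lie from the mean of the observed values.

Consider a set X with n observations. The sum of squares S for this set can be calculated using the formula below:

S = \sum_{i=1}^{n} (X_i - \bar{X})^2

Where X_i is the i-th observation, \bar{X} is the mean of the set, and n is the number of observations.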

Total Sum of Squares (TSS)

The total sum of squares measures the total variation of the dependent variable around its mean. For a linear model fitted by least squares with an intercept, it equals the sum of the regression sum of squares and the residual sum of squares:

**TSS = RSS + SSR**

Here RSS is the residual sum of squares and SSR is the regression sum of squares.
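The identity can be checked numerically. The sketch below assumes a straight-line model fitted by ordinary least squares with an intercept, which is the setting in which the decomposition holds exactly. The advertising and sales figures are made up, and all variable names are illustrative.

```python
import math

# Hypothetical advertising spend (x) and sales (y); values are illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Closed-form least-squares estimates for y = slope * x + intercept.
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    / sum((xi - x_mean) ** 2 for xi in x)
)
intercept = y_mean - slope * x_mean
y_hat = [slope * xi + intercept for xi in x]

rss = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))  # unexplained variation
ssr = sum((pi - y_mean) ** 2 for pi in y_hat)          # variation explained by the model
tss = sum((yi - y_mean) ** 2 for yi in y)              # total variation around the mean

print(math.isclose(tss, rss + ssr))  # True
```

The comparison uses `math.isclose` rather than `==` because floating-point rounding can leave a tiny difference between the two sides.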

How to Calculate Sum of Squares?

The following sections give the steps for calculating the residual sum of squares and the sum of squares due to regression.

How to Calculate Residual Sum of Squares?

To calculate the residual sum of squares, we can use the following steps:

**Step 1:** Organize the data and find the predicted value ŷi for each observation.

**Step 2:** Calculate the residual for each observation, i.e., yi - ŷi.

**Step 3:** Use the following formula, where f(x_i) is the predicted value ŷi, to calculate the Residual Sum of Squares.

RSS = \sum_{i=1}^{n} (y_i - f(x_i))^2

**Step 4:** The result is the required value of the Residual Sum of Squares.
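As a small worked illustration of these steps, assume three observations with actual values y = (2, 4, 5) and predicted values (2.5, 3.5, 5.5). These numbers are hypothetical and chosen only to keep the arithmetic short.

RSS = (2-2.5)^2+(4-3.5)^2+(5-5.5)^2

RSS = 0.25+0.25+0.25

RSS = 0.75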

How to Calculate Sum of Squares Due to Regression?

To calculate the sum of squares due to regression we can use the following steps:
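A minimal sketch of this calculation is given below. It assumes the usual definition of the regression sum of squares as the sum of squared differences between each predicted value and the mean of the observed values. The function name `regression_sum_of_squares`, the step comments, and the data are hypothetical.

```python
def regression_sum_of_squares(actual, predicted):
    """Sum of squared differences between predictions and the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)            # Step 1: mean of the observed values
    deviations = [p - mean_actual for p in predicted]  # Step 2: predicted value minus mean
    return sum(d ** 2 for d in deviations)             # Step 3: square each deviation and add


# Hypothetical observed values and model predictions (illustrative only).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [3.5, 5.0, 7.0, 8.5]

ssr = regression_sum_of_squares(actual, predicted)
print(ssr)  # 14.5
```

The mean of the observed values is 6, so the deviations are -2.5, -1, 1 and 2.5, and their squares add up to 14.5.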

Significance and Limitations

**Significance of Sum of Squares**

The sum of squares formula can be used for various purposes and has great significance in real life such as:

**Limitations of Sum of Squares**

The sum of squares has the following limitations:

Solved Examples of Residual Sum of Squares

**Problem 1:** Calculate the sum of squares of the set X = [1,2,3,6] if the mean is found to be 3.

**Solution:**

Given \bar{X} = 3

| X | X - \bar{X} |
|---|---|
| 1 | -2 |
| 2 | -1 |
| 3 | 0 |
| 6 | 3 |

Using S = \sum_{i=1}^{n} (X_i - \bar{X})^2

S = (-2)^2+(-1)^2+0^2+3^2

S = 4+1+0+9

S = 14

Therefore, the sum of squares of the set is 14.

**Problem 2:** Calculate the sum of squares of the set X = [3,6,9,12,15] if the mean is found to be 9.

**Solution:**

Given \bar{X} = 9

| X | X - \bar{X} |
|---|---|
| 3 | -6 |
| 6 | -3 |
| 9 | 0 |
| 12 | 3 |
| 15 | 6 |

Using S = \sum_{i=1}^{n} (X_i - \bar{X})^2

S = (-6)^2+(-3)^2+0^2+3^2+6^2

S = 36+9+0+9+36

S = 90

Therefore, the sum of squares of the set is 90.

**Problem 3:** Calculate the sum of squares of the dataset X = [1,2,3,4,5,6].

**Solution:**

In this case we need to calculate the mean first.

\bar{X} = \frac{1+2+3+4+5+6}{6}

\bar{X} = \frac{21}{6}

\bar{X} = 3.5

| X | X - \bar{X} |
|---|---|
| 1 | -2.5 |
| 2 | -1.5 |
| 3 | -0.5 |
| 4 | 0.5 |
| 5 | 1.5 |
| 6 | 2.5 |

Using S = \sum_{i=1}^{n} (X_i - \bar{X})^2

S = (-2.5)^2+(-1.5)^2+(-0.5)^2+(0.5)^2+(1.5)^2+(2.5)^2

S = 6.25+2.25+0.25+0.25+2.25+6.25

S = 17.50

Therefore, the sum of squares of the set is 17.50.

**Problem 4:** Calculate the sum of squares of the dataset X = [3,4,5,1,7].

**Solution:**

In this case we need to calculate the mean first.

\bar{X} = \frac{3+4+5+1+7}{5}

\bar{X} = \frac{20}{5}

\bar{X} = 4

| X | X - \bar{X} |
|---|---|
| 3 | -1 |
| 4 | 0 |
| 5 | 1 |
| 1 | -3 |
| 7 | 3 |

Using S = \sum_{i=1}^{n} (X_i - \bar{X})^2

S = (-1)^2+(0)^2+(1)^2+(-3)^2+(3)^2

S = 1+0+1+9+9

S = 20

Therefore, the sum of squares of the set is 20.

**Problem 5:** Calculate the sum of squares of the set X = [1,4,6,8] if the mean is found to be 4.75.

**Solution:**

Given \bar{X} = 4.75

| X | X - \bar{X} |
|---|---|
| 1 | -3.75 |
| 4 | -0.75 |
| 6 | 1.25 |
| 8 | 3.25 |

Using S = \sum_{i=1}^{n} (X_i - \bar{X})^2

S = (-3.75)^2+(-0.75)^2+(1.25)^2+(3.25)^2

S = 14.0625+0.5625+1.5625+10.5625

S = 26.75

Therefore, the sum of squares of the set is 26.75.
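The five results above can be reproduced with a few lines of Python. This is only a checking sketch: the function name `sum_of_squares` and the `problems` dictionary are illustrative, and the mean is recomputed from the data instead of being taken as given.

```python
def sum_of_squares(values):
    """Sum of squared deviations of each value from the mean of the set."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Data sets from Problems 1 to 5 above.
problems = {
    1: [1, 2, 3, 6],
    2: [3, 6, 9, 12, 15],
    3: [1, 2, 3, 4, 5, 6],
    4: [3, 4, 5, 1, 7],
    5: [1, 4, 6, 8],
}

results = {number: sum_of_squares(data) for number, data in problems.items()}
for number, s in results.items():
    print(f"Problem {number}: S = {s}")
```

The printed values are 14.0, 90.0, 17.5, 20.0 and 26.75, matching the hand calculations.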