
Sum of Squared Errors (SSE)


By Shubham Kumar

When a regression model is estimated, it generates predicted values for the dependent variable. In theory, those predictions should follow the data. In practice, they rarely match every observation perfectly.

Some predicted values fall slightly above the actual data points. Others fall below them. The difference between the observed value and the predicted value is known as a residual.

One residual by itself does not say much. It only describes the error for a single observation. But when a dataset contains many observations, these small prediction gaps appear again and again. To understand how large these errors are overall, statisticians look at something called the Sum of Squared Errors, or SSE.
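Written out, with observed values denoted y and the model's predictions denoted ŷ, the quantity described above is:

```latex
\mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

Each term in the sum is one observation's residual, squared; adding the n terms gives the total.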


Why the Errors Are Squared

Residuals can be positive or negative. If the model predicts a value that is too high, the residual becomes negative. If the prediction is too low, the residual is positive.

If we simply added these errors together, many of them would cancel out. The final total might appear small even if the model is making large mistakes.

To prevent that from happening, each residual is squared before being added. Squaring removes the negative sign and ensures that larger errors carry more weight in the calculation.

Once every residual is squared and added together, the result is the Sum of Squared Errors.
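The calculation is simple enough to do by hand. The short sketch below computes SSE for a toy dataset; the observed and predicted values are illustrative, not taken from any real model.

```python
# Toy example: compute SSE step by step.
# "actual" and "predicted" are made-up illustrative numbers.
actual = [3.0, 5.0, 7.0, 9.0]      # observed values of the dependent variable
predicted = [2.5, 5.5, 6.5, 9.5]   # values the regression model predicts

# Residual = observed minus predicted, one per observation
residuals = [a - p for a, p in zip(actual, predicted)]

# Square each residual (removing the sign), then add them up
sse = sum(r ** 2 for r in residuals)

print(residuals)  # [0.5, -0.5, 0.5, -0.5]
print(sse)        # 1.0
```

Note that the raw residuals here sum to zero even though every prediction misses by 0.5, which is exactly the cancellation problem that squaring avoids.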


What SSE Tells Us

SSE reflects how far the regression model’s predictions are from the observed data.

If SSE is large, the predicted values are often far from the actual observations. In that situation, the model may not be describing the relationship in the data particularly well.

If SSE is smaller, the predicted values tend to lie closer to the data points. The regression line is doing a better job of tracking the pattern in the dataset.


How OLS Uses SSE

Most regression models in finance and econometrics are estimated using the ordinary least squares (OLS) method.

The central idea behind OLS is straightforward. Among all possible regression lines, the chosen one is the line that minimises the Sum of Squared Errors.

In other words, the method searches for the set of coefficients that produces the smallest total squared prediction error.
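For a regression with one explanatory variable, the minimising coefficients have a well-known closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the sample means. The sketch below (with illustrative data) computes these coefficients and checks that nudging either one only increases the SSE.

```python
# Sketch: closed-form OLS for one explanatory variable.
# The data points below are illustrative.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# OLS slope: sum of cross-deviations over sum of squared x-deviations
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x

def sse(b0, b1):
    """Total squared prediction error for the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

ols_sse = sse(intercept, slope)

# Any other line has a larger SSE than the OLS line:
assert ols_sse < sse(intercept + 0.1, slope)
assert ols_sse < sse(intercept, slope + 0.1)
```

The two assertions at the end illustrate the defining property of OLS: perturbing either coefficient away from its least-squares value makes the total squared error worse.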


Final Thought

The Sum of Squared Errors measures the total size of the prediction mistakes produced by a regression model. By squaring the residuals and adding them together, SSE captures how far the model’s estimates are from the actual data points. Regression techniques such as OLS rely on this measure to determine the line that best fits the dataset.
