Explain sum of squares.
Certainly! The Sum of Squares (SS) is a fundamental concept in statistics, especially in regression and ANOVA. It measures variability—that is, how much values differ from a mean or predicted value.
🔍 What Is Sum of Squares?
Sum of Squares quantifies the total variation in a dataset by summing the squared deviations from a reference value (usually the mean or predicted value).
The general form is:

SS = Σ (xᵢ − x̄)²

where xᵢ is each observation and x̄ is the reference value (typically the mean).
Why squared?
- To avoid cancellation of positive and negative deviations (see the quick numeric check below)
- To give more weight to larger deviations
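A minimal Python sketch, using invented numbers, makes the cancellation point concrete: raw deviations from the mean always sum to zero, while squared deviations accumulate into a meaningful total.

```python
import numpy as np

# Hypothetical data points (invented for illustration)
x = np.array([2.0, 4.0, 6.0, 8.0])
x_bar = x.mean()                 # mean = 5.0

deviations = x - x_bar           # [-3., -1., 1., 3.]
print(deviations.sum())          # 0.0 (positives and negatives cancel)
print(np.sum(deviations ** 2))   # 20.0 (the sum of squares)
```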
🎯 Types of Sum of Squares in ANOVA/Regression
1. Total Sum of Squares (SST)
- Measures the total variation in the dependent variable Y
- Compares each actual value Yᵢ to the grand mean Ȳ: SST = Σ (Yᵢ − Ȳ)²
2. Explained Sum of Squares (SSR) — also called Regression SS or Between-Group SS
- Measures the variation explained by the model
- Compares each predicted value Ŷᵢ to the grand mean: SSR = Σ (Ŷᵢ − Ȳ)²
3. Residual Sum of Squares (SSE) — also called Error SS or Within-Group SS
- Measures the unexplained variation, i.e. how far actual values deviate from predicted values: SSE = Σ (Yᵢ − Ŷᵢ)²
The Relationship:

SST = SSR + SSE

(total variation splits exactly into explained plus unexplained variation for a least-squares fit)
| Term | Name | What it Measures |
|---|---|---|
| SST | Total SS | Total variability in Y |
| SSR | Regression SS | Variability explained by the model |
| SSE | Error SS | Variability not explained by the model |
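Since SSR and SSE are also called Between-Group and Within-Group SS, the same identity can be checked in an ANOVA setting. Here is a minimal Python sketch with hypothetical group data (all numbers invented for illustration):

```python
import numpy as np

# Hypothetical scores for three groups (all numbers invented)
groups = [np.array([4.0, 6.0]), np.array([8.0, 10.0]), np.array([13.0, 15.0])]
all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Total SS: every value vs the grand mean
ss_total = np.sum((all_values - grand_mean) ** 2)

# Between-group SS: each group mean vs the grand mean, weighted by group size
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group SS: each value vs its own group mean
ss_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)

print(ss_total)                  # 87.33...
print(ss_between + ss_within)    # 87.33... (between + within = total)
```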
📊 Finance Example:
Let’s say you are studying how advertising spend (X) affects sales (Y):
| Obs | X (Ad spend, ₹'000) | Y (Sales, ₹'000) | Predicted Ŷ (₹'000) |
|---|---|---|---|
| 1 | 20 | 220 | 225 |
| 2 | 30 | 270 | 260 |
| 3 | 40 | 310 | 295 |
Now compute, using the grand mean Ȳ = (220 + 270 + 310) / 3 ≈ 266.67:

- SST = Σ (Yᵢ − Ȳ)²
- SSR = Σ (Ŷᵢ − Ȳ)²
- SSE = Σ (Yᵢ − Ŷᵢ)²
These tell us:
- How much variation exists in sales (SST),
- How much the model explains (SSR),
- How much is left as error (SSE), as the sketch below verifies.
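Here is a minimal Python sketch of this computation. Note the Ŷ values in the table above are illustrative round numbers, so the sketch instead fits an actual least-squares line to the same X and Y (via numpy.polyfit), under which the decomposition SST = SSR + SSE holds exactly.

```python
import numpy as np

# Data from the table above (₹'000)
x = np.array([20.0, 30.0, 40.0])     # advertising spend
y = np.array([220.0, 270.0, 310.0])  # sales

# Fit a least-squares line Y = b0 + b1*X (coefficients returned highest degree first)
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

y_bar = y.mean()                     # grand mean of Y ≈ 266.67
sst = np.sum((y - y_bar) ** 2)       # total variation     ≈ 4066.67
ssr = np.sum((y_hat - y_bar) ** 2)   # explained variation ≈ 4050.00
sse = np.sum((y - y_hat) ** 2)       # residual variation  ≈ 16.67

print(sst, ssr + sse)                # equal: SST = SSR + SSE for an OLS fit
```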
🔁 Summary:
| Sum of Squares | Formula | Interpretation |
|---|---|---|
| SST | Σ (Yᵢ − Ȳ)² | Total variation in outcome |
| SSR | Σ (Ŷᵢ − Ȳ)² | Explained by regression model |
| SSE | Σ (Yᵢ − Ŷᵢ)² | Not explained (error/residual) |
The Sum of Squares is the foundation of R², the F-statistic, and significance testing in ANOVA and regression.
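As a final sketch, here is how R² and the F-statistic are built from the sums of squares, reusing the SS values from the fitted line above (the degrees of freedom assume n = 3 observations and k = 1 predictor, as in that example):

```python
# SS values from the least-squares sketch above
sst, ssr, sse = 4066.67, 4050.00, 16.67

n, k = 3, 1                # observations, predictors (assumed from the example)
df_model = k               # model degrees of freedom
df_resid = n - k - 1       # residual degrees of freedom

r_squared = ssr / sst                          # share of variation explained ≈ 0.996
f_stat = (ssr / df_model) / (sse / df_resid)   # explained vs unexplained variance ≈ 243

print(r_squared, f_stat)
```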