Explain sum of squares.

 Certainly! The Sum of Squares (SS) is a fundamental concept in statistics, especially in regression and ANOVA. It measures variability—that is, how much values differ from a mean or predicted value.


🔍 What Is Sum of Squares?

Sum of Squares quantifies the total variation in a dataset by summing the squared deviations from a reference value (usually the mean or predicted value).

The general form is:

$$\text{SS} = \sum (X_i - \text{Reference Value})^2$$

Why squared?

  • To avoid cancellation of positive and negative deviations

  • To give more weight to larger deviations
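Both points can be seen in a tiny sketch. The three values below are made up for illustration and are not part of the example later in this post.

```python
# Illustrative values only (hypothetical, not from the post's example).
values = [2.0, 4.0, 9.0]
mean = sum(values) / len(values)  # 5.0

# Raw deviations from the mean cancel out when summed.
deviations = [x - mean for x in values]  # [-3.0, -1.0, 4.0]
sum_of_deviations = sum(deviations)

# Squared deviations do not cancel, and the largest deviation dominates.
squared = [d ** 2 for d in deviations]   # [9.0, 1.0, 16.0]
sum_of_squares = sum(squared)

print(sum_of_deviations)  # 0.0
print(sum_of_squares)     # 26.0
```

The deviation of 4 contributes 16 of the total 26, which is what "more weight to larger deviations" means in practice.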


🎯 Types of Sum of Squares in ANOVA/Regression

1. Total Sum of Squares (SST)

  • Measures the total variation in the dependent variable $Y$

  • Compares each actual value to the grand mean $\bar{Y}$

$$SST = \sum (Y_i - \bar{Y})^2$$

2. Explained Sum of Squares (SSR) — also called Regression SS or Between-Group SS

  • Measures the variation explained by the model

  • Compares the predicted value $\hat{Y}_i$ to the mean $\bar{Y}$

$$SSR = \sum (\hat{Y}_i - \bar{Y})^2$$

3. Residual Sum of Squares (SSE) — also called Error SS or Within-Group SS

  • Measures the unexplained variation — how far actual values deviate from predicted values

$$SSE = \sum (Y_i - \hat{Y}_i)^2$$
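The "Between-Group" and "Within-Group" names come from ANOVA, where each observation's predicted value is simply its own group mean. Here is a minimal sketch of that reading. The three groups and their values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical groups for illustration; not data from this post.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [v for vals in groups.values() for v in vals]
grand_mean = sum(all_values) / len(all_values)  # 5.0

def mean(vals):
    return sum(vals) / len(vals)

# Total SS: every observation against the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

# Between-group SS: each group mean against the grand mean,
# counted once per observation in that group.
ss_between = sum(len(vals) * (mean(vals) - grand_mean) ** 2
                 for vals in groups.values())

# Within-group SS: every observation against its own group mean.
ss_within = sum((v - mean(vals)) ** 2
                for vals in groups.values() for v in vals)

print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

In this toy data most of the variation sits between the groups (54 of 60), with little spread inside each group.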

 The Relationship (for a least-squares fit that includes an intercept):

$$SST = SSR + SSE$$

| Term | Name | What it Measures |
|------|------|------------------|
| SST | Total SS | Total variability in Y |
| SSR | Regression SS | Variability explained by model |
| SSE | Error SS | Variability not explained by model |
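Why the three terms add up can be sketched in a few lines. This assumes the predictions come from an ordinary least-squares fit that includes an intercept; for arbitrary predictions the cross term below need not vanish.

```latex
\begin{align*}
Y_i - \bar{Y} &= (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \\
\sum (Y_i - \bar{Y})^2
  &= \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2
     + 2 \sum (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) \\
  &= SSR + SSE + 0
\end{align*}
```

The cross term is zero because least-squares residuals $Y_i - \hat{Y}_i$ sum to zero and are uncorrelated with the fitted values $\hat{Y}_i$.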

📊 Finance Example:

Let’s say you are studying how advertising spend (X) affects sales (Y):

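| Obs | X (₹'000) | Y (Sales ₹'000) | Predicted Y (Ŷ) |
|-----|-----------|-----------------|-----------------|
| 1 | 20 | 220 | 225 |
| 2 | 30 | 270 | 260 |
| 3 | 40 | 310 | 295 |

  • $\bar{Y} = 266.67$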

Now compute:

  • SST = $\sum (Y - \bar{Y})^2$

  • SSR = $\sum (\hat{Y} - \bar{Y})^2$

  • SSE = $\sum (Y - \hat{Y})^2$

These tell us:

  • How much variation exists in sales (SST),

  • How much the model explains (SSR),

  • How much is left as error (SSE).
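The three quantities for the advertising table can be worked out with a short script. This is a sketch in plain Python. The helper name `sums_of_squares` and the closed-form least-squares fit in the second half are additions for illustration, not something the example itself specifies.

```python
# Data from the advertising example above.
x = [20.0, 30.0, 40.0]          # advertising spend (₹'000)
y = [220.0, 270.0, 310.0]       # actual sales (₹'000)
y_hat = [225.0, 260.0, 295.0]   # predicted sales from the table

def sums_of_squares(actual, predicted):
    """Return (SST, SSR, SSE) using the mean of the actual values."""
    y_bar = sum(actual) / len(actual)
    sst = sum((yi - y_bar) ** 2 for yi in actual)
    ssr = sum((fi - y_bar) ** 2 for fi in predicted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(actual, predicted))
    return sst, ssr, sse

sst, ssr, sse = sums_of_squares(y, y_hat)
print(f"{sst:.2f} {ssr:.2f} {sse:.2f}")  # 4066.67 2583.33 350.00

# For comparison: the simple least-squares line through the same data
# (closed-form slope and intercept; an illustrative addition).
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
y_ols = [intercept + slope * xi for xi in x]

sst_ols, ssr_ols, sse_ols = sums_of_squares(y, y_ols)
print(f"{sst_ols:.2f} {ssr_ols:.2f} {sse_ols:.2f}")  # 4066.67 4050.00 16.67
```

With the table's predicted values, SSR + SSE comes to about 2933.33 rather than 4066.67, which suggests those predictions are illustrative rather than an exact least-squares fit. With the fitted line (slope 4.5), SSR + SSE matches SST.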


🔁 Summary:

| Sum of Squares | Formula | Interpretation |
|----------------|---------|----------------|
| SST | $\sum (Y - \bar{Y})^2$ | Total variation in outcome |
| SSR | $\sum (\hat{Y} - \bar{Y})^2$ | Explained by regression model |
| SSE | $\sum (Y - \hat{Y})^2$ | Not explained (error/residual) |

Sums of squares are the foundation of R², the F-statistic, and the significance tests used in ANOVA and regression.
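For reference, the usual textbook definitions can be written directly in terms of the three sums of squares. The symbols $n$ (number of observations) and $k$ (number of predictors) are not defined in this post and are introduced here for illustration. The forms below assume a least-squares regression with an intercept.

```latex
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST},
\qquad
F = \frac{SSR / k}{SSE / (n - k - 1)}
```

A large F means the explained variation per predictor is large relative to the unexplained variation per residual degree of freedom.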
