What is Degrees of Freedom?
Great question! Degrees of Freedom (df) is a core concept in statistics, especially in regression, ANOVA, and hypothesis testing.
What is Degrees of Freedom (df)?
Degrees of Freedom represent the number of independent values in a calculation that are free to vary after certain restrictions (like means) have been applied.
In simple terms:
- It's the number of values in a dataset that can change without violating a given constraint.
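To make the idea of a constraint concrete, here is a minimal sketch in Python. The numbers are hypothetical and chosen only for illustration: once the mean is fixed, you can pick n - 1 values freely, but the last one is forced.

```python
# Hypothetical numbers, chosen only for illustration.
n = 5
fixed_mean = 10.0                      # the constraint: the mean must equal 10

free_values = [8.0, 12.0, 9.0, 11.5]   # n - 1 values picked freely

# The constraint sum(values) == n * mean pins down the last value.
last_value = n * fixed_mean - sum(free_values)
values = free_values + [last_value]

degrees_of_freedom = len(free_values)  # n - 1

print(last_value)          # 9.5
print(sum(values) / n)     # 10.0
print(degrees_of_freedom)  # 4
```

Changing any of the four free values simply changes the forced fifth value; the mean constraint still holds.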
Why It Matters
Degrees of freedom are used to:
- Calculate variances and standard errors
- Determine the critical values in t-tests, F-tests, and chi-square tests
- Assess the validity of models (like in regression or ANOVA)
Degrees of Freedom in Common Contexts
1. Sample Variance
When calculating variance:
- Estimating the mean uses up 1 degree of freedom.
- So df = n - 1 (where n is the number of observations).
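As a quick check of the n - 1 rule, the sketch below computes a sample variance by hand and compares it with Python's standard `statistics` module. The data values are hypothetical; `statistics.variance` divides by n - 1, while `statistics.pvariance` divides by n.

```python
import statistics

data = [4.0, 7.0, 6.0, 3.0, 5.0]   # hypothetical observations
n = len(data)
mean = sum(data) / n               # 5.0

# Deviations from the sample mean always sum to zero,
# so only n - 1 of them are free to vary.
deviations = [x - mean for x in data]
sum_sq = sum(d ** 2 for d in deviations)   # 10.0

sample_variance = sum_sq / (n - 1)   # divides by df = n - 1
divide_by_n = sum_sq / n             # divides by n instead

print(sum(deviations))               # 0.0
print(sample_variance)               # 2.5
print(divide_by_n)                   # 2.0
print(statistics.variance(data))     # 2.5
print(statistics.pvariance(data))    # 2.0
```

Dividing by n - 1 rather than n gives the slightly larger value, which compensates for the fact that the mean was estimated from the same data.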
2. ANOVA
Source | df | Explanation |
---|---|---|
Between Groups (SSB) | k - 1 | k = number of groups |
Within Groups (SSW) | n - k | n = total observations |
Total | n - 1 | One less than total observations |
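The sketch below works through a one-way ANOVA by hand to show where each df enters. The three groups and their values are hypothetical, picked so the arithmetic comes out clean; the variable names are illustrative only.

```python
# Hypothetical data: 3 groups with 3 observations each.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

k = len(groups)                                   # number of groups
all_values = [x for g in groups.values() for x in g]
n = len(all_values)                               # total observations
grand_mean = sum(all_values) / n

ssb = 0.0   # between-group sum of squares
ssw = 0.0   # within-group sum of squares
for values in groups.values():
    group_mean = sum(values) / len(values)
    ssb += len(values) * (group_mean - grand_mean) ** 2
    ssw += sum((x - group_mean) ** 2 for x in values)

df_between = k - 1   # 2
df_within = n - k    # 6
df_total = n - 1     # 8

msb = ssb / df_between   # mean square between
msw = ssw / df_within    # mean square within
f_stat = msb / msw

print(df_between, df_within, df_total)   # 2 6 8
print(ssb, ssw)                          # 54.0 6.0
print(f_stat)                            # 27.0
```

Note that the between and within df add up to the total df (2 + 6 = 8), just as SSB and SSW add up to the total sum of squares.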
3. Regression Analysis
Component | df | Explanation |
---|---|---|
Regression (SSR) | k | k = number of independent variables |
Residual (SSE) | n - k - 1 | n = total data points |
Total (SST) | n - 1 | Always total observations minus 1 |
Conceptual Example (Finance)
Suppose you are analyzing sales data (Y) based on advertising (X) across 6 companies.
- You estimate 1 intercept + 1 slope → 2 estimated values.
- You have 6 data points (n = 6).
Then:
- Total df = n - 1 = 6 - 1 = 5
- Regression df = number of predictors = 1
- Residual df = n - k - 1 = 6 - 1 - 1 = 4
These df are used to compute the Mean Squares in the ANOVA table and to evaluate the F-statistic.
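Here is a rough sketch of that calculation for a simple regression with n = 6. The advertising and sales figures are made up for illustration, and the fit uses the textbook least-squares formulas rather than any particular library.

```python
# Hypothetical data for 6 companies: advertising (x) and sales (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0]

n = len(x)   # 6 data points
k = 1        # one predictor (advertising)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Ordinary least squares for a single predictor.
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_mean - slope * x_mean
fitted = [intercept + slope * xi for xi in x]

# Sums of squares: total, regression, residual.
sst = sum((yi - y_mean) ** 2 for yi in y)
ssr = sum((fi - y_mean) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

df_total = n - 1          # 5
df_regression = k         # 1
df_residual = n - k - 1   # 4

# Mean squares divide each sum of squares by its df.
msr = ssr / df_regression
mse = sse / df_residual
f_stat = msr / mse

print(df_total, df_regression, df_residual)   # 5 1 4
print(f_stat)
```

The resulting F-statistic would be compared against an F distribution with (1, 4) degrees of freedom, which is where the regression and residual df are used again.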
Summary
Type | Formula | What It Represents |
---|---|---|
Sample Variance | n - 1 | One value (mean) used, others free to vary |
ANOVA (Between) | k - 1 | Variation across groups |
ANOVA (Within) | n - k | Variation within groups |
Regression | k | Predictors used |
Residual (Error) | n - k - 1 | What's left after accounting for predictors |
Think of degrees of freedom as the "budget of flexibility" you have in estimating parameters from data — every time you estimate a parameter (like a mean or regression coefficient), you "spend" one degree of freedom.