Method of Least Squares
- Suppose we want to fit a regression line to our data
- We want to find the slope and intercept parameters
- And we want to find the best fit
Optimization
- To do that we minimize the sum of squares for differences:
- $\text{ss} = \sum_{i = 1}^{n} (y_i - b_0 - b_1 x_i)^2 $
- We want to make $\text{ss}$ as small as possible
We need to use calculus to do that
- we find partial derivatives to find the critical (minimal in our case) value
- $\cfrac{\partial \text{ss}}{\partial b_0} = \cfrac{\partial \text{ss}}{\partial b_1} = 0$
So after calculating that we get:
- $b_0 = \cfrac{1}{n} (\sum y_i - b_0 \sum x_i)$ for intercept
- $b_1 = \cfrac{n \sum x_i y_i - \sum x_i \sum y_i }{n \sum x_i^2 - (\sum x_i)^2}$ for slope
These values minimize the sum of square differences
Sources