http://mlwiki.org/index.php?title=OLS_Regression&feed=atom&action=history
OLS Regression - Revision history
2024-03-29T09:04:35Z
Revision history for this page on the wiki
MediaWiki 1.25.3
http://mlwiki.org/index.php?title=OLS_Regression&diff=796&oldid=prev
Alexey: Alexey moved page Ordinary Least Squares to OLS Regression over redirect
2017-06-27T12:16:42Z
<p>Alexey moved page <a href="/index.php/Ordinary_Least_Squares" class="mw-redirect" title="Ordinary Least Squares">Ordinary Least Squares</a> to <a href="/index.php/OLS_Regression" title="OLS Regression">OLS Regression</a> over redirect</p>
<table class='diff diff-contentalign-left'>
<tr style='vertical-align: top;'>
<td colspan='1' style="background-color: white; color:black; text-align: center;">← Older revision</td>
<td colspan='1' style="background-color: white; color:black; text-align: center;">Revision as of 12:16, 27 June 2017</td>
</tr><tr><td colspan='2' style='text-align: center;'><div class="mw-diff-empty">(No difference)</div>
</td></tr></table>
Alexey
http://mlwiki.org/index.php?title=OLS_Regression&diff=794&oldid=prev
Alexey at 12:06, 27 June 2017
2017-06-27T12:06:30Z
<p></p>
<table class='diff diff-contentalign-left'>
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr style='vertical-align: top;'>
<td colspan='2' style="background-color: white; color:black; text-align: center;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black; text-align: center;">Revision as of 12:06, 27 June 2017</td>
</tr><tr><td colspan="2" class="diff-lineno" id="L11" >Line 11:</td>
<td colspan="2" class="diff-lineno">Line 11:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* $n$ features, $\mathbf x_i = \big[x_{i1}, \ ... \ , x_{in} \big]^T \in \mathbb{R}^n$</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* $n$ features, $\mathbf x_i = \big[x_{i1}, \ ... \ , x_{in} \big]^T \in \mathbb{R}^n$</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* We can put all such $\mathbf x_i$ as rows of a matrix $X$ (sometimes called a ''design matrix'')</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* We can put all such $\mathbf x_i$ as rows of a matrix $X$ (sometimes called a ''design matrix'')</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>* <del class="diffchange diffchange-inline">$</del>X = \begin{bmatrix}</div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>* <ins class="diffchange diffchange-inline"><math></ins>X = \begin{bmatrix}</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_1^T - \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_1^T - \\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>   \vdots  \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>   \vdots  \\  </div></td></tr>
<tr><td colspan="2" class="diff-lineno" id="L19" >Line 19:</td>
<td colspan="2" class="diff-lineno">Line 19:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>  &  \ddots &  \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>  &  \ddots &  \\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>x_{m1} & \cdots & x_{mn}  \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>x_{m1} & \cdots & x_{mn}  \\  </div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<del class="diffchange diffchange-inline">$</del></div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<ins class="diffchange diffchange-inline"></math></ins></div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>* the observed values: <del class="diffchange diffchange-inline">$</del>\mathbf y = \begin{bmatrix}</div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>* the observed values: <ins class="diffchange diffchange-inline"><math></ins>\mathbf y = \begin{bmatrix}</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>y_1 \\ \vdots \\ y_m</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>y_1 \\ \vdots \\ y_m</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix} \in \mathbb{R}^{m}<del class="diffchange diffchange-inline">$</del></div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix} \in \mathbb{R}^{m}<ins class="diffchange diffchange-inline"></math></ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* Thus, we expressed our problem in the matrix form: $X \mathbf w = \mathbf y$</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* Thus, we expressed our problem in the matrix form: $X \mathbf w = \mathbf y$</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* Note that there's usually additional feature $x_{i0} = 1$ - the slope,  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* Note that there's usually additional feature $x_{i0} = 1$ - the slope,  </div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>** so $\mathbf x_i \in \mathbb{R}^{n+1}$ and <del class="diffchange diffchange-inline">$</del>X = \begin{bmatrix}</div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>** so $\mathbf x_i \in \mathbb{R}^{n+1}$ and <ins class="diffchange diffchange-inline"><math></ins>X = \begin{bmatrix}</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_1^T - \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_1^T - \\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_2^T - \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>- \ \mathbf x_2^T - \\  </div></td></tr>
<tr><td colspan="2" class="diff-lineno" id="L35" >Line 35:</td>
<td colspan="2" class="diff-lineno">Line 35:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>  & &  \ddots &  \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>  & &  \ddots &  \\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>x_{m0} & x_{m1} & \cdots & x_{mn}  \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>x_{m0} & x_{m1} & \cdots & x_{mn}  \\  </div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix} \in \mathbb R^{m \times n + 1}<del class="diffchange diffchange-inline">$</del></div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix} \in \mathbb R^{m \times n + 1}<ins class="diffchange diffchange-inline"></math></ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"></td></tr>
<tr><td colspan="2" class="diff-lineno" id="L77" >Line 77:</td>
<td colspan="2" class="diff-lineno">Line 77:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>Suppose we have the following dataset:  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>Suppose we have the following dataset:  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* ${\cal D} = \{ (1,1), (2,2), (3,2) \}$</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* ${\cal D} = \{ (1,1), (2,2), (3,2) \}$</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>* the matrix form is <del class="diffchange diffchange-inline">$</del>\begin{bmatrix}</div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>* the matrix form is <ins class="diffchange diffchange-inline"><math></ins>\begin{bmatrix}</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 1\\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 1\\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 2\\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 2\\  </div></td></tr>
<tr><td colspan="2" class="diff-lineno" id="L87" >Line 87:</td>
<td colspan="2" class="diff-lineno">Line 87:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>\begin{bmatrix}  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>\begin{bmatrix}  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 \\ 2 \\ 2</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 \\ 2 \\ 2</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<del class="diffchange diffchange-inline">$</del></div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<ins class="diffchange diffchange-inline"></math></ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* no line goes through these points at once</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* no line goes through these points at once</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* so we solve $X^T X \mathbf{\hat w} = X^T \mathbf y$  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* so we solve $X^T X \mathbf{\hat w} = X^T \mathbf y$  </div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>* <del class="diffchange diffchange-inline">$</del>\begin{bmatrix}</div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>* <ins class="diffchange diffchange-inline"><math></ins>\begin{bmatrix}</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 1 & 1 \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 1 & 1 \\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 2 & 3 \\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>1 & 2 & 3 \\  </div></td></tr>
<tr><td colspan="2" class="diff-lineno" id="L100" >Line 100:</td>
<td colspan="2" class="diff-lineno">Line 100:</td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>3 & 6\\  </div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>3 & 6\\  </div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>6 & 14\\</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>6 & 14\\</div></td></tr>
<tr><td class='diff-marker'>−</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<del class="diffchange diffchange-inline">$</del></div></td><td class='diff-marker'>+</td><td style="color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>\end{bmatrix}<ins class="diffchange diffchange-inline"></math></ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* this system is invertible, so we solve it and get $\hat w_0 = 2/3, \hat w_1 = 1/2$</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* this system is invertible, so we solve it and get $\hat w_0 = 2/3, \hat w_1 = 1/2$</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* thus the best line is $h(t) = w_0 + w_1 t = 2/3 + 1/2 t$</div></td><td class='diff-marker'> </td><td style="background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;"><div>* thus the best line is $h(t) = w_0 + w_1 t = 2/3 + 1/2 t$</div></td></tr>
</table>
Alexey
http://mlwiki.org/index.php?title=OLS_Regression&diff=792&oldid=prev
Alexey: Alexey moved page OLS Regression to Ordinary Least Squares
2017-06-27T12:03:43Z
<p>Alexey moved page <a href="/index.php/OLS_Regression" title="OLS Regression">OLS Regression</a> to <a href="/index.php/Ordinary_Least_Squares" class="mw-redirect" title="Ordinary Least Squares">Ordinary Least Squares</a></p>
<table class='diff diff-contentalign-left'>
<tr style='vertical-align: top;'>
<td colspan='1' style="background-color: white; color:black; text-align: center;">← Older revision</td>
<td colspan='1' style="background-color: white; color:black; text-align: center;">Revision as of 12:03, 27 June 2017</td>
</tr><tr><td colspan='2' style='text-align: center;'><div class="mw-diff-empty">(No difference)</div>
</td></tr></table>
Alexey
http://mlwiki.org/index.php?title=OLS_Regression&diff=559&oldid=prev
Alexey at 19:31, 25 April 2015
2015-04-25T19:31:55Z
<p></p>
<p><b>New page</b></p><div>== Ordinary Least Squares Regression ==<br />
* This is a technique for computing coefficients for [[Multivariate Linear Regression]].<br />
* the solution is obtained via minimizing the squared error, therefore it's called ''Linear Least Squares''<br />
* there are two ways to compute the solution: the [[Normal Equation]] and [[Gradient Descent]]<br />
* this is the typical way of solving [[Multivariate Linear Regression]], therefore it's often called '''OLS Regression'''<br />
<br />
<br />
== Regression Problem ==<br />
Suppose we have<br />
* $m$ training examples $(\mathbf x_i, y_i)$<br />
* $n$ features, $\mathbf x_i = \big[x_{i1}, \ ... \ , x_{in} \big]^T \in \mathbb{R}^n$<br />
* We can put all such $\mathbf x_i$ as rows of a matrix $X$ (sometimes called a ''design matrix'')<br />
* $X = \begin{bmatrix}<br />
- \ \mathbf x_1^T - \\ <br />
\vdots \\ <br />
- \ \mathbf x_m^T - \\ <br />
\end{bmatrix} = \begin{bmatrix}<br />
x_{11} & \cdots & x_{1n} \\ <br />
& \ddots & \\ <br />
x_{m1} & \cdots & x_{mn} \\ <br />
\end{bmatrix}$<br />
* the observed values: $\mathbf y = \begin{bmatrix}<br />
y_1 \\ \vdots \\ y_m<br />
\end{bmatrix} \in \mathbb{R}^{m}$<br />
* Thus, we expressed our problem in the matrix form: $X \mathbf w = \mathbf y$<br />
* Note that there's usually an additional constant feature $x_{i0} = 1$, whose weight $w_0$ is the intercept (bias) term, <br />
** so $\mathbf x_i \in \mathbb{R}^{n+1}$ and $X = \begin{bmatrix}<br />
- \ \mathbf x_1^T - \\ <br />
- \ \mathbf x_2^T - \\ <br />
\vdots \\ <br />
- \ \mathbf x_m^T - \\ <br />
\end{bmatrix} = \begin{bmatrix}<br />
x_{10} & x_{11} & \cdots & x_{1n} \\ <br />
x_{20} & x_{21} & \cdots & x_{2n} \\ <br />
& & \ddots & \\ <br />
x_{m0} & x_{m1} & \cdots & x_{mn} \\ <br />
\end{bmatrix} \in \mathbb R^{m \times (n + 1)}$<br />
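A minimal sketch of building such a matrix with NumPy. The helper name <code>add_constant_column</code> and the sample numbers are assumptions made for illustration; they are not part of the original page.<br />

```python
import numpy as np

def add_constant_column(features):
    """Prepend the constant feature x_{i0} = 1 to an (m, n) feature array."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        # a single feature per example: treat it as an (m, 1) column
        features = features.reshape(-1, 1)
    ones = np.ones((features.shape[0], 1))
    return np.hstack([ones, features])

# hypothetical raw data: m = 3 examples, n = 2 features
raw = np.array([[2.0, 5.0],
                [3.0, 1.0],
                [4.0, 7.0]])
X = add_constant_column(raw)
print(X.shape)  # m rows and n + 1 columns
```

Each row of <code>X</code> is one training example, and the first column is the constant feature.<br />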
<br />
<br />
Thus we have a system <br />
* $X \mathbf w = \mathbf y$<br />
* how do we solve it, and if there's no solution, how do we find the best possible $\mathbf w$?<br />
<br />
<br />
<br />
== Least Squares ==<br />
=== Normal Equation ===<br />
When the system has no exact solution, we try to fit the data as well as possible <br />
* Let $\mathbf w$ be the best fit solution to $X \mathbf w \approx \mathbf y$<br />
* we'll try to minimize the error $\mathbf e = \mathbf y - X \mathbf w$ (also called [[Residual Analysis|residuals]])<br />
* we take the square of this error, so the objective is <br />
* $J(\mathbf w) = \| \mathbf e \|^2 = \| \mathbf y - X \mathbf w \|^2$<br />
<br />
<br />
The solution:<br />
* $\mathbf w = (X^T X)^{-1} X^T \mathbf y = X^+ \mathbf y$ <br />
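* where $X^+ = (X^T X)^{-1} X^T$ is the [[General Inverse|Pseudoinverse]] of $X$ (this formula requires $X^T X$ to be invertible, i.e. the columns of $X$ must be linearly independent)<br />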
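The closed-form solution can be sketched in NumPy as follows. The function name <code>ols_normal_equation</code> and the demo data are assumptions for illustration, and the code assumes the columns of $X$ are linearly independent.<br />

```python
import numpy as np

def ols_normal_equation(X, y):
    """Solve X^T X w = X^T y; assumes X has linearly independent columns."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # solving the linear system avoids forming (X^T X)^{-1} explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)

# hypothetical data generated by y = 1 + 2 x, with a leading column of ones
X_demo = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

w_solve = ols_normal_equation(X_demo, y_demo)
w_pinv = np.linalg.pinv(X_demo) @ y_demo  # the same result via X^+ y
print(w_solve, w_pinv)
```

Solving the system with <code>np.linalg.solve</code> rather than inverting $X^T X$ is a common numerical choice; both routes give the same weights here.<br />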
<br />
<br />
From the [[Linear Algebra]] point of view:<br />
* we need to solve $X \mathbf w = \mathbf y$<br />
* if $\mathbf y \not \in C(X)$ ([[Column Space]]) then there's no solution<br />
* How to solve it approximately? [[Projection onto Subspaces|Project]] $\mathbf y$ onto $C(X)$!<br />
* again, it gives us the [[Normal Equation]]: $X^T X \mathbf w = X^T \mathbf y$<br />
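The projection view can be checked numerically. The sketch below uses the three points from the example on this page; the variable names are assumptions for illustration.<br />

```python
import numpy as np

X_proj = np.array([[1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])
y_proj = np.array([1.0, 2.0, 2.0])

# projection matrix onto the column space: P = X (X^T X)^{-1} X^T
P = X_proj @ np.linalg.solve(X_proj.T @ X_proj, X_proj.T)
p = P @ y_proj      # projection of y onto the column space of X
e = y_proj - p      # residual vector

print(p)
print(X_proj.T @ e)  # numerically zero: the residual is orthogonal to every column
```

The condition $X^T \mathbf e = \mathbf 0$ is the Normal Equation written in terms of the residual.<br />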
<br />
<br />
=== [[Gradient Descent]] ===<br />
Alternatively, we can use Gradient Descent:<br />
* objective is $J(\mathbf w) = \| \mathbf y - X \mathbf w \|^2$<br />
* the derivative w.r.t. $\mathbf w$ is $\cfrac{\partial J(\mathbf w)}{\partial \mathbf w} = 2 X^T X \mathbf w - 2 X^T \mathbf y$<br />
* so the update rule is $\mathbf w \leftarrow \mathbf w - 2 \alpha \, (X^T X \mathbf w - X^T \mathbf y)$<br />
* where $\alpha$ is the learning rate<br />
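A minimal sketch of this update rule in NumPy. The function name, the zero initialisation, the learning rate of 0.01 and the iteration count are assumptions chosen so that the loop converges on this tiny dataset; they are not prescribed by the original page.<br />

```python
import numpy as np

def ols_gradient_descent(X, y, alpha=0.01, n_iters=5000):
    """Minimise ||y - X w||^2 with fixed-step gradient descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = X.T @ X          # X^T X, computed once
    xty = X.T @ y           # X^T y, computed once
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = 2.0 * (gram @ w - xty)   # gradient of the objective
        w = w - alpha * grad            # update rule
    return w

# the three points from the example on this page, with a column of ones
X_gd = np.array([[1.0, 1.0],
                 [1.0, 2.0],
                 [1.0, 3.0]])
y_gd = np.array([1.0, 2.0, 2.0])

w_gd = ols_gradient_descent(X_gd, y_gd)
print(w_gd)  # close to the Normal Equation solution
```

If $\alpha$ is too large the iterates diverge, so in practice the step size has to be tuned to the data.<br />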
<br />
<br />
<br />
== Example ==<br />
Suppose we have the following dataset: <br />
* ${\cal D} = \{ (1,1), (2,2), (3,2) \}$<br />
* the matrix form is $\begin{bmatrix}<br />
1 & 1\\ <br />
1 & 2\\ <br />
1 & 3\\<br />
\end{bmatrix}<br />
\begin{bmatrix}<br />
w_0 \\ w_1<br />
\end{bmatrix} = <br />
\begin{bmatrix} <br />
1 \\ 2 \\ 2<br />
\end{bmatrix}$<br />
* no single line passes through all three points<br />
* so we solve $X^T X \mathbf{\hat w} = X^T \mathbf y$ <br />
* $\begin{bmatrix}<br />
1 & 1 & 1 \\ <br />
1 & 2 & 3 \\ <br />
\end{bmatrix} \begin{bmatrix}<br />
1 & 1\\ <br />
1 & 2\\ <br />
1 & 3\\<br />
\end{bmatrix} = \begin{bmatrix}<br />
3 & 6\\ <br />
6 & 14\\<br />
\end{bmatrix}$<br />
* the matrix $X^T X$ is invertible and $X^T \mathbf y = \begin{bmatrix} 5 \\ 11 \end{bmatrix}$, so we solve the system and get $\hat w_0 = 2/3, \hat w_1 = 1/2$<br />
* thus the best line is $h(t) = \hat w_0 + \hat w_1 t = 2/3 + t/2$<br />
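The numbers in this example can be reproduced with NumPy. This is only a sketch: the variable names are assumptions, and <code>np.linalg.lstsq</code> is used here as a generic least-squares solver rather than as the method of the original page.<br />

```python
import numpy as np

points = [(1, 1), (2, 2), (3, 2)]
t = np.array([p[0] for p in points], dtype=float)
y_ex = np.array([p[1] for p in points], dtype=float)

# design matrix with a leading column of ones
X_ex = np.column_stack([np.ones_like(t), t])

print(X_ex.T @ X_ex)  # equals [[3, 6], [6, 14]]
print(X_ex.T @ y_ex)  # equals [5, 11]

w_hat, residuals, rank, sing_vals = np.linalg.lstsq(X_ex, y_ex, rcond=None)
print(w_hat)  # approximately [2/3, 1/2]
```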
<br />
<br />
http://habrastorage.org/files/ae0/b63/5a2/ae0b635a2e81493bb363d898b0e6369c.png<br />
<br />
<br />
<br />
== Normal Equation vs [[Gradient Descent]] ==<br />
[[Gradient Descent]]:<br />
* need to choose learning rate $\alpha$<br />
* need to do many iterations<br />
* works well with large $n$<br />
<br />
<br />
[[Normal Equation]]:<br />
* don't need to choose $\alpha$<br />
* don't need to iterate - computed in one step<br />
* slow if $n$ is large $(n \geqslant 10^4)$<br />
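* need to compute $(X^T X)^{-1}$, which costs roughly $O(n^3)$ with standard methods<br />
* if $X^T X$ is not invertible (e.g. linearly dependent features, or fewer examples than columns of $X$), the formula cannot be applied directly<br />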
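A sketch of the non-invertible case, using a hypothetical design matrix whose third column is twice the second. Falling back to <code>np.linalg.lstsq</code>, which returns a minimum-norm least-squares solution, is one possible workaround and an assumption of this example, not something the original page prescribes.<br />

```python
import numpy as np

# hypothetical design matrix: the third column is exactly twice the second
X_bad = np.array([[1.0, 1.0, 2.0],
                  [1.0, 2.0, 4.0],
                  [1.0, 3.0, 6.0],
                  [1.0, 4.0, 8.0]])
y_bad = np.array([1.0, 2.0, 2.0, 3.0])

# rank below the number of columns means X^T X is singular
print(np.linalg.matrix_rank(X_bad))

# lstsq still returns a least-squares solution (the minimum-norm one)
w_min_norm, _, rank_bad, _ = np.linalg.lstsq(X_bad, y_bad, rcond=None)
print(rank_bad)

# that solution satisfies the Normal Equation even though X^T X has no inverse
gram = X_bad.T @ X_bad
print(np.allclose(gram @ w_min_norm, X_bad.T @ y_bad))
```

Here the Normal Equation has infinitely many solutions, so a solver has to pick one of them.<br />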
<br />
<br />
== See Also ==<br />
* [[Multivariate Linear Regression]]<br />
* [[Gradient Descent]]<br />
<br />
<br />
== Sources ==<br />
* [[Linear Algebra MIT 18.06 (OCW)]]<br />
* [[Machine Learning (coursera)]]<br />
* [[Seminar Hot Topics in Information Management IMSEM (TUB)]]<br />
* http://en.wikipedia.org/wiki/Linear_least_squares_%28mathematics%29<br />
<br />
<br />
[[Category:Machine Learning]]<br />
[[Category:Regression]]<br />
[[Category:Linear Algebra]]<br />
[[Category:Statistics]]</div>
Alexey