Linear Regression

Linear Regression - main article about univariate linear regression

Multiple Features

  • suppose we have several features
  • and we want to use them all to predict $y$


For example, we want to predict a house's price

  • $y$ - price (dependent)
  • $x_1$ - # of bedrooms
  • $x_2$ - # of floors
  • $x_3$ - age
  • $x_4$ - size


let's use the following notation

  • $n$ - number of features
  • $m$ - number of examples
  • $x^{(i)}$ - the vector of all features of the $i$th training example, $x^{(i)} \in \mathbb{R}^n$
  • $x_j^{(i)}$ - $j$th element of $i$th training example

e.g.

$x^{(2)} = (x_1^{(2)}, x_2^{(2)}, x_3^{(2)}, x_4^{(2)})$ - vector of all features from the second row
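To make the indexing concrete, here is a minimal NumPy sketch of the house-price example (the feature values are made up for illustration):

```python
import numpy as np

# m = 3 training examples, n = 4 features: bedrooms, floors, age, size (made-up values)
X = np.array([
    [3, 1, 20, 1200],   # x^(1)
    [4, 2,  5, 2100],   # x^(2)
    [2, 1, 35,  850],   # x^(3)
], dtype=float)

m, n = X.shape        # m = 3 examples, n = 4 features
x_2 = X[1]            # x^(2): all features of the 2nd training example
age_2 = X[1, 2]       # x_3^(2): 3rd feature (age) of the 2nd example -> 5.0
```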


  • with a single feature we had
$h_{\theta}(x) = \theta_0 + \theta_1 x$
  • now we have
$h_{\theta}(x) = \theta_0 + \theta_1 x_1 + ... + \theta_n x_n$


let $x_0 = 1$ (i.e. all $x_0^{(i)} = 1$) - the so-called zeroth feature, always equal to 1 - so $\theta_0$ acts as the intercept term


So now we can view $x^{(i)}$ as an $(n+1)$-dimensional vector: $x^{(i)} \in \mathbb{R}^{n + 1}$, $x = \left[ \begin{matrix} x_0 \\ \vdots \\ x_n \end{matrix} \right]$ and $\theta = \left[ \begin{matrix} \theta_0 \\ \vdots \\ \theta_n \end{matrix} \right] \in \mathbb{R}^{n+1}$

And $h_{\theta}(x) = \theta_0 x_0 + \theta_1 x_1 + ... + \theta_n x_n = \theta^{T} x$ (which is $[\theta_0 ... \theta_n] \cdot \left[ \begin{matrix} x_0 \\ \vdots \\ x_n \end{matrix} \right]$ )

This is called multivariate linear regression
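A minimal NumPy sketch of the hypothesis $h_{\theta}(x) = \theta^T x$ with the zeroth feature $x_0 = 1$ prepended (the values of $\theta$ and $x$ are arbitrary, just for illustration):

```python
import numpy as np

theta = np.array([50.0, 0.5, 10.0, -1.0, 0.1])   # theta_0 ... theta_4 (n = 4), made-up values
x = np.array([3.0, 1.0, 20.0, 1200.0])           # one example: bedrooms, floors, age, size

x = np.insert(x, 0, 1.0)    # prepend the zeroth feature x_0 = 1, so x is in R^(n+1)
h = theta @ x               # h_theta(x) = theta^T x
```

For a whole design matrix `X` of shape m x (n+1) (with the $x_0$ column already added), the predictions for all examples are simply `X @ theta`.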


Polynomial Regression

  • Suppose we want to fit not just features, but their combinations
  • For example, we have two features, height and width, and we want to combine them into a single feature so that one parameter $\theta_1$ fits their product
    • So we write:
    • $h(x) = \theta_0 + \theta_1 x = \theta_0 + \theta_1 \cdot \text{height} \cdot \text{width}$
    • ($x = \text{height} \cdot \text{width}$)

Next, suppose we see the following relationship in the data

regression-poly.png
  • we may try to use
  • $\theta_0 + \theta_1 x + \theta_2 x^2 $
  • or even
  • $\theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 $
  • So we have 3 features instead of one: $x$, $x^2$ and $x^3$!
  • Don't forget to normalize them - it's important because all these features have different scales
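A minimal sketch of how these polynomial features could be built and normalized (mean normalization is just one common choice; the data is made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # original single feature (made-up data)

# build the three features x, x^2, x^3 as columns of a design matrix
X_poly = np.column_stack([x, x**2, x**3])

# normalize each column: subtract the mean, divide by the standard deviation
X_norm = (X_poly - X_poly.mean(axis=0)) / X_poly.std(axis=0)
```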


Computing Coefficients

Gradient Descent for Multivariate Linear Regression

Main Article: Gradient Descent#Multivariate Linear Regression
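For reference, a minimal sketch of batch gradient descent, assuming the usual squared-error cost $J(\theta) = \frac{1}{2m} \sum_i (h_{\theta}(x^{(i)}) - y^{(i)})^2$; the learning rate and iteration count are illustrative values, not recommendations:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, iterations=1000):
    """Batch gradient descent for multivariate linear regression.

    X is the m x (n+1) design matrix with the x_0 = 1 column already added,
    y is the m-vector of targets. alpha and iterations are illustrative values.
    """
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(iterations):
        error = X @ theta - y                  # h_theta(x^(i)) - y^(i) for all i
        theta -= alpha / m * (X.T @ error)     # simultaneous update of all theta_j
    return theta
```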

Normal Equation

Main Article: Normal Equation

This is another way of computing coefficients for multivariate regression
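The closed-form solution is $\theta = (X^T X)^{-1} X^T y$; a minimal NumPy sketch (using a pseudo-inverse so it also works when $X^T X$ is singular):

```python
import numpy as np

def normal_equation(X, y):
    """theta = (X^T X)^(-1) X^T y, where X is the design matrix with the x_0 = 1 column."""
    return np.linalg.pinv(X.T @ X) @ (X.T @ y)
```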


Linear Regression Assumptions

