Principal Component Analysis

Principal Component Analysis (PCA) is the most commonly used technique for Dimensionality Reduction.

Suppose we want to reduce the data from 2D to 1D:

  • dim-red-intuition.png
  • how do we find the best line to project the data onto?

We want to find the line that minimizes the sum of squared distances between the data points and their projections onto that line. This quantity is called the projection error.
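
As a sketch, this objective can be written as a formula. The notation is an assumption introduced here: there are $m$ data points $x^{(i)} \in \mathbb{R}^n$, the data is already mean-centered so the line passes through the origin, and $u$ is a unit vector along the line.

$$
\min_{\|u\| = 1} \; \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - \left( u^T x^{(i)} \right) u \right\|^2
$$

Here $\left( u^T x^{(i)} \right) u$ is the projection of $x^{(i)}$ onto the line, so each term is the squared distance from a point to its projection.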

Before running PCA it's a good idea to perform Feature Scaling, so that all features have

  • zero mean and
  • comparable ranges of values
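
A minimal sketch of this preprocessing step in Python with NumPy. The function name `scale_features` and the toy matrix are hypothetical, and dividing by the standard deviation is only one common way to get comparable ranges (dividing by max minus min is another).

```python
import numpy as np

def scale_features(X):
    """Mean-normalize and scale each column of X (rows are examples)."""
    mu = X.mean(axis=0)                       # per-feature mean
    sigma = X.std(axis=0)                     # per-feature standard deviation
    sigma = np.where(sigma == 0, 1.0, sigma)  # avoid dividing by zero for constant features
    return (X - mu) / sigma, mu, sigma

# Toy data (assumed for illustration): two features on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
X_scaled, mu, sigma = scale_features(X)
```

Keeping `mu` and `sigma` around lets the same transformation be applied to new data later.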

To reduce from $n$-dim to $k$-dim (shown here for a single direction, $k = 1$)

  • we find a direction (a vector $u^{(1)} \in \mathbb{R}^n$, say $n = 2$)
  • we project the data onto this direction
  • and we want the projection error to be as small as possible
  • it doesn't matter whether we pick $u^{(1)}$ or $-u^{(1)}$: both define the same line
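
The steps above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not a definitive implementation: the direction is obtained from the singular value decomposition of the mean-centered data (one standard way to compute it), and the helper names and the toy data are hypothetical.

```python
import numpy as np

def first_principal_direction(X):
    """Return the unit vector that minimizes the squared projection error.

    X is an (m, n) array whose columns are assumed to be mean-centered.
    """
    # Rows of Vt are the right singular vectors, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[0]

def projection_error(X, u):
    """Mean squared distance from each row of X to its projection onto the line along unit vector u."""
    z = X @ u                # 1-D coordinate of each point along u, shape (m,)
    X_proj = np.outer(z, u)  # projected points back in R^n, shape (m, n)
    return np.mean(np.sum((X - X_proj) ** 2, axis=1))

# Toy 2-D data (assumed for illustration) lying close to the line x2 = 0.5 * x1
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, 0.5 * x1 + 0.1 * rng.normal(size=200)])
X = X - X.mean(axis=0)  # mean normalization

u1 = first_principal_direction(X)
err = projection_error(X, u1)
err_flipped = projection_error(X, -u1)                # flipping the sign of u1
err_axis = projection_error(X, np.array([1.0, 0.0]))  # projecting onto the x1 axis instead
```

Flipping the sign of `u1` leaves the error unchanged, and projecting onto any other direction, such as the $x_1$ axis, cannot give a smaller error.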

See also