Learning Curves
This is a useful technique (part of machine learning diagnostics)
 to sanity-check a model,
 to improve its performance
A learning curve is a plot of two functions of $m$ ($m$ is the training set size):
 the training set error $J_{\text{train}}(\theta)$,
 the cross-validation error $J_{\text{cv}}(\theta)$
We can artificially reduce our training set size:
 we start from $m = 1$, then $m = 2$, and so on
So suppose we have the following model:
 $h_{\theta}(x) = \theta_0 + \theta_1 x + \theta_2 x^2$

 for each $m$ we fit the model on the first $m$ examples, then calculate $J_{\text{train}}(\theta)$ on those $m$ examples and $J_{\text{cv}}(\theta)$ on the entire cross-validation set, and plot both values

 This is the learning curve of the model
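The procedure above can be sketched in NumPy (a minimal sketch; the `learning_curve` helper, the data, and the train/CV split are illustrative assumptions, not from the notes):

```python
import numpy as np

def squared_error(theta, X, y):
    """J(theta) = (1 / 2m) * sum((h(x) - y)^2) for a linear hypothesis."""
    m = len(y)
    return np.sum((X @ theta - y) ** 2) / (2 * m)

def learning_curve(X_train, y_train, X_cv, y_cv):
    """For each m, fit theta on the first m training examples and record
    J_train (on those m examples) and J_cv (on the full CV set)."""
    j_train, j_cv = [], []
    for m in range(1, len(y_train) + 1):
        Xm, ym = X_train[:m], y_train[:m]
        # least-squares fit of theta on the reduced training set
        theta, *_ = np.linalg.lstsq(Xm, ym, rcond=None)
        j_train.append(squared_error(theta, Xm, ym))
        j_cv.append(squared_error(theta, X_cv, y_cv))
    return j_train, j_cv

# Quadratic hypothesis: design matrix with columns [1, x, x^2]
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 40)
y = 1 + 2 * x + 0.5 * x ** 2 + rng.normal(0, 0.1, 40)
X = np.column_stack([np.ones_like(x), x, x ** 2])
j_train, j_cv = learning_curve(X[:30], y[:30], X[30:], y[30:])
# plotting j_train and j_cv against m gives the learning curve
```

Note that $J_{\text{train}}$ is measured only on the $m$ examples the model was fit on, while $J_{\text{cv}}$ is always measured on the full cross-validation set.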
Diagnose High Bias (Underfitting)
Suppose we want to fit a straight line to our data:
 $h_{\theta}(x) = \theta_0 + \theta_1 x$
As $m$ increases, the fitted line stays pretty much the same:
If we draw the learning curves, we'll see the following:
So we see that
 as $m$ grows $J_{\text{cv}}(\theta) \to J_{\text{train}}(\theta)$
 and both errors are high
$\Rightarrow$ if a learning algorithm is suffering from high bias, getting more training examples will not (by itself) help
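A minimal sketch of this high-bias picture (the data and split sizes are assumptions for illustration): a straight line fit to data that is actually quadratic; both errors plateau at a similarly high value, so more examples stop helping.

```python
import numpy as np

# Assumed setup: quadratic ground truth, straight-line hypothesis
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 200)
y = 1 + x + x ** 2 + rng.normal(0, 0.1, 200)
x_cv, y_cv = x[150:], y[150:]  # hold out 50 examples for cross-validation

def line_errors(m):
    """Fit h(x) = theta0 + theta1 * x on the first m examples,
    return (J_train, J_cv)."""
    X = np.column_stack([np.ones(m), x[:m]])
    theta, *_ = np.linalg.lstsq(X, y[:m], rcond=None)
    X_cv = np.column_stack([np.ones(len(x_cv)), x_cv])
    j_train = np.sum((X @ theta - y[:m]) ** 2) / (2 * m)
    j_cv = np.sum((X_cv @ theta - y_cv) ** 2) / (2 * len(y_cv))
    return j_train, j_cv

errs = {m: line_errors(m) for m in (10, 50, 150)}
# both errors stay high and close together as m grows: high bias
```

The line cannot capture the $x^2$ term no matter how many examples it sees, so both curves flatten out at a high error.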
Diagnose High Variance (Overfitting)
Now suppose we have a model with polynomial of very high order:
 $h_{\theta}(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + ... + \theta_{100} x^{100}$
 at the beginning (small $m$) we overfit very much
 as we increase $m$, we are still able to fit the data quite well
So we can see that as $m$ increases,
 $J_{\text{train}}(\theta)$ increases (with more and more data, it gets harder and harder to fit every example with $h_{\theta}(x)$), but it increases very slowly
 on the other hand, $J_{\text{cv}}(\theta)$ decreases, but also very slowly
 and there's a huge gap between the two curves
 to close that gap we would need many more training examples
$\Rightarrow$ if a learning algorithm is suffering from high variance (i.e. it overfits), getting more data is likely to help
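A sketch of the high-variance case (assumed setup; degree 15 stands in for the degree-100 model to keep `np.polyfit` numerically manageable): the train/CV gap is large at small $m$ and shrinks as more data arrives.

```python
import numpy as np
import warnings

# Assumed data: smooth ground truth plus noise
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 300)
y = np.sin(3 * x) + rng.normal(0, 0.1, 300)
x_cv, y_cv = x[200:], y[200:]  # 100 held-out examples

def poly_errors(m, degree=15):
    """Fit a high-degree polynomial on the first m examples,
    return (J_train, J_cv)."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # polyfit may warn for small m
        coeffs = np.polyfit(x[:m], y[:m], degree)
    j_train = np.mean((np.polyval(coeffs, x[:m]) - y[:m]) ** 2) / 2
    j_cv = np.mean((np.polyval(coeffs, x_cv) - y_cv) ** 2) / 2
    return j_train, j_cv

jt_small, jc_small = poly_errors(20)   # few examples: tiny J_train, huge gap
jt_big, jc_big = poly_errors(200)      # more data: gap shrinks, J_cv drops
```

With $m = 20$ the polynomial nearly interpolates the training points, so $J_{\text{train}}$ is tiny while $J_{\text{cv}}$ is large; with $m = 200$ the extra data constrains the model and the two errors approach each other.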
See also
Sources