# ML Wiki

## Confidence Intervals

In Inferential Statistics we estimate a parameter of the population based on a sample

• Point Estimate is just one single plausible value
• it's a good idea to expand it a bit and build a confidence interval around the point estimate
• and use Standard Error as a measure of uncertainty in the Point Estimate to find this interval
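
As a minimal sketch of the idea above, the interval can be built as point estimate ± critical value × Standard Error. The sample here is simulated, and the seed, the sample size and the name `sample.values` are illustrative assumptions. The normal critical value is used for simplicity; for small samples a $t$ critical value is the usual choice.

```r
# Hypothetical sample: simulated data, not from any real study
set.seed(1)
sample.values = rnorm(50, mean=10, sd=3)

n = length(sample.values)
point.estimate = mean(sample.values)   # point estimate of the population mean
se = sd(sample.values) / sqrt(n)       # Standard Error of the sample mean

z = qnorm(0.975)                       # normal critical value for a 95% interval
ci = c(point.estimate - z * se, point.estimate + z * se)
ci
```

The interval is symmetric around the point estimate, and its width grows with the Standard Error.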

Main idea - the CI should include the real parameter

### Confidence Level

The Confidence Level is the degree of confidence with which we expect the interval to span the true parameter

• e.g. an interval built at the 95% level covers the true parameter with probability 0.95 over repeated sampling, i.e. in about 1 case out of 20 it will miss the real parameter
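
The confidence level determines the critical value used to build the interval. A short sketch, assuming a normal sampling distribution and a two-sided interval (the names `conf.level` and `alpha` are illustrative):

```r
conf.level = c(0.90, 0.95, 0.99)
alpha = 1 - conf.level          # probability of missing the true parameter
z = qnorm(1 - alpha / 2)        # two-sided normal critical values
round(z, 2)
# 1.64 1.96 2.58
```

A higher confidence level gives a larger critical value and therefore a wider interval.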

The idea of Sampling Distribution is important here

• we use it to calculate percentiles of the plausible values, as if the Sampling Distribution (SD) were centered at our point estimate
• so the CI should span the true value

Example

• we want to estimate the mean
• suppose we happen to know the sampling distribution: it's $N(\mu = 10, \sigma = 3.3)$
• it's centered around the population mean $\mu$
• and the Standard Error is 3.3
• we draw a Point Estimate from the sampling distribution
• we get $\bar{X} = 5.5$
• assuming that the SD is centered around 5.5, we compute the 95% CI
• $z$-value is 1.96, so the interval is $5.5 \pm 1.96 \cdot 3.3$, i.e. (-0.97, 11.97)
• it includes the true value $\mu=10$

R code
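```r
x = seq(-10, 25, 0.3)
m = 10      # true mean of the sampling distribution
se = 3.3    # Standard Error

# true sampling distribution (dashed)
plot(x, dnorm(x, mean=m, sd=se), type='l', bty='n', lty=2, ylab='')
abline(v=m, lty=2)

# sampling distribution re-centered at the observed point estimate (red)
m.observed = 5.5
abline(v=m.observed, col='red')
dy = dnorm(x, mean=m.observed, sd=se)
lines(x, y=dy, col='red')

# 95% confidence interval around the point estimate
lo = m.observed - 1.96 * se
hi = m.observed + 1.96 * se
c(lo, hi)

# indices of the grid points that fall inside the interval
x1 = min(which(x >= lo)); x2 = max(which(x <= hi))

# shade the area under the red curve over the interval
polygon(x[c(x1, x1:x2, x2)],
        c(0, dy[x1:x2], 0), col=adjustcolor('red', 0.4), border=NA)

par(xpd=NA)  # allow labels outside the plot region

text(m, 0.13, m)
text(m.observed, 0.13, m.observed)

arrows(x0=lo, y0=0.02, x1=hi, y1=0.02, code=3, length=0.15)
text(m.observed, 0.02-0.005, 'confidence interval', cex=0.7)

par(xpd=FALSE)
```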


A confidence interval consists of two parts

• left part - lower bound
• right part - upper bound

"95% confident" means that if we took many many samples from the SD and build a CI from each, then about 95% of these CIs should contain the actual parameter being estimated (e.g. $p$ for binom, $\mu$ for mean) So we see indeed that sometimes the CI doesn't include the true value but we're 95% confident that a CI calculated from one sample will include it

R code to produce the figure
```r
load(url('http://s3.amazonaws.com/assets.datacamp.com/course/dasi/ames.RData'))
```

### Relationship with Hypothesis Testing

Main Article: Confidence Intervals and Statistical Tests