Our goal is to be able to estimate theoretical parameters with a data sample.

Example:

- we want to estimate the probability of getting heads in coin flipping experiment
- flip a coin 10 times,
- count number of heads

Experiment:

- Our parameter of interest is $p = p(\text{heads})$
- Data: result of 10 coin flips
- $\hat{p}$ - estimate of $p$

- $\hat{p} = \cfrac{\text{# of heads}}{\text{total # of flips}}$
- i.e. $\hat{p}$ is calculated from data

- if we repeat over and over again, each time we will probably have different estimates of $\hat{p}$
- so there is a
*variability*in the estimate - this is called
*sampling variability*, and it occurs because of the randomness in our data

The probability distribution of all the possible values of an estimator is it's *sampling distribution*.

In our coin flipping example

- a flip follows the Bernoulli Distribution with $p = 1/2$

- $X \sim \text{Bernoulli}(0.5)$

- and $E(X) = 0.5$

For the entire experiment:

- 10 coin flips = 10 Bernoulli experiments with outcomes $X_1, ..., X_{10}$
- so, $\hat{p} = \cfrac{X_1 + ... + X_{10}}{10} = \bar{X}$
- thus, $E(\hat{p}) = p$ since $E(X_i) = p$ and $E(\bar{X}) = \cfrac{10 p}{10} = p$
- and $\hat{p}$ is called
*unbiased estimator*

A statistic used to estimate a parameter is *unbiased* if the expected value of its sampling distribution is equal to the value of the parameter being estimated

- For one observation $X \sim \text{Bernoulli}(p)$, variance $\text{Var}(X)$ is:

- $\text{Var}(X) = \sum_{x} (x - E(X))^2 p(X) = (1 - p)^2 p + (0 - p)^2 (1 - p) = p - p^2 = p(1 - p)$

- For $n$ observations $X_1, ..., X_{n}$ with $\hat{p} = E(X)$

- since $\text{Var}(\bar{X}) = \cfrac{\sum X_i}{n}$,
- $\text{Var}(\hat{p}) = \cfrac{p(1 - p)}{n}$ and $\text{sd}(\hat{p}) = \sqrt{\cfrac{p(1-p)}{n}}$,

So we get more and more precise answers over time

And by the Central Limit Theorem, for large $n$ the sampling distribution is approximately

$N\left(p, \cfrac{p(1-p)}{n}\right)$

In the Normal Distribution we have $N(\mu, \sigma^2)$, and we're interested in $\mu$

- Say we have $n$ data values $X_1, ..., X_n$ from independent observations
- Estimator of $\mu$ is $\bar{X} = \cfrac{X_1 + ... + X_n}{n}$
- So $E(\bar{X}) = \mu$, and $\bar{X}$ - unbiased estimator of $\mu$
- Variance of $\bar{X}$ is $\text{Var}(\bar{X}) = \cfrac{\sigma^2}{n}$ and $\text{sd}(\bar{X}) = \cfrac{\sigma}{\sqrt{n}}$
- And by the Central Limit Theorem we have $\bar{X} \sim N(\mu, \cfrac{\sigma^2}{n})$

So,

- distribution of $\hat{p} \sim N\left(p, \cfrac{p(1-p)}{n}\right)$
- distribution of $\bar{X} \sim N\left(\mu, \cfrac{\sigma^2}{n}\right)$

For data, unbiased variance is

- $\text{Var}(X) = \cfrac{1}{n-1} \sum (X_i - \bar{X})^2$ (unbiased)
- not $\text{Var}(X) = \cfrac{1}{n} \sum (X_i - \bar{X})^2$ (biased)