Statistical Experiment


Statistical Experiments

There are two types of data collection: observational studies and experiments

Response variable (or dependent variable) - the outcome of interest, measured on each subject or entity participating in the study
Explanatory variable (or predictor, or independent variable) - a variable that we think may help to explain the value of the response variable

Experiment - a study in which a researcher manipulates the explanatory variable to see the effect on the response
So the researcher "creates" the data

With this type of study it is possible to show a causal relationship between the variables

Example

Suppose we run a sunscreen study and collect some data
We see that the more sunscreen people use, the more likely they are to have skin cancer
Does sunscreen cause cancer?
We cannot say so here because the study is observational
e.g. in this case we don’t observe the exposure to sun, which is correlated with both the sunscreen and cancer variables
- this is a Confounding Variable that is likely to have caused the effect
But if we do a randomized experiment, we can see if there’s any causal relationship
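
The sunscreen example can be illustrated with a small simulation. This is only a sketch under made-up assumptions: sun exposure drives the cancer risk, sunscreen has no effect at all, and the coefficients, the risk score and the function names (pearson, simulate_correlation) are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_correlation(n, randomized, seed=0):
    """Correlation between sunscreen use and a cancer risk score.

    Assumption of this sketch: the risk depends only on sun exposure
    (the confounder), never on sunscreen itself.
    """
    rng = random.Random(seed)
    sunscreen, risk = [], []
    for _ in range(n):
        sun = rng.random()  # confounder: sun exposure in [0, 1]
        if randomized:
            # experiment: the researcher sets the dose, independent of sun
            use = rng.random()
        else:
            # observational: people who spend more time in the sun use more
            use = 0.8 * sun + 0.2 * rng.random()
        sunscreen.append(use)
        risk.append(sun + 0.1 * rng.gauss(0, 1))
    return pearson(sunscreen, risk)


observational = simulate_correlation(5000, randomized=False)
experimental = simulate_correlation(5000, randomized=True)
print(f"observational: {observational:.2f}, randomized: {experimental:.2f}")
```

Under these assumptions the observational data shows a strong positive correlation even though sunscreen does nothing, while the randomized version shows a correlation close to zero.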

Individuals are assigned to groups
Researchers assign treatments to the groups
Typically the assignment is done at random, which is why these are called “Randomized Experiments”
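
A minimal sketch of random assignment, assuming we simply shuffle the subjects and deal them out to the treatments in turn. The function name randomize and the treatment labels are hypothetical.

```python
import random


def randomize(subjects, treatments, seed=None):
    """Randomly split subjects into one group per treatment."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # random order removes any systematic pattern
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # deal subjects out in turn so group sizes differ by at most one
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


groups = randomize(range(10), ["treatment", "control"], seed=1)
print(groups)
```

Because the assignment depends only on the shuffle, no property of the subjects (such as their exposure to sun) can systematically end up in one group.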

We want to see if there’s any causal relationship between the variables
so we do our best to control any other differences between the groups
- to make sure there’s nothing else that might interfere with the experiment (no Confounding Variables)
- e.g. the exposure to sun in the previous example

Example

Researchers may sometimes suspect that variables other than the treatment influence the response
In such a case, group individuals into blocks by that variable and then randomize within each block (this is called blocking)
This ensures that each treatment group gets an equal share of the patients from every block
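
Randomizing within blocks could look roughly like this. It is a sketch only: block_randomize, the block_of callback and the "low"/"high" risk blocks are hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict


def block_randomize(subjects, block_of, treatments, seed=None):
    """Assign treatments at random separately inside each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize only within this block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


# hypothetical patients: (id, risk level), blocked by risk level
patients = [(f"p{i}", "high" if i % 2 else "low") for i in range(12)]
assignment = block_randomize(
    patients, block_of=lambda p: p[1], treatments=["treatment", "control"], seed=7
)
for patient, arm in sorted(assignment.items()):
    print(patient, arm)
```

Each block here has six patients, so both treatments receive three low-risk and three high-risk patients, whatever the shuffle produces.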

Example

To reduce bias in experiments on humans, split the patients into two groups: a treatment group that receives the real medicine and a control group that receives a placebo

But if a doctor knows that a patient is going to receive a placebo, this knowledge may affect the doctor emotionally, and the effect is difficult to quantify
which is why both the patients and the doctors are kept uninformed of what type of medicine is given
This is called a double-blind setup
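
One way to picture the double-blind setup is an allocation with coded labels. This is a sketch under assumptions, not a description of any real trial procedure: blind_allocation, the KIT-… label format and the idea of a third party holding the key are all hypothetical.

```python
import random


def blind_allocation(patient_ids, seed=None):
    """Return (kit_of, key).

    kit_of: patient -> opaque kit label, visible to patients and doctors.
    key:    patient -> actual arm, assumed to be held by a third party
            and revealed only after the study ends.
    """
    rng = random.Random(seed)
    ids = list(patient_ids)

    shuffled = ids[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    key = {
        pid: ("treatment" if i < half else "placebo")
        for i, pid in enumerate(shuffled)
    }

    # kit codes are drawn independently of the arm, so a label reveals nothing
    codes = rng.sample(range(1000, 10000), len(ids))
    kit_of = {pid: f"KIT-{code}" for pid, code in zip(ids, codes)}
    return kit_of, key


kit_of, key = blind_allocation(range(8), seed=3)
print(kit_of)  # what doctors and patients see
```

Doctors and patients only ever work with kit_of, so neither side can tell who is on the placebo.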
