A probability plot is a graphical technique for comparing two data sets, typically an empirical sample against a theoretical distribution.
Commonly used:
It's a special case of Q-Q plots:
The normal probability plot is formed by:
Constructing
If the data are normally distributed, the theoretical $z$-scores on the horizontal axis should correspond approximately to the observed percentiles of the data, so the points fall close to a straight line.
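The construction can be sketched by hand in R. The helper `normal_qq_points` and the simulated `heights` vector below are hypothetical names used only for illustration. The plotting positions come from base R's `ppoints`, which `qqnorm` also uses.

```r
# Hypothetical helper: coordinates of a normal probability plot, computed by hand.
normal_qq_points <- function(x) {
  x <- sort(x[!is.na(x)])   # order statistics of the sample
  p <- ppoints(length(x))   # plotting positions (approximate percentiles)
  data.frame(theoretical = qnorm(p),  # z-scores for those percentiles
             sample = x)
}

# Simulated stand-in for a real sample (assumed values, not the source data).
set.seed(1)
heights <- rnorm(200, mean = 165, sd = 6.5)

pts <- normal_qq_points(heights)
plot(pts$theoretical, pts$sample,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
     main = "Normal probability plot (by hand)")
```

The resulting points should coincide with those drawn by `qqnorm(heights)`.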
Evaluating the Normal Distribution (see [2])
load(url("http://www.openintro.org/stat/data/bdims.RData"))
fdims = subset(bdims, bdims$sex == 0)
qqnorm(fdims$hgt, col="orange", pch=19)
qqline(fdims$hgt, lwd=2)
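Does it look similar to a plot of data that really come from a normal distribution? To check, simulate a normal sample with the same size, mean, and standard deviation: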
set.seed(123)
sim.norm = rnorm(n=length(fdims$hgt), mean=mean(fdims$hgt), sd=sd(fdims$hgt))
qqnorm(sim.norm, col="orange", pch=19, main="Normal Q-Q Plot of simulated data")
qqline(sim.norm, lwd=2)
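When reading these plots it helps to know what `qqline` draws. By default it is the line through the first and third quartiles of the sample, plotted against the same quartiles of the standard normal distribution. A minimal sketch of that calculation follows; `qq_reference_line` is a hypothetical helper name, and the sample is simulated as a stand-in.

```r
# Hypothetical helper: intercept and slope of the quartile-based reference line.
qq_reference_line <- function(x, probs = c(0.25, 0.75)) {
  y <- quantile(x, probs, names = FALSE, na.rm = TRUE)  # sample quartiles
  z <- qnorm(probs)                                      # theoretical quartiles
  slope <- diff(y) / diff(z)
  c(intercept = y[1] - slope * z[1], slope = slope)
}

# Simulated sample (assumed values, for illustration only).
set.seed(123)
x <- rnorm(100, mean = 10, sd = 2)

ref_line <- qq_reference_line(x)
qqnorm(x, col = "orange", pch = 19)
abline(a = ref_line[["intercept"]], b = ref_line[["slope"]], lwd = 2)
```

For a roughly normal sample, the slope is a rough estimate of the standard deviation and the intercept a rough estimate of the mean.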
We can also plot several simulations next to the data:
qqnormsim = function(dat, dim=c(2,2)) {
  par(mfrow=dim)
  # first panel: the actual data
  qqnorm(dat, main="Normal QQ Plot (Data)")
  qqline(dat)
  # remaining panels: normal samples with the same size, mean and sd
  for (i in 1:(prod(dim) - 1)) {
    simnorm <- rnorm(n=length(dat), mean=mean(dat), sd=sd(dat))
    qqnorm(simnorm, main = "Normal QQ Plot (Sim)")
    qqline(simnorm)
  }
  # reset the plotting layout
  par(mfrow=c(1, 1))
}
qqnormsim(fdims$hgt)
The plot of the data looks much like the simulated plots, so the heights appear to be approximately normally distributed.
(Same data set as in example 1)
Let's take a look at another variable from the same data set, weight:
hist(fdims$wgt)
The distribution looks a bit skewed.
qqnorm(fdims$wgt, col="orange", pch=19)
qqline(fdims$wgt, lwd=2)
qqnormsim(fdims$wgt)
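Compared with the simulated plots, the data plot deviates more from the line, so the weights are most likely not normally distributed.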
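To see what this kind of departure looks like in a controlled setting, the sketch below compares a simulated normal sample with a simulated right-skewed (log-normal) one. Both samples are made up for illustration and are not the `bdims` data. `top_gap` is a hypothetical helper that measures how far the largest observation sits above the quartile-based reference line.

```r
# Simulated samples (assumed, for illustration only).
set.seed(42)
n <- 500
normal_sample <- rnorm(n)
skewed_sample <- rlnorm(n)   # log-normal: long right tail

par(mfrow = c(1, 2))
qqnorm(normal_sample, main = "Normal sample")
qqline(normal_sample)
qqnorm(skewed_sample, main = "Right-skewed sample")
qqline(skewed_sample)
par(mfrow = c(1, 1))

# Hypothetical helper: vertical gap between the largest observation and the
# reference line through the quartiles, at the largest theoretical quantile.
top_gap <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  slope <- diff(q) / diff(qnorm(c(0.25, 0.75)))
  intercept <- q[1] - slope * qnorm(0.25)
  z_max <- max(qnorm(ppoints(length(x))))
  max(x) - (intercept + slope * z_max)
}

gap_normal <- top_gap(normal_sample)
gap_skewed <- top_gap(skewed_sample)
```

In the skewed sample the upper points should rise well above the line, so `gap_skewed` is expected to be clearly positive and much larger in magnitude than `gap_normal`.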