Line 3: | Line 3: | ||
=== [[Distributions]] === | === [[Distributions]] === | ||
{| class="wikitable" | {| class="wikitable" | ||
− | ! Name || [[ | + | ! Name || [[Random Number Generator|RNG]] || [[Probability Density Function|PDF]] || [[Cumulative Distribution Function|CDF]] |
|- | |- | ||
− | | || <code>rbeta</code> || <code>dbeta</code> | + | | [[Beta Distribution]] || <code>rbeta</code> || <code>dbeta</code> || <code>pbeta</code> |
|- | |- | ||
− | | [[Binomial Distribution]] || <code>rbinom</code> || <code>dbinom</code> | + | | [[Binomial Distribution]] || <code>rbinom</code> || <code>dbinom</code> || <code>pbinom</code> |
|- | |- | ||
− | | || <code>rcauchy</code> || <code>dcauchy</code> | + | | [[Cauchy Distribution]] || <code>rcauchy</code> || <code>dcauchy</code> || <code>pcauchy</code> |
|- | |- | ||
− | | || <code>rchisq</code> || <code>dchisq</code> | + | | [[Chi-Squared Distribution|$\chi^2$ Distribution]] || <code>rchisq</code> || <code>dchisq</code> || <code>pchisq</code> |
|- | |- | ||
− | | || <code>rexp</code> || <code>dexp</code> | + | | [[Exponential Distribution]] || <code>rexp</code> || <code>dexp</code> || <code>pexp</code> |
|- | |- | ||
− | | || <code>rf</code> || <code>df</code> | + | | [[F Distribution]] || <code>rf</code> || <code>df</code> || <code>pf</code> |
|- | |- | ||
− | | || <code>rgamma</code> || <code>dgamma</code> | + | | [[Gamma Distribution]] || <code>rgamma</code> || <code>dgamma</code> || <code>pgamma</code> |
|- | |- | ||
− | | || <code>rgeom</code> || <code>dgeom</code> | + | | [[Geometric Distribution]] || <code>rgeom</code> || <code>dgeom</code> || <code>pgeom</code> |
|- | |- | ||
− | | || <code>rhyper</code> || <code>dhyper</code> | + | | [[Hypergeometric Distribution]] || <code>rhyper</code> || <code>dhyper</code> || <code>phyper</code> |
|- | |- | ||
− | | || <code>rlogis</code> || <code>dlogis</code> | + | | [[Logistic Distribution]] || <code>rlogis</code> || <code>dlogis</code> || <code>plogis</code> |
|- | |- | ||
− | | || <code>rlnorm</code> || <code>dlnorm</code> | + | | [[Log Normal Distribution]] || <code>rlnorm</code> || <code>dlnorm</code> || <code>plnorm</code> |
|- | |- | ||
− | | || <code>rnbinom</code> || <code>dnbinom</code> | + | | [[Negative Binomial Distribution]] || <code>rnbinom</code> || <code>dnbinom</code> || <code>pnbinom</code> |
|- | |- | ||
− | | [[Normal Distribution]] || <code>rnorm</code> || <code>dnorm</code> | + | | [[Normal Distribution]] || <code>rnorm</code> || <code>dnorm</code> || <code>pnorm</code> |
|- | |- | ||
− | | || <code>rpois</code> || <code>dpois</code> | + | | [[Poisson Distribution]] || <code>rpois</code> || <code>dpois</code> || <code>ppois</code> |
|- | |- | ||
− | | || <code>rt</code> || <code>dt</code> | + | | [[t Distribution|$t$ Distribution]] || <code>rt</code> || <code>dt</code> || <code>pt</code> |
|- | |- | ||
− | | [[Uniform Distribution]] || <code>runif</code> || <code>dunif</code> | + | | [[Uniform Distribution]] || <code>runif</code> || <code>dunif</code> || <code>punif</code> |
|- | |- | ||
− | | || <code>rweibull</code> || <code>dweibull</code> | + | | [[Weibull Distribution]] || <code>rweibull</code> || <code>dweibull</code> || <code>pweibull</code> |
|} | |} | ||
− | === r<code>name</code>: [[ | + | === r<code>name</code>: [[Random Number Generator]] === |
− | + | Example 1 | |
+ | * Generate 10 random values from [[Normal Distribution]] | ||
* with standard deviation 3 and mean 188 | * with standard deviation 3 and mean 188 | ||
− | |||
<pre> | <pre> | ||
Line 52: | Line 52: | ||
− | Generates 10 random values from [[Binomial Distribution]] | + | Example 2 |
− | * flipping a coin 10 times | + | * Generates 10 random values from [[Binomial Distribution]] |
− | + | * flipping a coin 10 times = 10 independent experiments with probability 0.5 | |
<pre> | <pre> | ||
− | coinFlips = rbinom(10,size=10,prob=0.5) | + | coinFlips = rbinom(10, size=10, prob=0.5) |
> 3 4 6 5 7 6 5 8 5 6 | > 3 4 6 5 7 6 5 8 5 6 | ||
</pre> | </pre> | ||
Line 71: | Line 71: | ||
</pre> | </pre> | ||
− | + | Same with 15 : | |
<pre> | <pre> | ||
x = seq(from=-3, to=3, length=15) | x = seq(from=-3, to=3, length=15) | ||
Line 90: | Line 90: | ||
<pre> | <pre> | ||
− | x = seq(0,10,by=1) | + | x = seq(0, 10, by=1) |
− | binomialDensity = dbinom(x,size=10,prob=0.5) | + | binomialDensity = dbinom(x, size=10, prob=0.5) |
round(binomialDensity,2) | round(binomialDensity,2) | ||
</pre> | </pre> | ||
+ | |||
+ | === p<code>name</code>: [[Cumulative Distribution Function]] === | ||
+ | When you need to know the probability of $X \leqslant x$ for some $x$. | ||
+ | |||
+ | For example, you're doing an [[F-Test|$F$-Test]] | ||
+ | * you obtained $F = 3.446$ | ||
+ | * $F$ statistic follows the [[F Distribution|$F$ Distribution]]: $F \sim F(\text{df1}, \text{df2})$ | ||
+ | * so you can calculate the $p$-value: | ||
+ | |||
+ | 1 - pf(3.446, df1=1, df2=85) | ||
== [[Sampling]] == | == [[Sampling]] == | ||
Function <code>sample</code> draws a random sample | Function <code>sample</code> draws a random sample | ||
− | * <code>function(x, size, replace= FALSE, prob = NULL) </code> | + | * <code>sample(x, size, replace=FALSE, prob=NULL) </code> |
− | * <code>replace = T</code> for sampling with replacement | + | * <code>replace=T</code> for sampling with replacement |
<pre> | <pre> | ||
Line 125: | Line 135: | ||
* note that 11 gets selected 3 times, | * note that 11 gets selected 3 times, | ||
* because its sampling weight is high: about 0.3814 (the maximum weight, 0.3989, belongs to 10) | * because its sampling weight is high: about 0.3814 (the maximum weight, 0.3989, belongs to 10)
+ | |||
+ | |||
+ | === [[Bootstrapping]] === | ||
+ | It is very useful for [[Bootstrapping]] | ||
+ | |||
+ | reps = 1000 | ||
+ | n = length(data) | ||
+ | sampl = sample(data, size=n) | ||
+ | bs = replicate(reps, mean(sample(sampl, size=n, replace=T))) | ||
+ | |||
Name | RNG | PDF | CDF
---|---|---|---
Beta Distribution | rbeta | dbeta | pbeta
Binomial Distribution | rbinom | dbinom | pbinom
Cauchy Distribution | rcauchy | dcauchy | pcauchy
$\chi^2$ Distribution | rchisq | dchisq | pchisq
Exponential Distribution | rexp | dexp | pexp
F Distribution | rf | df | pf
Gamma Distribution | rgamma | dgamma | pgamma
Geometric Distribution | rgeom | dgeom | pgeom
Hypergeometric Distribution | rhyper | dhyper | phyper
Logistic Distribution | rlogis | dlogis | plogis
Log Normal Distribution | rlnorm | dlnorm | plnorm
Negative Binomial Distribution | rnbinom | dnbinom | pnbinom
Normal Distribution | rnorm | dnorm | pnorm
Poisson Distribution | rpois | dpois | ppois
$t$ Distribution | rt | dt | pt
Uniform Distribution | runif | dunif | punif
Weibull Distribution | rweibull | dweibull | pweibull
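=== r<code>name</code>: Random Number Generator ===
Example 1
* Generate 10 random values from the Normal Distribution
* with standard deviation 3 and mean 188

<pre>
heights = rnorm(10, mean=188, sd=3)
> 186.0 191.2 187.6 187.9 186.6 187.2 187.2 189.5 190.8 186.4
</pre>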
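Example 2
* Generate 10 random values from the Binomial Distribution
* each value is the number of heads in 10 independent coin flips with probability 0.5

<pre>
coinFlips = rbinom(10, size=10, prob=0.5)
> 3 4 6 5 7 6 5 8 5 6
</pre>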
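To make the meaning of <code>size</code> concrete, here is a minimal sketch that builds one binomial value from individual coin flips and then checks the mean of many draws. The seed value and the variable names <code>flips</code>, <code>heads</code> and <code>manyDraws</code> are assumptions made for illustration only.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is repeatable

# 10 individual fair coin flips: 0 = tails, 1 = heads
flips = rbinom(10, size=1, prob=0.5)

# Counting the heads gives one value distributed as Binomial(size=10, prob=0.5)
heads = sum(flips)

# Many binomial draws: their mean should be close to size * prob = 5
manyDraws = rbinom(10000, size=10, prob=0.5)
mean(manyDraws)
```

So the first argument is the number of values to generate, while <code>size</code> is the number of trials behind each value.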
=== d<code>name</code>: Probability Density Function ===
Calculates the density of a probability distribution at the given points

<pre>
x = seq(from=-5, to=5, length=10)
normalDensity = dnorm(x, mean=0, sd=1)
round(normalDensity, 2)
[1] 0.00 0.00 0.01 0.10 0.34 0.34 0.10 0.01 0.00 0.00
</pre>
The same with 15 points, plotted:

<pre>
x = seq(from=-3, to=3, length=15)
normalDensity = dnorm(x, mean=0, sd=1)
r = round(normalDensity, 2)
bp = barplot(r)
xspline(x=bp, y=r, lwd=2, shape=1, border="blue")
text(x=bp, y=r+0.03, labels=as.character(r), xpd=TRUE, cex=0.7)
</pre>
So we can see that it returns the values of the density function at the given points

The same for the Binomial distribution (for a discrete distribution these values are probabilities):

<pre>
x = seq(0, 10, by=1)
binomialDensity = dbinom(x, size=10, prob=0.5)
round(binomialDensity, 2)
</pre>
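As a sanity check, the binomial values can be compared with the textbook formula $\binom{n}{x} p^x (1-p)^{n-x}$. This is only a sketch; the name <code>manual</code> is introduced here for illustration.

```r
x = seq(0, 10, by=1)
binomialDensity = dbinom(x, size=10, prob=0.5)

# The same probabilities computed directly from the binomial formula
manual = choose(10, x) * 0.5^x * (1 - 0.5)^(10 - x)

# Both vectors agree up to floating point tolerance
all.equal(binomialDensity, manual)

# The probabilities over all possible outcomes 0..10 add up to 1
sum(binomialDensity)
```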
=== p<code>name</code>: Cumulative Distribution Function ===
When you need to know the probability of $X \leqslant x$ for some $x$. The upper tail $P(X > x)$ is one minus this value.
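For example, you're doing an $F$-Test
* you obtained $F = 3.446$
* the $F$ statistic follows the $F$ Distribution: $F \sim F(\text{df1}, \text{df2})$
* so you can calculate the $p$-value:

<pre>
1 - pf(3.446, df1=1, df2=85)
</pre>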
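The same upper-tail probability can also be requested directly with the <code>lower.tail</code> argument that the <code>p</code> functions accept. The sketch below also shows, for a discrete distribution, that the <code>p</code> function is the running sum of the <code>d</code> function. The names <code>fStat</code>, <code>pUpper1</code>, <code>pUpper2</code> and <code>cdfFromPmf</code> are illustrative.

```r
fStat = 3.446

# Upper tail as one minus the CDF, as in the example
pUpper1 = 1 - pf(fStat, df1=1, df2=85)

# Upper tail requested directly
pUpper2 = pf(fStat, df1=1, df2=85, lower.tail=FALSE)

all.equal(pUpper1, pUpper2)

# For the Binomial distribution the CDF is the cumulative sum of dbinom
cdfFromPmf = cumsum(dbinom(0:10, size=10, prob=0.5))
all.equal(pbinom(0:10, size=10, prob=0.5), cdfFromPmf)
```

For very small tail probabilities the <code>lower.tail=FALSE</code> form is usually preferable, because it avoids subtracting a number close to 1 from 1.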
== Sampling ==
Function <code>sample</code> draws a random sample
* <code>sample(x, size, replace=FALSE, prob=NULL)</code>
* <code>replace=T</code> for sampling with replacement

<pre>
s = seq(0, 20)
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

sample(s, size=10)
> 8 4 11 12 20 7 19 18 1 14

sample(s, size=10, replace=T)
> 6 17 18 7 2 9 18 0 7 5
</pre>
Note that 7 and 18 are selected twice for the sample with replacement
The sample can be drawn with specified probability weights

<pre>
n = dnorm(seq(-3, 3, length=length(s)))
sample(s, size=10, replace=T, prob=n)
> 9 7 11 11 1 13 11 14 5 6
</pre>
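The values passed as <code>prob</code> are weights and need not sum to one. A small sketch of what the weights from the normal density look like once normalized; the names <code>w</code> and <code>p</code> are introduced here for illustration.

```r
s = seq(0, 20)

# Weights taken from the standard normal density on a grid from -3 to 3
w = dnorm(seq(-3, 3, length=length(s)))

# The density values do not sum to one
sum(w)

# Dividing by the total gives the actual selection probabilities
# for a single draw
p = w / sum(w)
sum(p)

# The element with the largest weight is the one in the middle of the grid
s[which.max(w)]
```

Values near the middle of <code>s</code> are therefore selected much more often than values near 0 or 20.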
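=== Bootstrapping ===
<code>sample</code> is very useful for Bootstrapping

<pre>
reps = 1000
n = length(data)                 # data: numeric vector of observations
sampl = sample(data, size=n)     # shuffles data (no replacement)
# means of reps resamples drawn with replacement
bs = replicate(reps, mean(sample(sampl, size=n, replace=T)))
</pre>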
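A self-contained version of the same idea, as a sketch. The vector <code>heights</code> is hypothetical data generated only for this illustration, and summarizing the bootstrap means with a standard error and a percentile interval is one common choice, not the only one.

```r
set.seed(12345)  # fixed seed so the sketch is repeatable

# Hypothetical observations, generated only for this illustration
heights = rnorm(50, mean=188, sd=3)

reps = 1000
n = length(heights)

# Each replicate resamples the data with replacement and records the mean
bs = replicate(reps, mean(sample(heights, size=n, replace=TRUE)))

# Bootstrap standard error of the mean
se = sd(bs)

# 95% percentile interval for the mean
ci = quantile(bs, c(0.025, 0.975))
```

Here <code>bs</code> holds <code>reps</code> bootstrap means, and their spread approximates the sampling variability of the mean.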
When we run an experiment with random numbers, we typically want to be able to reproduce it later, so we fix the seed first

<pre>
set.seed(12345)
</pre>
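A short sketch of the effect: resetting the same seed before each call makes the generator return exactly the same values. The names <code>first</code> and <code>second</code> are illustrative.

```r
set.seed(12345)
first = rnorm(5)

# Resetting the seed restarts the generator from the same state
set.seed(12345)
second = rnorm(5)

identical(first, second)  # TRUE
```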