Latest revision as of 10:23, 18 November 2015

Simulation in R

Distributions

Name                            RNG       PDF       CDF
Beta Distribution               rbeta     dbeta     pbeta
Binomial Distribution           rbinom    dbinom    pbinom
Cauchy Distribution             rcauchy   dcauchy   pcauchy
$\chi^2$ Distribution           rchisq    dchisq    pchisq
Exponential Distribution        rexp      dexp      pexp
F Distribution                  rf        df        pf
Gamma Distribution              rgamma    dgamma    pgamma
Geometric Distribution          rgeom     dgeom     pgeom
Hypergeometric Distribution     rhyper    dhyper    phyper
Logistic Distribution           rlogis    dlogis    plogis
Log Normal Distribution         rlnorm    dlnorm    plnorm
Negative Binomial Distribution  rnbinom   dnbinom   pnbinom
Normal Distribution             rnorm     dnorm     pnorm
Poisson Distribution            rpois     dpois     ppois
$t$ Distribution                rt        dt        pt
Uniform Distribution            runif     dunif     punif
Weibull Distribution            rweibull  dweibull  pweibull
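
The naming pattern in the table can be tried out on a single family. The following is a minimal sketch for the normal distribution; the variable names and the seed are illustrative choices, not part of the original page.

```r
# The prefix selects the operation, the suffix ("norm") selects the distribution.
set.seed(1)                            # arbitrary seed, only to make the sketch repeatable
draws   = rnorm(5, mean=0, sd=1)       # r: five random values
density = dnorm(0, mean=0, sd=1)       # d: density at x = 0, i.e. 1/sqrt(2*pi)
cumprob = pnorm(0, mean=0, sd=1)       # p: P(X <= 0), which is 0.5 by symmetry
```

The same three prefixes apply to every row of the table, with the distribution-specific parameters replacing mean and sd.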


rname: Random Number Generator

Example 1

  • Generate 10 random values from Normal Distribution
  • with standard deviation 3 and mean 188
heights = rnorm(10, mean=188, sd=3)
> 186.0 191.2 187.6 187.9 186.6 187.2 187.2 189.5 190.8 186.4


Example 2

  • Generate 10 random values from Binomial Distribution
  • each value is the number of heads in 10 flips of a fair coin, i.e. 10 independent experiments with probability 0.5
coinFlips = rbinom(10, size=10, prob=0.5)
> 3 4 6 5 7 6 5 8 5 6
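
With many more draws, the simulated counts should settle around the theoretical mean size * prob and variance size * prob * (1 - prob). A small sketch of that check, where the number of draws (10000), the seed, and the name manyFlips are assumptions made for illustration:

```r
set.seed(42)                                   # arbitrary seed, only for repeatability
manyFlips = rbinom(10000, size=10, prob=0.5)   # heads counts from 10000 runs of 10 flips
mean(manyFlips)                                # should be close to size * prob = 5
var(manyFlips)                                 # should be close to size * prob * (1 - prob) = 2.5
range(manyFlips)                               # every count lies between 0 and 10
```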


dname: Probability Density Function

Evaluates the probability density function of a distribution at the given points (for a discrete distribution, the probability of each value)

x = seq(from=-5, to=5, length=10)
normalDensity = dnorm(x, mean=0, sd=1)
round(normalDensity, 2)
[1] 0.00 0.00 0.01 0.10 0.34 0.34 0.10 0.01 0.00 0.00

The same with 15 points, this time plotted:

x = seq(from=-3, to=3, length=15)
normalDensity = dnorm(x, mean=0, sd=1)
r = round(normalDensity, 2)
bp = barplot(r)
xspline(x=bp, y=r, lwd=2, shape=1, border="blue")
text(x=bp, y=r+0.03, labels=as.character(r), xpd=TRUE, cex=0.7)

Code [1] [2]

So we can see that dnorm returns the values of the density function at the given points
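
To make the link to the density formula explicit, the values returned by dnorm can be compared with the standard normal density $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ written out by hand. This is only a sketch; byHand, builtIn and area are names introduced here.

```r
x = seq(from=-3, to=3, length=15)
byHand  = exp(-x^2 / 2) / sqrt(2 * pi)   # standard normal density written out
builtIn = dnorm(x, mean=0, sd=1)         # the same points evaluated by dnorm
all.equal(byHand, builtIn)               # TRUE
# Numerical integration over the whole real line: the total area is approximately 1
area = integrate(dnorm, lower=-Inf, upper=Inf)$value
```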


Same for the Binomial distribution:

x = seq(0, 10, by=1)
binomialDensity = dbinom(x, size=10, prob=0.5)
round(binomialDensity,2)
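
For a discrete distribution the "density" is a probability, so the values can be checked against the binomial formula $\binom{10}{x} 0.5^x 0.5^{10-x}$ and should add up to 1. A short sketch, with byHand as an illustrative name:

```r
x = seq(0, 10, by=1)
binomialDensity = dbinom(x, size=10, prob=0.5)
byHand = choose(10, x) * 0.5^x * 0.5^(10 - x)   # binomial formula for P(X = x)
all.equal(binomialDensity, byHand)              # TRUE
sum(binomialDensity)                            # probabilities of all outcomes add up to 1
x[which.max(binomialDensity)]                   # most likely number of heads: 5
```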


pname: Cumulative Distribution Function

Use it when you need the probability $P(X \leqslant x)$ for some $x$; the upper tail $P(X > x)$ is one minus that value.

For example, you're doing an $F$-Test

  • you obtained $F = 3.446$
  • $F$ statistic follows the $F$ Distribution: $F \sim F(\text{df1}, \text{df2})$
  • so you can calculate the $p$-value:
1 - pf(3.446, df1=1, df2=85)
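
By default pf returns the lower tail $P(F \leqslant x)$, which is why the call above subtracts it from 1. The lower.tail argument gives the same upper tail directly; the sketch below compares the two forms (Fstat, pUpper1 and pUpper2 are illustrative names).

```r
Fstat = 3.446
pUpper1 = 1 - pf(Fstat, df1=1, df2=85)                 # complement of P(F <= 3.446)
pUpper2 = pf(Fstat, df1=1, df2=85, lower.tail=FALSE)   # the same tail, computed directly
all.equal(pUpper1, pUpper2)                            # TRUE
```

The lower.tail=FALSE form tends to be preferable for very small p-values, where subtracting from 1 loses precision.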


Sampling

Function sample draws a random sample

  • sample(x, size, replace=FALSE, prob=NULL)
  • replace=T for sampling with replacement
s = seq(0, 20)
> 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
sample(s, size=10)
> 8  4 11 12 20  7 19 18  1 14
sample(s, size=10, replace=T)
> 6 17 18  7  2  9 18  0  7  5

Note that 7 and 18 are each selected twice in the sample with replacement
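
The difference between the two modes can also be checked programmatically. In this sketch (seed and variable names are assumptions for illustration), a sample without replacement never contains duplicates, and asking for more values than the population holds only works with replacement:

```r
set.seed(7)                                   # arbitrary seed
s = seq(0, 20)
noRepl   = sample(s, size=10)                 # without replacement
withRepl = sample(s, size=10, replace=TRUE)   # with replacement, repeats are possible
anyDuplicated(noRepl)                         # 0: no value can repeat without replacement
# 30 values from a population of 21: an error without replacement, fine with it
tooMany = tryCatch(sample(s, size=30), error=function(e) "error")
big     = sample(s, size=30, replace=TRUE)
```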


The sample can also be drawn with specified probabilities

  • e.g. suppose we want the sampling weights to follow the normal density


n = dnorm(seq(-3, 3, length=length(s)))
sample(s, size=10, replace=T, prob=n)
> 9  7 11 11  1 13 11 14  5  6 
  • note that 11 gets selected 3 times,
  • because its weight is among the largest: about 0.38 (the largest weight, 0.3989, belongs to 10)
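
The values passed to prob are weights and need not sum to 1; sample rescales them. The sketch below computes the actual per-draw probabilities so they can be inspected; probs and draws are names introduced here, and the seed is arbitrary.

```r
s = seq(0, 20)
n = dnorm(seq(-3, 3, length=length(s)))   # one unnormalized weight per element of s
probs = n / sum(n)                        # selection probabilities for a single draw
s[which.max(probs)]                       # the value with the largest weight
round(probs[s == 11], 3)                  # chance of drawing 11 on a single draw
set.seed(3)                               # arbitrary seed
draws = sample(s, size=10, replace=TRUE, prob=n)
```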


Bootstrapping

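The sample function is very useful for Bootstrapping: resample the data with replacement many times and recompute the statistic each time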

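reps = 1000
n = length(data)   # data: a numeric vector of observations, assumed to be defined already
sampl = sample(data, size=n)   # without replacement this is just a shuffled copy of data
bs = replicate(reps, mean(sample(sampl, size=n, replace=T)))   # one resampled mean per repetition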
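
A self-contained version of the same idea, as a sketch: since the page does not define data, it is simulated here as a stand-in, and the standard error and percentile interval are one common way to summarize the bootstrap means rather than anything the page prescribes.

```r
set.seed(2015)                                   # arbitrary seed
data = rnorm(50, mean=188, sd=3)                 # stand-in data, simulated for this sketch
reps = 1000
n = length(data)
bs = replicate(reps, mean(sample(data, size=n, replace=TRUE)))   # bootstrap means
se = sd(bs)                                      # bootstrap estimate of the standard error
ci = quantile(bs, probs=c(0.025, 0.975))         # 95% percentile interval for the mean
```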


Reproducibility

When we run an experiment, we typically want to be able to reproduce it later

  • so it's important to generate the same "random" data
  • for that we can set the seed of the pseudo-random number generator (PRNG)
  • set.seed(12345)
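
A short sketch of the effect, using the same seed as above (first, second and third are illustrative names): resetting the seed replays the same sequence, while continuing without a reset produces new values.

```r
set.seed(12345)
first = rnorm(5)
set.seed(12345)            # resetting the seed restarts the same sequence
second = rnorm(5)
identical(first, second)   # TRUE
third = rnorm(5)           # no reset: the sequence continues, so the values differ
identical(first, third)    # FALSE
```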


Source