ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Box Plot

General idea:

  • IQR = Q3 - Q1 - the length of the box
  • whiskers (fences) capture data outside of the box
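boxplot(..., range=0, ...)       # regular box plot
boxplot(..., horizontal=T, ...)  # horizontal box plot

range=0 extends the whiskers to the minimum and maximum of the data, so R draws the usual (regular) box plot and no observations are marked separately as outliers.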
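
A minimal sketch of the regular box plot, assuming a small made-up sample `x` (the values are hypothetical and chosen only for illustration). `plot = FALSE` returns the statistics that would be drawn without opening a plot.

```r
# Hypothetical sample, made up for illustration
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 30)

# range = 0: whiskers reach the minimum and maximum, nothing is drawn as a separate point
boxplot(x, range = 0, horizontal = TRUE, main = "Regular box plot")

# Same statistics without drawing anything
s <- boxplot(x, range = 0, plot = FALSE)
s$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out    # values flagged as outliers (none when range = 0)
```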

Modified Box Plot

A modified box plot can be used to show outliers.

  • IQR (interquartile range) - the difference between the 3rd and 1st quartiles
  • Inner fences - the values that lie 1.5 times the IQR beyond the 1st and 3rd quartiles
  • Lower inner fence = 1st quartile - (1.5 x IQR)
  • Upper inner fence = 3rd quartile + (1.5 x IQR)
  • observations beyond the whiskers (fences) are outliers and marked with dots
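
The fence formulas above can be sketched directly in R. The sample `x` and the variable names are hypothetical. Note that `quantile()` and the hinges used internally by `boxplot()` can differ slightly on small samples, so the fences computed here may not match the drawn whiskers exactly.

```r
# Hypothetical sample, made up for illustration
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 30)

# 1st and 3rd quartiles and the IQR
q   <- unname(quantile(x, c(0.25, 0.75)))
iqr <- IQR(x)

# Inner fences: 1.5 x IQR beyond the quartiles
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

# Observations beyond the fences are treated as outliers
outliers <- x[x < lower_fence | x > upper_fence]
outliers
```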

In R

  • by default, boxplot draws a modified box plot (range=1.5)
  • IQR(data) returns the IQR
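
A short sketch of the default behaviour, again on a hypothetical sample `x`. `boxplot.stats()` is the base R helper that computes the box plot statistics, and its `out` component holds the points that would be drawn as dots.

```r
# Hypothetical sample, made up for illustration
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 30)

# Default call: modified box plot, outliers drawn as separate points
boxplot(x, main = "Modified box plot (default)")

IQR(x)                        # interquartile range
out <- boxplot.stats(x)$out   # observations beyond the whiskers
out
```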

Bivariate Analysis

We can calculate the five-number summary (minimum, 1st quartile, median, 3rd quartile, maximum) of a quantitative variable separately for each category of a categorical variable.

  • and draw a box plot for each category
  • side-by-side box plots also show how the two variables interact
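
A minimal sketch of the per-category five-number summary, assuming a hypothetical data frame `d` with a quantitative column `a` and a categorical column `f` (the values are made up). `fivenum()` returns the minimum, lower hinge, median, upper hinge, and maximum.

```r
# Hypothetical data frame: `a` is quantitative, `f` is a two-level category
d <- data.frame(
  a = c(5, 7, 8, 9, 12, 3, 4, 4, 6, 20),
  f = c("yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no")
)

# Five-number summary of `a` within each category of `f`
summaries <- tapply(d$a, d$f, fivenum)
summaries[["yes"]]
summaries[["no"]]
```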

R

boxplot(d$a ~ as.factor(d$f))
  • this draws a separate box plot of the values in $a$ for each value of $f$
boxplot(d$a ~ as.factor(d$f), col=c("blue","orange"), names=c("yes","no"), varwidth=T)
  • to show how much data there is for each factor level,
  • we can make the width of the boxes reflect the group sizes (R draws widths proportional to the square roots of the numbers of observations)
  • using varwidth=T
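
A self-contained sketch of the grouped box plot, assuming a hypothetical data frame `d` with unequal group sizes so that the effect of `varwidth` is visible. The factor levels are set explicitly here because `col` (and `names`, if used) is applied in level order, and `as.factor()` on strings sorts levels alphabetically.

```r
# Hypothetical data: group "yes" has 30 observations, group "no" has 10
set.seed(1)
d <- data.frame(
  a = c(rnorm(30, mean = 10), rnorm(10, mean = 12)),
  f = rep(c("yes", "no"), times = c(30, 10))
)

# Fix the level order so the colours line up with the intended groups
d$f <- factor(d$f, levels = c("yes", "no"))

# One box per level of f; wider box for the larger group
b <- boxplot(a ~ f, data = d, col = c("blue", "orange"), varwidth = TRUE)
b$names  # group labels in plotting order
b$n      # number of observations per group
```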

Box Plot with Other Plots

Box plots combine well with other plots.
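
The text does not say which plots are meant. As one possible illustration (an assumption, not taken from the source), the sketch below stacks a histogram above a horizontal box plot of the same hypothetical sample and aligns their axes.

```r
# Hypothetical sample, made up for illustration
set.seed(1)
x <- rnorm(100)

# Two stacked panels: histogram on top, horizontal box plot below
op <- par(mfrow = c(2, 1))
h <- hist(x, main = "Histogram")
# Reuse the histogram's break range so both panels share the same value axis
boxplot(x, horizontal = TRUE, ylim = range(h$breaks), main = "Box plot")
par(op)  # restore the previous graphics settings
```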
