Before doing any data analysis, we need to check whether the data is good
Why?
Problems:
summary(x)
- summarizes all quantitative and qualitative variables
quantile(x)
- range of a numeric variable (minimum, quartiles, maximum)
sapply(x[1, ], class)
- class for every element of the 1st row
names(x)
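A minimal sketch of these checks on a small data frame. The data frame and its column names (age, group) are made up for illustration. Note that quantile() expects a numeric vector, so it is applied to a single column here, and na.rm = TRUE is needed when that column contains NAs.

```r
# Hypothetical toy data, used only for illustration
x <- data.frame(
  age   = c(23, 35, 41, NA, 52),
  group = factor(c("a", "b", "a", "b", "a"))
)

summary(x)                      # quartiles for numeric columns, level counts for factors
quantile(x$age, na.rm = TRUE)   # min, quartiles, max of one numeric column
col_classes <- sapply(x[1, ], class)  # class of each column, read off the first row
col_classes
names(x)                        # column names
```

summary() also reports the number of NAs per numeric column, which makes it a quick first look at missing values.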
Sizes:
dim(x)
- size of the dataset
nrow(x) and ncol(x)
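As a small illustration of the size functions, assuming a made-up data frame with 4 rows and 3 columns:

```r
# Hypothetical toy data, used only for illustration
x <- data.frame(
  id     = 1:4,
  score  = c(90, 85, 70, 60),
  passed = c(TRUE, TRUE, TRUE, FALSE)
)

dim(x)    # rows, then columns: 4 3
nrow(x)   # 4
ncol(x)   # 3
```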
tables
table(x)
- unique values and their counts
table(x, y)
- two-dimensional table
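A short sketch of both forms of table(), using two made-up character vectors of equal length:

```r
# Hypothetical toy vectors, used only for illustration
x <- c("a", "b", "a", "c", "a")
y <- c("u", "v", "u", "u", "v")

counts <- table(x)      # one-dimensional: a = 3, b = 1, c = 1
counts

cross <- table(x, y)    # two-dimensional: rows are values of x, columns are values of y
cross
cross["a", "u"]         # 2, the number of positions where x is "a" and y is "u"
```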
logical tests
! (not), & (and), | (or):
which(!is.na(x) & x > 10)
sum(is.na(x))
- how many NAs
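The two expressions above can be tried on a made-up numeric vector with one missing value:

```r
# Hypothetical toy vector, used only for illustration
x <- c(5, 12, NA, 20, 8)

keep <- !is.na(x) & x > 10   # FALSE TRUE FALSE TRUE FALSE
idx  <- which(keep)          # positions of the TRUE entries: 2 4
idx
n_missing <- sum(is.na(x))   # TRUE counts as 1, so this is the number of NAs: 1
n_missing
```

The !is.na(x) part turns the NA comparison into FALSE, so the logical vector itself contains no NAs.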
summarizing by columns or rows
rowSums, rowMeans
colSums, colMeans
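A minimal sketch of the four functions on a made-up 2 x 3 matrix. All four also accept na.rm = TRUE to skip missing values.

```r
# Hypothetical toy matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

rowSums(m)    # 9 12
rowMeans(m)   # 3 4
colSums(m)    # 3 7 11
colMeans(m)   # 1.5 3.5 5.5
```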