Types of Variables

When we have a table of data, rows correspond to "observation units" (subjects, etc.) and columns correspond to "variables".

There are several types of variables:

  • Categorical Variables - values that fall into categories (not numerical)
  • Quantitative Variables - numerical values for which arithmetic operations make sense
  • Ordinal Variables - values with a natural order
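
The distinction matters mostly for which summaries are meaningful for each column. The sketch below illustrates this on a small made-up table; the column names (`blood_type`, `height_cm`, `satisfaction`) and the level order in `LEVELS` are assumptions chosen for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy table: each dict is one observation unit (row),
# each key is a variable (column).
rows = [
    {"blood_type": "A", "height_cm": 172.0, "satisfaction": "low"},
    {"blood_type": "O", "height_cm": 181.0, "satisfaction": "high"},
    {"blood_type": "A", "height_cm": 165.0, "satisfaction": "medium"},
    {"blood_type": "B", "height_cm": 158.0, "satisfaction": "medium"},
    {"blood_type": "O", "height_cm": 190.0, "satisfaction": "high"},
]

# Categorical: the values are labels, so counting is the natural summary.
blood_counts = Counter(r["blood_type"] for r in rows)

# Quantitative: arithmetic makes sense, so a mean is meaningful.
mean_height = mean(r["height_cm"] for r in rows)

# Ordinal: levels can be ranked, so sorting and a median level make sense,
# but differences between levels are not assumed to be equal.
LEVELS = ["low", "medium", "high"]  # assumed order, lowest first
rank = {level: i for i, level in enumerate(LEVELS)}
ranks = sorted(rank[r["satisfaction"]] for r in rows)
median_level = LEVELS[ranks[len(ranks) // 2]]  # middle element (odd row count)

print(blood_counts, mean_height, median_level)
```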

Problems with Variables

The data may also contain:

  • Outliers - values that are unusually large or small; some of them are errors, so we have to find an explanation for them
  • Missing values - values that are not present; they can bias the result
  • Noise - a modification of the original value
    • Looks like normal input, but it’s faulty
    • Very hard to detect
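
Missing values and outliers can at least be screened for mechanically. The sketch below counts missing entries and flags outliers with the common 1.5 × IQR rule. That rule is an assumption brought in for illustration, not a method named above, and the data are made up. Noise is not covered, since it looks like normal input.

```python
from statistics import quantiles

# Hypothetical measurements; None marks a missing value.
values = [12.1, 11.8, None, 12.4, 12.0, 95.0, 11.9, None, 12.3, 12.2]

# Missing values: count them and work with what is present.
present = [v for v in values if v is not None]
n_missing = len(values) - len(present)

# Outliers: flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, _, q3 = quantiles(present, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in present if v < low or v > high]

print(f"missing: {n_missing}, outliers: {outliers}")
```

A flagged value is only a candidate: whether it is an error or a genuine extreme observation still has to be explained case by case.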

Relationships

Types of variables in the analysis:

  • outcome - the variables of interest
  • explanatory - the variables used to analyze and explain the outcome

Types of Relationships

Possible relationships between the explanatory variables and the outcome:

  • independent: there is no association between the variables
  • association: the variables are dependent, but it’s not clear what kind of relationship there is
    • causes: changes in the explanatory variables cause the outcome to change
    • reverse causation: changes in the outcome cause the explanatory variables to change
    • coincidence: just pure chance
    • common cause: some other variable causes both the explanatory variables and the outcome to change - see Lurking Variables and Confounding Variables
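
The common-cause case can be illustrated with a small simulation. In the sketch below, a hidden variable `z` drives both `x` and `y`, which never influence each other, yet they end up strongly correlated. The data-generating model, the variable names, and the use of Pearson correlation (which captures only linear association) are all assumptions made for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

z = [rng.gauss(0, 1) for _ in range(n)]          # lurking variable
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]     # "explanatory", driven by z
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]     # "outcome", also driven by z
w = [rng.gauss(0, 1) for _ in range(n)]          # unrelated to everything

r_common = pearson(x, y)  # strong association without x causing y
r_indep = pearson(x, w)   # close to zero: independent variables

# Once z's contribution is subtracted, the association disappears.
r_resid = pearson([xi - zi for xi, zi in zip(x, z)],
                  [yi - zi for yi, zi in zip(y, z)])

print(f"x~y: {r_common:.2f}  x~w: {r_indep:.2f}  after removing z: {r_resid:.2f}")
```

In real data the lurking variable is usually not observed, so it cannot simply be subtracted as it is here; the simulation only shows why an association on its own does not establish causation.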

Multivariate Analysis

The following methods can be used to analyze relationships between variables:
