ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Bivariate Analysis


Bivariate analysis examines the relationship between two variables.

Recall the Types of Variables: a variable is either quantitative (numeric) or categorical.

So there can be the following combinations:

  • Quantitative vs Quantitative
  • Quantitative vs Categorical
  • Categorical vs Categorical

Independence

Typically the most interesting question is:

  • “Are these variables independent?”
  • if they are strongly dependent (for example, highly correlated), one of them carries little extra information and may be redundant
  • a redundant variable is a candidate for removal

Quantitative vs Quantitative

If two variables are numeric:
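
One common way to quantify a linear relationship between two numeric variables is the Pearson correlation coefficient. The sketch below is only an illustration: the function name `pearson_r` and the sample values are made up for this example. A value near 0 rules out only a linear relationship, not dependence in general.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample data
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 88]
r = pearson_r(height_cm, weight_kg)
print(f"Pearson r = {r:.3f}")
```

Values of r close to +1 or -1 suggest a strong linear relationship, which ties back to the redundancy question in the Independence section.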

Quantitative vs Categorical

If one is numeric, and another is categorical:
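
A simple first look at a numeric variable against a categorical one is to summarize the numeric values separately for each category and compare the summaries. The sketch below does this with the standard library only; `summarize_by_group` and the sample data are hypothetical. Differences between group summaries are only suggestive, and judging whether they are significant would need a formal test.

```python
from collections import defaultdict
from statistics import mean, median

def summarize_by_group(categories, values):
    """Return {category: (count, mean, median)} of the numeric values in each category."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    return {c: (len(v), mean(v), median(v)) for c, v in groups.items()}

# Hypothetical sample data
plan = ["free", "paid", "free", "paid", "free", "paid"]
minutes = [10, 30, 14, 42, 12, 36]
summary = summarize_by_group(plan, minutes)
for category, (count, avg, med) in summary.items():
    print(category, count, avg, med)
```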

Categorical vs Categorical

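To compare two categorical variables:

  • start by building a Contingency Table that shows the frequencies (or relative frequencies) of each combination of values
    • Marginal distribution: the distribution of only one of the variables in a contingency table (the row or column totals)
    • Conditional distribution: the distribution of one variable among the observations that share a fixed value of the other variable
    • if the variables are independent, every conditional distribution is roughly equal to the marginal distribution, so comparing them in this table already hints at whether the two variables are associated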
  • run some Tests of Independence:
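
The steps above can be sketched in plain Python. This is an illustration under assumptions, not a reference implementation: the helper names and the sample data are made up, and the chi-squared statistic is used here as one possible test of independence. To reach a decision, the statistic would be compared against a chi-squared distribution with the returned degrees of freedom.

```python
from collections import Counter

def contingency_table(xs, ys):
    """Count co-occurrences: returns (row_labels, col_labels, counts[row][col])."""
    pair_counts = Counter(zip(xs, ys))
    rows = sorted(set(xs))
    cols = sorted(set(ys))
    counts = [[pair_counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, counts

def marginals(counts):
    """Marginal distributions of the row variable and of the column variable."""
    n = sum(map(sum, counts))
    row_marginal = [sum(row) / n for row in counts]
    col_marginal = [sum(col) / n for col in zip(*counts)]
    return row_marginal, col_marginal

def conditional_on_rows(counts):
    """Distribution of the column variable within each fixed row value."""
    return [[cell / sum(row) for cell in row] for row in counts]

def chi_squared(counts):
    """Pearson chi-squared statistic and degrees of freedom for the table."""
    n = sum(map(sum, counts))
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical sample data: 100 observations of two categorical variables
group = ["a"] * 40 + ["b"] * 60
outcome = ["x"] * 30 + ["y"] * 10 + ["x"] * 20 + ["y"] * 40

rows, cols, counts = contingency_table(group, outcome)
row_m, col_m = marginals(counts)
cond = conditional_on_rows(counts)
stat, dof = chi_squared(counts)
print(rows, cols, counts)
print("marginals:", row_m, col_m)
print("conditional on group:", cond)
print("chi-squared:", round(stat, 3), "dof:", dof)
```

In this made-up data the conditional distributions of `outcome` differ clearly between the two groups and from the column marginal, which is the pattern one would expect when the variables are not independent.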
