Cost Matrix

Used to compare classifiers when different kinds of misclassification carry different costs

  • A cost matrix is a matrix of the following form:

                      $y = +$        $y = -$
$h_\theta(x) = +$     $C(+ | +)$     $C(+ | -)$
$h_\theta(x) = -$     $C(- | +)$     $C(- | -)$


In the general case:

  • $C(i | j)$ is the cost of classifying an example of class $j$ as class $i$
  • this way we can express that some mispredictions are very costly
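
One way to turn such a matrix into a single score, consistent with the worked example on this page, is to weight each cell of the confusion matrix by its cost and sum the results. The count notation $N(i | j)$ (the number of examples of class $j$ that the classifier $h$ labels as class $i$) is introduced here only for illustration and is not part of the original notation:

$$\text{cost}(h) = \sum_{i} \sum_{j} C(i | j) \cdot N(i | j)$$

For the two-class case this expands to four terms, one per cell of the matrix.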


Example

                      $y = +$              $y = -$
$h_\theta(x) = +$     $C(+ | +) = -1$      $C(+ | -) = 1$
$h_\theta(x) = -$     $C(- | +) = 100$     $C(- | -) = 0$
  • we put $C(- | +) = 100$ because in this example false negatives are very costly
  • the negative cost $C(+ | +) = -1$ acts as a reward for a correctly detected positive

Now assume we are comparing two classifiers, $C_1$ and $C_2$, with the following confusion matrices

Stats of $C_1$:

                      $y = +$     $y = -$
$h_{C_1}(x) = +$      150         60
$h_{C_1}(x) = -$      40          250
  • $\text{acc}(C_1) = \cfrac{150+250}{150+40+60+250} = 80\%$
  • $\text{cost}(C_1) = -1 \cdot 150 + 1 \cdot 60 + 100 \cdot 40 + 0 \cdot 250 = 3910$
Stats of $C_2$:

                      $y = +$     $y = -$
$h_{C_2}(x) = +$      250         5
$h_{C_2}(x) = -$      45          200
  • $\text{acc}(C_2) = \cfrac{250+200}{250+45+5+200} = 90\%$
  • $\text{cost}(C_2) = -1 \cdot 250 + 1 \cdot 5 + 100 \cdot 45 + 0 \cdot 200 = 4255$

We select $C_1$

  • because $C_1$ has lower cost: $\text{cost}(C_1) < \text{cost}(C_2)$
  • even though $C_2$ has better accuracy: $\text{acc}(C_2) > \text{acc}(C_1)$
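
The comparison above can be reproduced with a short Python sketch. It assumes that rows are the predicted class and columns are the actual class, in the order (+, -), matching the tables on this page. The names `accuracy`, `total_cost`, `cost_matrix`, `confusion_c1` and `confusion_c2` are illustrative choices, not part of the original notes.

```python
# Rows: predicted class (+, -); columns: actual class (+, -).
# Entry [i][j] of cost_matrix is C(i | j) from the example.
cost_matrix = [[-1, 1],
               [100, 0]]

# Confusion matrices of the two classifiers, same layout as cost_matrix.
confusion_c1 = [[150, 60],
                [40, 250]]
confusion_c2 = [[250, 5],
                [45, 200]]


def accuracy(confusion):
    """Fraction of examples on the diagonal (correct predictions)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def total_cost(confusion, costs):
    """Sum of each cell count multiplied by the cost of that cell."""
    return sum(
        count * cost
        for count_row, cost_row in zip(confusion, costs)
        for count, cost in zip(count_row, cost_row)
    )


for name, confusion in [("C1", confusion_c1), ("C2", confusion_c2)]:
    print(name, accuracy(confusion), total_cost(confusion, cost_matrix))
```

Under these assumptions the loop reports accuracy 0.8 and cost 3910 for $C_1$, and accuracy 0.9 and cost 4255 for $C_2$, so choosing the classifier with the smaller `total_cost` picks $C_1$.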

