Line 17: | Line 17: | ||

== Motivating Example == | == Motivating Example == | ||

− | * Let | + | * Let <math>A = \begin{bmatrix} |

2 & 6 \\ | 2 & 6 \\ | ||

6 & 18 \\ | 6 & 18 \\ | ||

− | \end{bmatrix} | + | \end{bmatrix}</math> |

* then for any $\mathbf x = (x_1, x_2)$ we want to check | * then for any $\mathbf x = (x_1, x_2)$ we want to check | ||

− | * | + | * <math>\big[x_1 \ x_2 \big] \begin{bmatrix} |

2 & 6 \\ | 2 & 6 \\ | ||

6 & 18 \\ | 6 & 18 \\ | ||

Line 28: | Line 28: | ||

x_1 \\ | x_1 \\ | ||

x_2 \\ | x_2 \\ | ||

− | \end{bmatrix} = 2 \, x_1^2 + 12 \, x_1 x_2 + 18 \, x_2^2 | + | \end{bmatrix} = 2 \, x_1^2 + 12 \, x_1 x_2 + 18 \, x_2^2</math> |

* note that this is not a linear anymore: | * note that this is not a linear anymore: | ||

* we have an equation $a x_1^2 + 2b \, x_1 x_2 + c \, x_2^2$ | * we have an equation $a x_1^2 + 2b \, x_1 x_2 + c \, x_2^2$ | ||

Line 39: | Line 39: | ||

Another example | Another example | ||

− | * let | + | * let <math>A_1 = \begin{bmatrix} |

2 & 6 \\ | 2 & 6 \\ | ||

6 & 7 \\ | 6 & 7 \\ | ||

− | \end{bmatrix} | + | \end{bmatrix}</math> |

* then $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$ | * then $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$ | ||

* there exists $\mathbf x$ such that $f(\mathbf x) < 0$, e.g. $(1, -1)$ | * there exists $\mathbf x$ such that $f(\mathbf x) < 0$, e.g. $(1, -1)$ | ||

Line 50: | Line 50: | ||

Consider an alternative: | Consider an alternative: | ||

− | * | + | * <math>A_2 = \begin{bmatrix} |

2 & 6 \\ | 2 & 6 \\ | ||

6 & 20 \\ | 6 & 20 \\ | ||

− | \end{bmatrix} | + | \end{bmatrix}</math> |

* $f(\mathbf x) = \mathbf x^T A_2 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$ | * $f(\mathbf x) = \mathbf x^T A_2 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$ | ||

* here squares always overwhelm $12 x_1 x_2$ | * here squares always overwhelm $12 x_1 x_2$ | ||

Line 89: | Line 89: | ||

* the numbers in the completed square form come from [[Gaussian Elimination]]! | * the numbers in the completed square form come from [[Gaussian Elimination]]! | ||

* Let's do $A = LU$ transformation: | * Let's do $A = LU$ transformation: | ||

− | ** | + | ** <math>L = \begin{bmatrix} |

1 & 0 \\ | 1 & 0 \\ | ||

3 & 1 \\ | 3 & 1 \\ | ||

Line 95: | Line 95: | ||

\boxed 2 & 6 \\ | \boxed 2 & 6 \\ | ||

0 & \boxed 2 \\ | 0 & \boxed 2 \\ | ||

− | \end{bmatrix} | + | \end{bmatrix}</math> |

* multipliers before squares come from pivots of $U$: | * multipliers before squares come from pivots of $U$: | ||

** $\boxed{2} \, (x_1 + 3 \, x_2)^2 + \boxed 2 \, x_2^2$ | ** $\boxed{2} \, (x_1 + 3 \, x_2)^2 + \boxed 2 \, x_2^2$ | ||

Line 105: | Line 105: | ||

=== Derivative Matrix === | === Derivative Matrix === | ||

So a matrix of second derivatives ([[Hessian Matrix]]) is | So a matrix of second derivatives ([[Hessian Matrix]]) is | ||

− | * | + | * <math>\begin{bmatrix} |

\cfrac{\partial x_1^2}{\partial^2 x_1} & \cfrac{\partial x_1 \partial x_2}{\partial x_1 \partial x_2} \\ | \cfrac{\partial x_1^2}{\partial^2 x_1} & \cfrac{\partial x_1 \partial x_2}{\partial x_1 \partial x_2} \\ | ||

\cfrac{\partial x_2 \partial x_1}{\partial x_2 \partial x_1} & \cfrac{\partial x_2^2}{\partial^2 x_2} \\ | \cfrac{\partial x_2 \partial x_1}{\partial x_2 \partial x_1} & \cfrac{\partial x_2^2}{\partial^2 x_2} \\ | ||

− | \end{bmatrix} | + | \end{bmatrix}</math> |

* we want it to be positive-definite | * we want it to be positive-definite | ||

* then the function $f(\mathbf x) = \mathbf x^T A \, \mathbf x$ is positive-definite | * then the function $f(\mathbf x) = \mathbf x^T A \, \mathbf x$ is positive-definite | ||

Line 145: | Line 145: | ||

* so eigenvalues are also positive | * so eigenvalues are also positive | ||

* but careful with semi-positive definite matrices: they do not have an inverse! | * but careful with semi-positive definite matrices: they do not have an inverse! | ||

− | |||

Line 151: | Line 150: | ||

They are always semi-positive definite | They are always semi-positive definite | ||

* see [[Gram Matrices]] | * see [[Gram Matrices]] | ||

− | |||

− | |||

In Linear Algebra, a matrix an $n \times n$ matrix is Positive-definite matrix (PDM) if

- $\mathbf v^T A \mathbf v > 0$ for all $\mathbf v \in \mathbb R^n$
- This is the energy based definition

Why *energy*?

- because $\mathbf v^T A \mathbf v$ or $\frac{1}{2} \mathbf v^T A \mathbf v$ is called the
*energy*of the system $A$

- A matrix is semi-positive definite if
- $\mathbf v^T A \mathbf v \geqslant 0$ for all $\mathbf v \ne \mathbf 0 \in \mathbb R^n$
- so some eigenvectors can be 0

- Let [math]A = \begin{bmatrix} 2 & 6 \\ 6 & 18 \\ \end{bmatrix}[/math]
- then for any $\mathbf x = (x_1, x_2)$ we want to check
- [math]\big[x_1 \ x_2 \big] \begin{bmatrix} 2 & 6 \\ 6 & 18 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = 2 \, x_1^2 + 12 \, x_1 x_2 + 18 \, x_2^2[/math]
- note that this is not a linear anymore:
- we have an equation $a x_1^2 + 2b \, x_1 x_2 + c \, x_2^2$
- this is a Quadratic Form
- we want to know if this quantity is always positive or not
- are there such $x_1, x_2$ that $a x_1^2 + 2b \, x_1 x_2 + c \, x_2^2 < 0$?

So we have a function $f(\mathbf x) = \mathbf x^T A \, \mathbf x$ and we want to check if it's always positive for any $\mathbf x$

Another example

- let [math]A_1 = \begin{bmatrix} 2 & 6 \\ 6 & 7 \\ \end{bmatrix}[/math]
- then $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$
- there exists $\mathbf x$ such that $f(\mathbf x) < 0$, e.g. $(1, -1)$
- in this system, there's a Saddle Point - a max for one direction and min for another

Consider an alternative:

- [math]A_2 = \begin{bmatrix} 2 & 6 \\ 6 & 20 \\ \end{bmatrix}[/math]
- $f(\mathbf x) = \mathbf x^T A_2 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$
- here squares always overwhelm $12 x_1 x_2$

We say that $A_1$ is *indefinite*, and $A_2$ is *positive-definite*

Source: [1]

Recall from Calculus:

- 1st Derivative is needed for finding extremum, but you don't know if it's min or max
- so you have to look for the 2nd derivative to learn if it's positive or negative
- you want to find $\cfrac{du}{dx} = 0$ and $\cfrac{d^2 \, u}{d \, x^2} > 0$

Consider $A_2$ again:

- $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$
- Let's complete the square: $2 \, (x_1 + 3 \, x_2)^2 + 2 \, x_2^2$
- now it's easy to see that this function is indeed always positive: we completed the square and there are no negative terms

What about $A_1$?

- $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$
- let's try to complete the square: $2 \, (x_1 + 3 \, x_2)^2 - 11 \, x_2^2$
- we have a minus!

Let's have a look again at $A_2$:

- $f(\mathbf x) = \mathbf x^T A_1 \, \mathbf x = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2 = 2 \, (x_1 + 3 \, x_2)^2 + 2 \, x_2^2$
- the numbers in the completed square form come from Gaussian Elimination!
- Let's do $A = LU$ transformation:
- [math]L = \begin{bmatrix} 1 & 0 \\ 3 & 1 \\ \end{bmatrix}, U = \begin{bmatrix} \boxed 2 & 6 \\ 0 & \boxed 2 \\ \end{bmatrix}[/math]

- multipliers before squares come from pivots of $U$:
- $\boxed{2} \, (x_1 + 3 \, x_2)^2 + \boxed 2 \, x_2^2$

- coefficients inside each square come from $L$
- $2 \, (1 \, x_1 + 3 \, x_2)^2 + 2 \, (0\, x_1 + 1 \, x_2)^2$

- so positive pivots of $U$ are good

So a matrix of second derivatives (Hessian Matrix) is

- [math]\begin{bmatrix} \cfrac{\partial x_1^2}{\partial^2 x_1} & \cfrac{\partial x_1 \partial x_2}{\partial x_1 \partial x_2} \\ \cfrac{\partial x_2 \partial x_1}{\partial x_2 \partial x_1} & \cfrac{\partial x_2^2}{\partial^2 x_2} \\ \end{bmatrix}[/math]
- we want it to be positive-definite
- then the function $f(\mathbf x) = \mathbf x^T A \, \mathbf x$ is positive-definite

So, how to check for positive definitiveness?

- using the definition: check that $\mathbf v^T A \mathbf v > 0$ for all $\mathbf v$
- check that all eigenvalues are positive
- or that all pivots of $L$ in $A = LU$ are positive
- or that all Determinants and sub-determinants are positive

Checking using positiveness of eigenvalues:

- if for all $\mathbf v$, $\mathbf v^T A \, \mathbf v > 0$,
- $A \mathbf v = \lambda \mathbf v$, multiply by $\mathbf v^T$ on the left
- $\mathbf v^T A \, \mathbf v = \lambda \mathbf v^T \mathbf v$
- $\mathbf v^T A \, \mathbf v = \lambda \| \mathbf v \|^2$
- $\| \mathbf v \|^2$ is always positive, so it means that if $\lambda > 0$, then so is $\mathbf v^T A \, \mathbf v$
- therefore we can check if all eigenvalues are positive

If $A$ and $B$ are both PDM

- then so is $A + B$
- because the energies add when we add matrices:
- $\mathbf v^T A \, \mathbf v + \mathbf v^T B \, \mathbf v$

- if $A$ is PDM, the inverse is also PDM
- eigenvalues of the inverse are $\lambda^*_i = \frac{1}{\lambda_i}$
- so eigenvalues are also positive
- but careful with semi-positive definite matrices: they do not have an inverse!

They are always semi-positive definite

- see Gram Matrices

- Linear Algebra MIT 18.06 (OCW)
- Jauregui, Jeff. "Principal component analysis with linear algebra." (2012). [2]