
== $LU$ Decomposition ==

This is the simplest matrix decomposition; it can be seen as a by-product of Gaussian Elimination.

- The decomposition is $LU = A$ where
- $A$ is a square matrix
- $L$ and $U$ are Triangular Matrices, $L$ is lower-triangular and $U$ is upper-triangular

When we perform Gaussian Elimination, we can see each step as multiplication by an elimination matrix $E_i$.

- $E_1 \cdots E_k A = U$, where $E_1 \cdots E_k$ are elimination matrices for each elimination step
- Let $E = E_1 \cdots E_k$

=== $LU$ ===

The $U$ part:

- $E_1 \cdots E_k \ A = U$
- or $EA = U$
- let $L = E^{-1}$, so we have $A = LU$
- $U$ is upper-triangular by construction - because we eliminate all elements below the main diagonal
- $L$ is lower-triangular

The $L$ part:

- $L = E^{-1} = (E_1 \cdots E_k)^{-1} = E_k^{-1} \cdots E_1^{-1}$
- each $E_i$ is lower-triangular (zeros above the diagonal) with ones on the diagonal, so when we invert them, the inverses are also lower-triangular
- when we multiply a bunch of lower-triangular matrices, we get a lower-triangular matrix

Matrix Form:

- how do we represent the zeroing process with a sequence of elimination matrices $E_i$?
- want some matrix that behaves in the following way:
- [math]\begin{bmatrix} 1 & 0 \\ -\tau & 1 \\ \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \end{bmatrix} = \begin{bmatrix} v_1 \\ 0 \\ \end{bmatrix} [/math] with $\tau = v_2 / v_1$, so the second entry becomes $v_2 - \tau v_1 = 0$

- so let $E_k = I - \tau e_k^T$, where $e_k$ has 1 in the $k$th position and zeros elsewhere, and $\tau$ is a vector of multipliers (made precise below)
- for larger matrices:
- [math]E_k v = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0 \\ \hline 0 & 0 & \cdots & -\tau_{k+1} & 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & -\tau_{k+2} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & -\tau_{n} & 0 & 0 & \cdots & 1 \\ \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_k \\ \hline v_{k+1} \\ v_{k+2} \\ \vdots \\ v_{n} \\ \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_k \\ \hline 0 \\ 0 \\ \vdots \\ 0 \\ \end{bmatrix}[/math]
- so let $\tau^{(k)} = [\underbrace{0, \dots, 0}_{k}, \tau_{k+1}, \tau_{k+2}, \dots, \tau_{n}]^T$, where $\tau_i = v_i / v_k$
- and $E_k = I - \tau^{(k)} e_k^T$
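
A minimal numpy sketch of this zeroing step (0-based indexing, so the math's $e_k$ becomes index `k` of a Python array; `elimination_matrix` is just an illustrative name):

```python
import numpy as np

def elimination_matrix(v, k):
    """Build E_k = I - tau^(k) e_k^T, which zeroes v[k+1:]."""
    n = len(v)
    tau = np.zeros(n)
    tau[k + 1:] = v[k + 1:] / v[k]          # multipliers tau_i = v_i / v_k
    e_k = np.zeros(n)
    e_k[k] = 1.0
    return np.eye(n) - np.outer(tau, e_k)   # E_k = I - tau^(k) e_k^T

v = np.array([2.0, 4.0, 6.0])
E0 = elimination_matrix(v, 0)
print(E0 @ v)                               # [2. 0. 0.]
```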

Computing $L$:

- $E_k^{-1} = I + \tau^{(k)} e_k^T$: since the $k$th entry of $\tau^{(k)}$ is zero, $e_k^T \tau^{(k)} = 0$ and so $(I - \tau^{(k)} e_k^T)(I + \tau^{(k)} e_k^T) = I$
- $L = E_1^{-1} E_2^{-1} \cdots E_{n-1}^{-1} = I + \sum_k \tau^{(k)} e_k^T$ - the cross terms vanish because $e_j^T \tau^{(k)} = 0$ for $j \le k$
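
A sketch of the whole factorization, assuming no pivoting is needed (all pivots nonzero): the loop stores each $\tau^{(k)}$ directly into column $k$ of $L$, which is exactly the $I + \sum_k \tau^{(k)} e_k^T$ formula:

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without pivoting: A = L @ U, L unit-triangular."""
    n = A.shape[0]
    U = A.astype(float)                   # astype copies, so A is untouched
    L = np.eye(n)
    for k in range(n - 1):
        tau = U[k + 1:, k] / U[k, k]      # multipliers for column k
        L[k + 1:, k] = tau                # tau^(k) goes into column k of L
        U[k + 1:] -= np.outer(tau, U[k])  # apply E_k: zero below the pivot
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))              # True
```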

Types of $LU$ decomposition:

- It's possible to have ones on the main diagonal of either $L$ or $U$
- Doolittle decomposition - the diagonal entries of $L$ are ones, so $L$ is Unit-Triangular
- Crout decomposition - the diagonal entries of $U$ are ones, so $U$ is Unit-Triangular

=== Properties ===

- $\det(A) = \det(LU) = \det(L) \det(U) = \det(U) = u_{11} u_{22} \cdots u_{nn}$ - here $\det(L) = 1$ because in the Doolittle form $L$ is unit-triangular
- so we can use the $A = LU$ decomposition to compute the determinant
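
A quick check with SciPy, whose `scipy.linalg.lu` returns $P, L, U$ with $A = PLU$; with pivoting the extra factor $\det(P) = \pm 1$ has to be included:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 2.0], [3.0, 4.0]])
P, L, U = lu(A)                                 # A = P @ L @ U
det_A = np.linalg.det(P) * np.prod(np.diag(U))  # det(P) is +1 or -1
print(det_A, np.linalg.det(A))                  # both -6.0
```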

=== Pivoting and $PLU$ Decomposition ===

- During elimination we can permute rows
- We can express a row permutation with a permutation matrix $P$
- Then $PA$ is $A$ with permuted rows, so we have $E \, (PA) = U$
- So the decomposition becomes $PA = LU$

Why permute?

- To avoid dividing by small pivots: small pivots produce huge multipliers, which amplify round-off errors
- Suppose [math]A = \begin{bmatrix} 0.0001 & 1 \\ 1 & 1 \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 10000 & 1 \\ \end{bmatrix} \begin{bmatrix} 0.0001 & 1 \\ 0 & -9999 \\ \end{bmatrix} = LU [/math] - note the huge multiplier $10000$
- if we pivot, then we have:
- [math]PA = \begin{bmatrix} 1 & 1 \\ 0.0001 & 1 \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0.0001 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 0.9999 \\ \end{bmatrix} = LU [/math]
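
A sketch of the failure mode in single precision; the pivot here is $10^{-8}$ rather than $0.0001$ so that the damage is clearly visible in float32:

```python
import numpy as np

A = np.array([[1e-8, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])

# the no-pivoting factors, stored in single precision
L = np.array([[1.0, 0.0], [1e8, 1.0]], dtype=np.float32)
U = np.array([[1e-8, 1.0], [0.0, 1.0 - 1e8]], dtype=np.float32)
y = np.linalg.solve(L, b.astype(np.float32))
x_bad = np.linalg.solve(U, y)    # roughly [0, 1] - catastrophically wrong

x_good = np.linalg.solve(A, b)   # LAPACK pivots internally: roughly [1, 1]
print(x_bad, x_good)
```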

=== Diagonalization: $LDU$ and $LDL^T$ Decomposition ===

We can go further:

- and obtain factorization $A = LU = LDU^*$, where $D$ is diagonal
- I.e. we factorize $U = DU^*$
- this way both $L$ and $U^*$ have ones on their diagonals: if $L$ already had ones and $U$ didn't, the diagonal of $U$ moves into $D$
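
A minimal sketch of the $U = D U^*$ split (assuming all pivots are nonzero):

```python
import numpy as np

U = np.array([[2.0, 1.0], [0.0, 3.0]])
D = np.diag(np.diag(U))           # pull out the diagonal
Ustar = U / np.diag(U)[:, None]   # divide each row by its pivot
print(np.allclose(D @ Ustar, U))  # True: U = D @ Ustar, Ustar unit-triangular
```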

Why?

- Suppose that $A$ is Symmetric
- Then row pivoting would destroy this nice symmetric structure
- So instead we should do $LDL^T$ decomposition
- Or something else, e.g. Cholesky Decomposition
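
For the symmetric case SciPy provides `scipy.linalg.ldl`; a minimal sketch:

```python
import numpy as np
from scipy.linalg import ldl

A = np.array([[4.0, 2.0], [2.0, 3.0]])  # symmetric
L, D, perm = ldl(A)                     # factors satisfy A = L @ D @ L.T
print(np.allclose(L @ D @ L.T, A))      # True
```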

== Solving $A \mathbf x = \mathbf b$ with LU Decomposition ==

- Can do that with Gaussian Elimination via LU Decomposition
- Let $LU = A$
- then $A \mathbf x = \mathbf b$ becomes $LU \mathbf x = \mathbf b$
- Now let $U \mathbf x = \mathbf y$, so that $L \mathbf y = \mathbf b$
- With this substitution we need to solve two equations now
- Solve $L \mathbf y = \mathbf b$ to get $\mathbf y$
- Solve $U \mathbf x = \mathbf y$ to get $\mathbf x$

- These two equations are simpler than the original one because:
- $L$ is lower-triangular, and $L \mathbf y = \mathbf b$ can be solved with "Forward Substitution", and
- $U$ is upper-triangular, and $U \mathbf x = \mathbf y$ can be solved with "Back Substitution"
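
In SciPy this pair of triangular solves is packaged as `lu_factor` + `lu_solve`; a minimal sketch:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0], [4.0, 3.0]])
b = np.array([3.0, 7.0])

lu, piv = lu_factor(A)      # one factorization: PA = LU
x = lu_solve((lu, piv), b)  # forward substitution, then back substitution
print(x)                    # [1. 1.]
```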

First, we solve $L \mathbf x = \mathbf b$ (using $\mathbf x$ for the unknown of each triangular solve):

$2 \times 2$ case:

- [math]\begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \end{bmatrix} [/math]
- $x_1 = b_1 / l_{11}$
- $x_2 = (b_2 - l_{21} x_1) / l_{22}$

General procedure:

- $x_i = (b_i - \sum_{j=1}^{i - 1} l_{ij} x_j) / l_{ii}$
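
A direct implementation of this formula (assuming nonzero diagonal entries; `forward_substitution` is just an illustrative name):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lower-triangular L, row by row from the top."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # x_i = (b_i - sum_{j<i} l_ij x_j) / l_ii
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0], [1.0, 3.0]])
print(forward_substitution(L, np.array([4.0, 8.0])))  # [2. 2.]
```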

Then we solve $U \mathbf x = \mathbf b$

$2 \times 2$ case:

- [math]\begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \end{bmatrix} [/math]
- $x_2 = b_2 / u_{22}$
- $x_1 = (b_1 - u_{12} x_2) / u_{11}$

General procedure:

- $x_i = (b_i - \sum_{j=i+1}^{n} u_{ij} x_j) / u_{ii}$
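
And the corresponding sketch for back substitution, which runs from the bottom row up:

```python
import numpy as np

def back_substitution(U, b):
    """Solve U x = b for upper-triangular U, row by row from the bottom."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # x_i = (b_i - sum_{j>i} u_ij x_j) / u_ii
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0], [0.0, 3.0]])
print(back_substitution(U, np.array([5.0, 9.0])))  # [1. 3.]
```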

To find the matrix inverse, we need to solve the equation $AX = I$, where $X = A^{-1}$

- Let's decompose $A = LU$, so we have: $LUX = I$
- Now let $UX = Y$ and thus $LY = I$
- This gives us two systems:
- first solve $LY = I$ - it's easy because $L$ is lower-triangular, so we just do forward substitution for each column of $I$ and obtain $Y$
- then we solve $UX = Y$ - it's also easy because $U$ is upper-triangular

- after that we get $X$ which is our inverse
- if we need pivoting, then:
- we solve $PAX = PI = P$
- since $PA = LU$, we have $LUX = P$
- then we solve $LY = P$ and $UX = Y$
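
A sketch with SciPy: `lu_solve` accepts a matrix right-hand side, so solving against $I$ produces all columns of $X$ at once:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0], [4.0, 3.0]])
lu, piv = lu_factor(A)
X = lu_solve((lu, piv), np.eye(2))       # solve AX = I column by column
print(np.allclose(X, np.linalg.inv(A)))  # True
```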

== Usages ==

In many cases expensive operations (such as matrix powers or the matrix inverse) can be made cheaper with LU Decomposition.

=== Solving $A^k \mathbf x = \mathbf b$ ===

Suppose we want to solve $A^k \mathbf x = \mathbf b$:

- one option is to first compute $A^k$ and then solve the system - but forming $A^k$ is expensive
- or can use recursion
- $A^{k} \mathbf x = \mathbf b$
- let $\mathbf y = A^{k-1} \mathbf x$
- so we have $A \mathbf y = \mathbf b$
- solve it, get $\mathbf y$
- now need to solve $A^{k-1} \mathbf x = \mathbf y$
- let $\mathbf b = \mathbf y$ and solve $A^{k-1} \mathbf x = \mathbf b$
- repeat

So, the algorithm:

- let $PA = LU$
- for $j = 1, ..., k$, do
- solve $L \mathbf y = P \mathbf b$, store $\mathbf y$
- solve $U \mathbf x = \mathbf y$, store $\mathbf x$
- let $\mathbf b = \mathbf x$
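
A sketch of this loop: the factorization happens once, and only the cheap triangular solves are repeated:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_power(A, b, k):
    """Solve A^k x = b with one LU factorization and k triangular-solve pairs."""
    lu_piv = lu_factor(A)        # PA = LU, computed once
    x = b
    for _ in range(k):
        x = lu_solve(lu_piv, x)  # peel off one factor of A per iteration
    return x

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 9.0])
print(solve_power(A, b, 2))      # [1. 1.], since A @ A @ [1, 1] = [9, 9]
```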

=== Solving $S = C^T A^{-1} \mathbf d$ ===

Want to compute $S = C^T A^{-1} \mathbf d$:

- can avoid computing the inverse of $A$
- let $PA = LU$
- solve $Ly = Pd$ and then $Ux = y$
- then compute $S = C^T x$

In general, if you see an inverse applied to a vector, you don't need to compute the inverse:

- let $A^{-1} d = x$
- so $Ax = d$
- solve that with $LU$, get $x$
- compute $C^T x$
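
A minimal sketch of this pattern ($A$, $C$, $\mathbf d$ here are just example data):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 0.0], [0.0, 4.0]])
C = np.array([[1.0, 2.0], [3.0, 4.0]])
d = np.array([2.0, 4.0])

x = lu_solve(lu_factor(A), d)         # x = A^{-1} d, no inverse formed
S = C.T @ x                           # S = C^T A^{-1} d
print(S, C.T @ np.linalg.inv(A) @ d)  # same result: [4. 6.]
```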