== $LU$ Decomposition ==

This is the simplest matrix decomposition and can be seen as a by-product of [[Gaussian Elimination]]

* The decomposition is $LU = A$ where
* $A$ is a square matrix
* $L$ and $U$ are [[Triangular Matrices]]: $L$ is lower-triangular and $U$ is upper-triangular

When we perform [[Gaussian Elimination]], we can see each step as multiplication by an elimination matrix $E_i$

* $E_k \cdots E_1 A = U$, where $E_1, \dots, E_k$ are the elimination matrices, applied in order ($E_1$ first, so it stands closest to $A$)
* Let $E = E_k \cdots E_1$


=== $LU$ ===

The $U$ part

* $E_k \cdots E_1 \, A = U$
* or $EA = U$
* let $L = E^{-1}$, so we have $A = LU$
* $U$ is upper-triangular by construction, because we eliminate all elements below the main diagonal
* $L$ is lower-triangular

The $L$ part

* $L = E^{-1} = (E_k \cdots E_1)^{-1} = E_1^{-1} \cdots E_k^{-1}$
* each $E_i$ is lower-triangular with ones on the diagonal, and the inverse of such a matrix is again unit lower-triangular
* the product of lower-triangular matrices is lower-triangular, so $L$ is lower-triangular (see the sketch below)
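To make this concrete, here is a minimal NumPy sketch of the construction (assumptions: no pivoting is needed and all pivots are nonzero; <code>lu_no_pivot</code> is an illustrative name, not a library routine):

<syntaxhighlight lang="python">
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without pivoting: A = L @ U with L unit lower-triangular.

    Assumes every pivot U[k, k] is nonzero."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        tau = U[k + 1:, k] / U[k, k]             # multipliers tau^(k)
        L[k + 1:, k] = tau                       # L = I + sum_k tau^(k) e_k^T
        U[k + 1:, :] -= np.outer(tau, U[k, :])   # apply E_k: zero column k below the pivot
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)
</syntaxhighlight>

Storing the multipliers in the strictly lower part of $L$ is exactly the formula $L = I + \sum_k \tau^{(k)} e_k^T$ derived below.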


Matrix Form

* how do we represent the zeroing process with a sequence of elimination matrices $E_i$?
* we want a matrix that behaves in the following way (here $\tau = v_2 / v_1$):
** <math>\begin{bmatrix}
1 & 0 \\
-\tau & 1 \\
\end{bmatrix}
\begin{bmatrix}
v_1 \\
v_2 \\
\end{bmatrix}
=
\begin{bmatrix}
v_1 \\
0 \\
\end{bmatrix}</math>
* so let $E_k = I - \tau^{(k)} e_k^T$, where $e_k$ has a 1 in the $k$th position and zeros elsewhere, and $\tau^{(k)}$ is the vector of multipliers defined below
* for larger matrices:
* <math>E_k v = \begin{bmatrix}
1 & 0 &  \cdots & 0 & 0 & 0 &  \cdots & 0 \\
0 & 1 &  \cdots & 0 & 0 & 0 &  \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots &  \ddots & \vdots \\
0 & 0 &  \cdots & 1 & 0 & 0 & \cdots & 0 \\
\hline
0 & 0 &  \cdots & -\tau_{k+1} & 1 & 0 & \cdots & 0 \\
0 & 0 &  \cdots & -\tau_{k+2} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots &  \ddots & \vdots & \vdots & \vdots &  \ddots & \vdots \\
0 & 0 &  \cdots & -\tau_{n} &  0 & 0 & \cdots & 1 \\
\end{bmatrix}
\begin{bmatrix}
v_1 \\
v_2 \\
\vdots \\
v_k \\
\hline
v_{k+1} \\
v_{k+2} \\
\vdots \\
v_{n} \\
\end{bmatrix}
=
\begin{bmatrix}
v_1 \\
v_2 \\
\vdots \\
v_k \\
\hline
0 \\
0 \\
\vdots \\
0 \\
\end{bmatrix}</math>
* so let $\tau^{(k)} = [0, \dots, 0 \mid \tau_{k+1}, \tau_{k+2}, \dots, \tau_{n}]^T$ with $\tau_i = v_i / v_k$ for $i > k$
* and $E_k = I - \tau^{(k)} e_k^T$ (a short numerical check follows)
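A short numerical check of this definition (note: NumPy indexing is 0-based, so pivot position <code>k</code> below is the $(k+1)$th entry; <code>elimination_matrix</code> is an illustrative name):

<syntaxhighlight lang="python">
import numpy as np

def elimination_matrix(v, k):
    """Build E_k = I - tau^(k) e_k^T, which zeroes v below position k."""
    n = len(v)
    tau = np.zeros(n)
    tau[k + 1:] = v[k + 1:] / v[k]   # multipliers below the pivot
    e_k = np.zeros(n)
    e_k[k] = 1.0
    return np.eye(n) - np.outer(tau, e_k)

v = np.array([2.0, 4.0, 6.0])
E1 = elimination_matrix(v, 0)
print(E1 @ v)   # [2. 0. 0.]
</syntaxhighlight>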


Computing $L$:

* $L = E_1^{-1} E_2^{-1} \cdots E_{n-1}^{-1} = I + \sum_k \tau^{(k)} e_k^T$
* this holds because $E_k^{-1} = I + \tau^{(k)} e_k^T$, and all cross terms in the product vanish: $e_j^T \tau^{(k)} = 0$ for $j \le k$


Types of $LU$ decomposition:

* It's possible to have ones on the main diagonal of either $L$ or $U$
* Doolittle decomposition: the diagonal entries of $L$ are ones, so $L$ is unit-triangular
* Crout decomposition: the diagonal entries of $U$ are ones, so $U$ is unit-triangular


=== Properties ===

* $\det(A) = \det(LU) = \det(L) \det(U) = \det(U) = u_{11} u_{22} \cdots u_{nn}$ (for the Doolittle form, where $\det(L) = 1$)
* so we can use the $A = LU$ decomposition for computing the determinant (see the sketch below)
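A quick numerical check, using SciPy's <code>scipy.linalg.lu</code> (which factors $A = PLU$ with unit-diagonal $L$); with pivoting, the factor $\det(P) = \pm 1$ must be included:

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
P, L, U = lu(A)                                # A = P @ L @ U
det = np.linalg.det(P) * np.prod(np.diag(U))   # det(P) is +1 or -1
assert np.isclose(det, np.linalg.det(A))
</syntaxhighlight>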


=== Pivoting and $PLU$ Decomposition ===

* During elimination we can permute rows; typically we pick the row whose entry in the current column has the largest absolute value as the pivot (partial pivoting)
* We can express row permutation with a permutation matrix $P$
* Then $PA$ is $A$ with permuted rows, so we have $E \, (PA) = U$
* So the decomposition becomes $PA = LU$


Why permute?

* To avoid division by small numbers, which amplifies rounding errors
* Suppose <math>A = \begin{bmatrix}
0.0001 & 1 \\
1 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
10000 & 1 \\
\end{bmatrix}
\begin{bmatrix}
0.0001 & 1 \\
0 & -9999 \\
\end{bmatrix}
=
LU
</math>
* if we pivot (swap the rows), then we have:
* <math>PA = \begin{bmatrix}
1 & 1 \\
0.0001 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
0.0001 & 1 \\
\end{bmatrix}
\begin{bmatrix}
1 & 1 \\
0 & 0.9999 \\
\end{bmatrix}
=
LU
</math>
* without pivoting the factors contain huge entries like $10000$ and $-9999$, which swamp the original data in floating point; after pivoting all entries stay of moderate size (a numerical demo follows)
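A sketch of what goes wrong in floating point, reproducing the example above with a pivot of $10^{-20}$; the exact solution of this system is close to $(1, 1)$:

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1e-20, 1.0],
              [1.0,   1.0]])
b = np.array([1.0, 2.0])

# LU without pivoting: the multiplier is huge
tau = A[1, 0] / A[0, 0]                  # 1e20
U = A.copy()
U[1, :] -= tau * U[0, :]                 # U[1, 1] = 1 - 1e20, rounds to -1e20
y = np.array([b[0], b[1] - tau * b[0]])  # forward substitution
x2 = y[1] / U[1, 1]
x1 = (y[0] - U[0, 1] * x2) / U[0, 0]
print(x1, x2)                  # 0.0 1.0  -- x1 is completely wrong
print(np.linalg.solve(A, b))   # ~[1. 1.] -- pivoted solver is fine
</syntaxhighlight>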


=== Diagonalization and $LDU$ and $LDL^T$ Decomposition ===

We can go further:

* and obtain the factorization $A = LU = LDU^*$, where $D$ is diagonal
* i.e. we factorize $U = DU^*$ by pulling out the diagonal of $U$: $D = \text{diag}(u_{11}, \dots, u_{nn})$
* this way both $L$ and $U^*$ have ones on their diagonals, whereas before $L$ had ones and $U$ didn't (see the sketch below)
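A minimal sketch of pulling $D$ out of an already computed Doolittle factor $U$ (the matrix below is an illustrative example):

<syntaxhighlight lang="python">
import numpy as np

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
D = np.diag(np.diag(U))            # diagonal of U as a diagonal matrix
U_star = U / np.diag(U)[:, None]   # divide each row by its pivot: U = D @ U_star
assert np.allclose(D @ U_star, U)
print(np.diag(U_star))             # [1. 1. 1.] -- unit diagonal
</syntaxhighlight>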

Why?

* Suppose that $A$ is [[Symmetric Matrices|Symmetric]]
* then by symmetry $U^* = L^T$, so $A = LDL^T$, and we only need to store one triangular factor
* but ordinary row pivoting will destroy this nice structure
* so instead of plain $PLU$ we use the $LDL^T$ decomposition, or something else, e.g. the [[Cholesky Decomposition]]


== Solving [[System of Linear Equations|$A \mathbf x = \mathbf b$]] with LU Decomposition ==

Solving $A\mathbf x = \mathbf b$

* We can do that with [[Gaussian Elimination]] via LU Decomposition
* Let $LU = A$
* then $A \mathbf x = \mathbf b$ becomes $LU \mathbf x = \mathbf b$
* Now let $U \mathbf x = \mathbf y$, so $L \mathbf y = \mathbf b$
* With this substitution we need to solve two equations instead:
** Solve $L \mathbf y = \mathbf b$ to get $\mathbf y$
** Solve $U \mathbf x = \mathbf y$ to get $\mathbf x$
* These two systems are simpler than the original one because:
** $L$ is lower-triangular, so $L \mathbf y = \mathbf b$ can be solved with "Forward Substitution", and
** $U$ is upper-triangular, so $U \mathbf x = \mathbf y$ can be solved with "Back Substitution" (an end-to-end sketch follows this list)
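An end-to-end sketch with SciPy's <code>lu</code> and <code>solve_triangular</code> (SciPy factors $A = PLU$, so the right-hand side becomes $P^T \mathbf b$; the numbers are illustrative):

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([4.0, 10.0, 24.0])

P, L, U = lu(A)                               # A = P @ L @ U
y = solve_triangular(L, P.T @ b, lower=True)  # forward substitution
x = solve_triangular(U, y)                    # back substitution
assert np.allclose(A @ x, b)
</syntaxhighlight>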


=== Forward Substitution ===

First, we solve $L \mathbf x = \mathbf b$

$2 \times 2$ case:

* <math>\begin{bmatrix}
l_{11} & 0 \\
l_{21} & l_{22} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2 \\
\end{bmatrix}</math>
* $x_1 = b_1 / l_{11}$
* $x_2 = (b_2 - l_{21} x_1) / l_{22}$

General procedure:

* $x_i = \left( b_i - \sum_{j=1}^{i - 1} l_{ij} x_j \right) / l_{ii}$, computed for $i = 1, \dots, n$ in increasing order (implemented below)
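The formula translates directly into code; a sketch (assumes nonzero diagonal entries; 0-based indexing):

<syntaxhighlight lang="python">
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lower-triangular L."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0],
              [3.0, 4.0]])
print(forward_substitution(L, np.array([2.0, 11.0])))  # [1. 2.]
</syntaxhighlight>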


=== Back Substitution ===

Then we solve $U \mathbf x = \mathbf b$

$2 \times 2$ case:

* <math>\begin{bmatrix}
u_{11} & u_{12} \\
0 & u_{22} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2 \\
\end{bmatrix}</math>
* $x_2 = b_2 / u_{22}$
* $x_1 = (b_1 - u_{12} x_2) / u_{11}$

General procedure:

* $x_i = \left( b_i - \sum_{j=i+1}^{n} u_{ij} x_j \right) / u_{ii}$, computed for $i = n, \dots, 1$ in decreasing order (implemented below)
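And the same for the backward pass, iterating from the last row up (a sketch under the same assumptions):

<syntaxhighlight lang="python">
import numpy as np

def back_substitution(U, b):
    """Solve U x = b for upper-triangular U."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0],
              [0.0, 4.0]])
print(back_substitution(U, np.array([4.0, 8.0])))  # [1. 2.]
</syntaxhighlight>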


=== Matrix Inversion ===

To find the matrix inverse, we need to solve the equation $AX = I$, where $X = A^{-1}$

* Let's decompose $A = LU$, so we have $LUX = I$
* Now let $UX = Y$, and thus $LY = I$
* This gives us two systems:
** first solve $LY = I$; it's easy because $L$ is lower-triangular, so we just do forward substitution column by column and obtain $Y$
** then we solve $UX = Y$; it's also easy because $U$ is upper-triangular
* after that we get $X$, which is our inverse
* if we need pivoting, then:
** we solve $PAX = PI = P$
** since $PA = LU$, we have $LUX = P$
** then we solve $LY = P$ and $UX = Y$ (see the sketch below)
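A sketch of the pivoted procedure with SciPy; note SciPy's convention is $A = PLU$, so $LY = P^T$ plays the role of $LY = P$ above, and <code>solve_triangular</code> handles all $n$ right-hand-side columns at once:

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
P, L, U = lu(A)                           # A = P @ L @ U
Y = solve_triangular(L, P.T, lower=True)  # solve L Y = P^T (n columns at once)
X = solve_triangular(U, Y)                # solve U X = Y
assert np.allclose(A @ X, np.eye(3))
</syntaxhighlight>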


== Usages ==

In many cases expensive operations (such as forming matrix powers or explicit inverses) can be made faster with LU Decomposition: we factorize once and then reuse the factors in cheap triangular solves

=== Solving $A^k \mathbf x = \mathbf b$ ===

Suppose we want to solve $A^k \mathbf x = \mathbf b$

* one option is to first compute $A^k$ and then solve the system, but that costs $k - 1$ additional matrix-matrix products
* or we can use recursion and reuse a single factorization:
** $A^{k} \mathbf x = \mathbf b$
** let $\mathbf y = A^{k-1} \mathbf x$
** so we have $A \mathbf y = \mathbf b$
** solve it, get $\mathbf y$
** now we need to solve $A^{k-1} \mathbf x = \mathbf y$
** let $\mathbf b = \mathbf y$ and solve $A^{k-1} \mathbf x = \mathbf b$
** repeat


So, the algorithm:

* let $PA = LU$
* for $j = 1, \dots, k$, do (see the sketch below):
** solve $L \mathbf y = P \mathbf b$, store $\mathbf y$
** solve $U \mathbf x = \mathbf y$, store $\mathbf x$
** let $\mathbf b = \mathbf x$
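The same loop with SciPy's <code>lu_factor</code>/<code>lu_solve</code>, which store the factorization once and perform both triangular solves per step (a sketch; <code>solve_power</code> is an illustrative name):

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_power(A, b, k):
    """Solve A^k x = b via k triangular solves on one factorization."""
    lu_piv = lu_factor(A)    # PA = LU, computed once: O(n^3)
    x = b
    for _ in range(k):       # each lu_solve is only O(n^2)
        x = lu_solve(lu_piv, x)
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = solve_power(A, b, 3)
assert np.allclose(np.linalg.matrix_power(A, 3) @ x, b)
</syntaxhighlight>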


=== Solving $S = C^T A^{-1} \mathbf d$ ===

Want to compute $S = C^T A^{-1} \mathbf d$

* we can avoid computing the inverse of $A$
* let $PA = LU$
* solve $L \mathbf y = P \mathbf d$ and then $U \mathbf x = \mathbf y$
* then compute $S = C^T \mathbf x$ (see the sketch below)
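A sketch with illustrative shapes ($A$ is $n \times n$, $C$ is $n \times m$, $\mathbf d$ has length $n$; the random data is only there to check the result):

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
C = rng.standard_normal((4, 2))
d = rng.standard_normal(4)

x = lu_solve(lu_factor(A), d)   # x = A^{-1} d without forming the inverse
S = C.T @ x
assert np.allclose(S, C.T @ np.linalg.inv(A) @ d)
</syntaxhighlight>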

In general, if you see an inverse applied to a vector, you don't need to compute the inverse explicitly:

* let $A^{-1} \mathbf d = \mathbf x$
* so $A \mathbf x = \mathbf d$
* solve that with $LU$, get $\mathbf x$
* compute $C^T \mathbf x$

