(→Column at a Time) |
|||
Line 117: | Line 117: | ||
== Implementation == | == Implementation == | ||
− | === | + | === Possible Ways to Implement === |
− | + | Suppose we want to implement the update $C = C + AB$: | |
− | + | * let $A$ be $m \times r$ matrix and $B$ be $r \times n$ | |
− | + | * then $C$ is $m \times n$ matrix | |
− | + | * $C$ can be initialized as an array filled with zeros - then $C$ will contain the results of $AB$ | |
− | + | ||
− | + | ||
− | + | ||
+ | for (int i = 0; i < m; i++) { | ||
+ | for (int j = 0; i < n; j++) { | ||
+ | for (int k = 0; k < r; k++) { | ||
+ | C[i, j] = C[i, j] + A[i, k] * B[k, j] | ||
+ | } | ||
+ | } | ||
+ | } | ||
− | === [[MapReduce]] === | + | |
− | * | + | Note that here it is not important in which order we do the loop: |
+ | * it can be <code>ijk</code>, <code>ikj</code>, etc, with $3! = 6$ possible ways | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !Order||Inner Loop||Outer Loops||Inner Loop Access | ||
+ | |- | ||
+ | |<code>ijk</code>|| <code>dot</code> || Vector-Matrix Mult || $A$ row, $B$ col | ||
+ | |- | ||
+ | |<code>jik</code>|| <code>dot</code> || Matrix-Vector Mult || $A$ row, $B$ col | ||
+ | |- | ||
+ | |<code>ikj</code>|| <code>saxpy</code> || Row <code>gaxpy</code> || $B$ row, $C$ row | ||
+ | |- | ||
+ | |<code>jki</code>|| <code>saxpy</code> || Col <code>gaxpy</code> || $A$ col, $C$ col | ||
+ | |- | ||
+ | |<code>kij</code>|| <code>saxpy</code> || row outer product || $A$ row, $C$ row | ||
+ | |- | ||
+ | |<code>kji</code>|| <code>saxpy</code> || col outer product || $A$ col, $C$ col | ||
+ | |} | ||
+ | |||
+ | |||
+ | Some details: | ||
+ | * The way inner loop accesses the data is important for storing the data to make it faster - data should be contiguos in memory | ||
+ | * <code>dot</code> is usual Vector-Vector multiplication | ||
+ | * <code>saxpy</code> (scalar $a$ times $x$ plus $y$): $y \leftarrow ax + y$ | ||
+ | * <code>gaxpy</code> (generalized <code>axpy</code>): $y \leftarrow y + Ax$ | ||
+ | * outer product update: $A \leftarrow A + xy^T$ | ||
+ | |||
+ | |||
+ | |||
+ | === SQL and [[MapReduce]] === | ||
+ | Suppose we have two sparce matrices $A$ and $B$ | ||
+ | * we store these matrices as <code>(row_num, col_num, value)</code> triples | ||
+ | * it is easy to use SQL for this multiplication | ||
+ | |||
+ | |||
+ | Sparse matrix multiplication with SQL | ||
+ | |||
+ | select a.row_num, b.col_num, sum(a.value * b.value) | ||
+ | from A a, B b | ||
+ | where a.col_num = b.row_num | ||
+ | group by a.row_num, b.col_num; | ||
+ | |||
+ | |||
+ | [[MapReduce]] | ||
+ | * We can translate this SQL expression into MapReduce | ||
* Link [https://code.google.com/p/stolzen/source/browse/trunk/courses/coursera/Introduction%20to%20Data%20Science/assignment3/p6_matrixmult.py] | * Link [https://code.google.com/p/stolzen/source/browse/trunk/courses/coursera/Introduction%20to%20Data%20Science/assignment3/p6_matrixmult.py] | ||
Line 134: | Line 182: | ||
E.g. Apache Flink: | E.g. Apache Flink: | ||
− | + | matrixA.join(matrixB).where(1).equalTo(0) | |
− | matrixA.join(matrixB).where(1).equalTo(0) | + | .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2) |
− | + | ||
− | + | ||
Line 150: | Line 196: | ||
* [[Introduction to Data Science (coursera)]] | * [[Introduction to Data Science (coursera)]] | ||
* [[Scalable Data Analytics and Data Mining AIM3 (TUB)]] | * [[Scalable Data Analytics and Data Mining AIM3 (TUB)]] | ||
+ | * [[Matrix Computations (book)]] | ||
[[Category:Linear Algebra]] | [[Category:Linear Algebra]] |
Suppose we want to multiply $m \times n$ matrix $A$ on $n \times p$ matrix $B$, we get an $m \times p$ matrix $C$
What is matrix-matrix multiplication in terms of Linear Transformations?
We can see matrix by matrix multiplication from 5 different positions:
All of them are equivalent and lead to the same result
This is usual dot product multiplication:
For each column $\mathbf{b}_j$ of $B$
For each row $\mathbf{a}^T_i$ of $A$
For $i$ from 1 to $n$,
$AB = \left[ \begin{array}{c|c} A_1 & A_2 \\ \hline A_3 & A_4 \end{array} \right] \times \left[ \begin{array}{c|c} B_1 & B_2 \\ \hline B_3 & B_4 \end{array} \right] = \left[ \begin{array}{c|c} A_1B_1 + A_2B_3 & A_3B_1 + A_4B_3 \\ \hline A_1B2 + A_2B_1 & A_3B_1 + A_4B_3 \end{array} \right] = C$
$(AB)^{-1} = B^{-1} A^{-1}$
Let's show the following:
Image of linear transformation
$C(A\, B) \subseteq C(A)$
What about $R(A \, B) \subseteq R(B)$?
Suppose we want to implement the update $C = C + AB$:
for (int i = 0; i < m; i++) { for (int j = 0; i < n; j++) { for (int k = 0; k < r; k++) { C[i, j] = C[i, j] + A[i, k] * B[k, j] } } }
Note that here it is not important in which order we do the loop:
ijk
, ikj
, etc, with $3! = 6$ possible waysOrder | Inner Loop | Outer Loops | Inner Loop Access |
---|---|---|---|
ijk |
dot |
Vector-Matrix Mult | $A$ row, $B$ col |
jik |
dot |
Matrix-Vector Mult | $A$ row, $B$ col |
ikj |
saxpy |
Row gaxpy |
$B$ row, $C$ row |
jki |
saxpy |
Col gaxpy |
$A$ col, $C$ col |
kij |
saxpy |
row outer product | $A$ row, $C$ row |
kji |
saxpy |
col outer product | $A$ col, $C$ col |
Some details:
dot
is usual Vector-Vector multiplicationsaxpy
(scalar $a$ times $x$ plus $y$): $y \leftarrow ax + y$gaxpy
(generalized axpy
): $y \leftarrow y + Ax$
Suppose we have two sparce matrices $A$ and $B$
(row_num, col_num, value)
triples
Sparse matrix multiplication with SQL
select a.row_num, b.col_num, sum(a.value * b.value) from A a, B b where a.col_num = b.row_num group by a.row_num, b.col_num;
E.g. Apache Flink:
matrixA.join(matrixB).where(1).equalTo(0) .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2)
Full code of Matrix Multiplication in Flink: [2]