Math 511: Linear Algebra
2.1 Matrix Arithmetic
2.1.1 Matrices¶
$$ \require{color} \definecolor{brightblue}{rgb}{.267, .298, .812} \definecolor{darkblue}{rgb}{0.0, 0.0, 1.0} \definecolor{palepink}{rgb}{1, .73, .8} \definecolor{softmagenta}{rgb}{.99,.34,.86} \definecolor{blueviolet}{rgb}{.537,.192,.937} \definecolor{jonquil}{rgb}{.949,.792,.098} \definecolor{shockingpink}{rgb}{1, 0, .741} \definecolor{royalblue}{rgb}{0, .341, .914} \definecolor{alien}{rgb}{.529,.914,.067} \definecolor{crimson}{rgb}{1, .094, .271} \def\ihat{\mathbf{\hat{\unicode{x0131}}}} \def\jhat{\mathbf{\hat{\unicode{x0237}}}} \def\khat{\mathrm{\hat{k}}} \def\tombstone{\unicode{x220E}} \def\contradiction{\unicode{x2A33}} $$
In Section 1.1 we introduced systems of linear equations
$$ \begin{array}{rcl} a_{11}x_1 + a_{12}x_2 +\ \cdots\ + a_{1n}x_n & = & b_1 \\ a_{21}x_1 + a_{22}x_2 +\ \cdots\ + a_{2n}x_n & = & b_2 \\ & \vdots & \\ a_{m1}x_1 + a_{m2}x_2 +\ \cdots\ + a_{mn}x_n & = & b_m \end{array} $$
Collecting the coefficients of each variable into a column vector lets us write the same system as a single vector equation
$$ x_1\begin{bmatrix} \ a_{11}\ \\ \ a_{21}\ \\ \ \vdots\ \\ \ a_{m1}\ \end{bmatrix} + x_2\begin{bmatrix}\ a_{12} \\ \ a_{22} \\ \ \vdots \\ \ a_{m2} \end{bmatrix} + \cdots + x_n\begin{bmatrix}\ a_{1n} \\ \ a_{2n} \\ \ \vdots \\ \ a_{mn} \end{bmatrix} = \begin{bmatrix}\ b_1 \\ \ b_2 \\ \ \vdots \\ \ b_m \end{bmatrix} $$
We can make our representation even more efficient by packaging the columns into a single matrix and defining matrix-vector multiplication so that
$$ x_1\begin{bmatrix} \ a_{11} \\ \ a_{21} \\ \ \vdots \\ \ a_{m1} \end{bmatrix} + x_2\begin{bmatrix} \ a_{12} \\ \ a_{22} \\ \ \vdots \\ \ a_{m2} \end{bmatrix} + \cdots + x_n\begin{bmatrix}\ a_{1n} \\ \ a_{2n} \\ \ \vdots \\ \ a_{mn} \end{bmatrix} = \begin{bmatrix} \ a_{11} & \ a_{12} & \ \cdots & \ a_{1n} \\ \ a_{21} & \ a_{22} & \ \cdots &\ a_{2n} \\ \ \vdots & \ \vdots & \ \ddots & \ \vdots \\ \ a_{m1} & \ a_{m2} & \ \cdots & \ a_{mn} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. $$
That is, the product on the right means $x_1$ times the first column of the matrix plus $x_2$ times the second column, and so on. In this way, matrix-vector multiplication is defined to be the linear combination of the column vectors
$$ x_1\begin{bmatrix} \ a_{11} \\ \ a_{21} \\ \ \vdots \\ \ a_{m1} \end{bmatrix} + x_2\begin{bmatrix} \ a_{12} \\ \ a_{22} \\ \ \vdots \\ \ a_{m2} \end{bmatrix} + \cdots + x_n\begin{bmatrix}\ a_{1n} \\ \ a_{2n} \\ \ \vdots \\ \ a_{mn} \end{bmatrix} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n. $$
If we substitute our new matrix-vector multiplication into our equation we obtain the matrix equation
$$ \begin{bmatrix} \ \mathbf{a}_1 &\ \mathbf{a}_2 &\ \cdots &\ \mathbf{a}_n \end{bmatrix}\mathbf{x} = \mathbf{b} $$
We call the matrix in our matrix-vector multiplication the coefficient matrix because its elements are the coefficients of the variables in the system of linear equations. We will use capital letters to represent matrices, bold lower case letters to represent vectors, and Greek letters to represent scalars. We use this convention only so that our linear algebra equations will be easy to read. The coefficient matrix
$$ A = \begin{bmatrix} \ \mathbf{a}_1 & \ \mathbf{a}_2 & \ \cdots & \ \mathbf{a}_n \end{bmatrix} $$
is the matrix made up of column vectors $\mathbf{a}_j$, $1\le j\le n$. Each column vector $\mathbf{a}_j$ is an $m\times 1$ column vector. We can call our vector of independent variables
$$ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. $$
We can call our vector of constants on the right-hand side of the equation the vector $\mathbf{b}$
$$ \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}. $$
Our equation becomes
$$ A\mathbf{x} = \mathbf{b}. $$
It is important that we learn the notation for simplifying our matrix algebra. We denote the column vectors of a matrix using the corresponding lower case letter. We indicate that the symbol represents a vector by using bold face, by placing a bar over or under the lower case letter, or by placing an arrow over it. We indicate the specific column with a subscript. The third column of matrix $C$ is denoted
$$ \mathbf{c}_3 = \bar{c}_3 = \bar{\mathbf{c}}_3 = \underline{c}_3 = \underline{\mathbf{c}}_3 = \vec{c}_3 = \vec{\mathbf{c}}_3. $$
Physicists will tell you that "column indices go in the cellar", hence subscripts. Alternatively, "row indices go in the roof" or superscripts. We can denote a matrix as a row of column vectors, and we can denote a matrix as a column of row vectors.
$$ B = \begin{bmatrix} \ \mathbf{b}_1 & \ \mathbf{b}_2 & \ \cdots & \ \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} \ \mathbf{b}^1 \\ \ \mathbf{b}^2 \\ \ \vdots \\ \ \mathbf{b}^m \end{bmatrix} $$
Here each row vector $\mathbf{b}^k$ is a $1\times n$ row vector. We denote the row vectors of a matrix using the corresponding lower case letter. We indicate that the symbol represents a vector by using bold face, by placing a bar over or under the lower case letter, or by placing an arrow over it. We indicate the specific row with a superscript. The third row of matrix $C$ is denoted
$$ \mathbf{c}^3 = \bar{c}^3 = \bar{\mathbf{c}}^3 = \underline{c}^3 = \underline{\mathbf{c}}^3 = \vec{c}^3 = \vec{\mathbf{c}}^3. $$
2.1.2 Matrices and Linear Transformations¶
Before we get into the details of matrix arithmetic (how to do the mechanical operations of adding matrices, multiplying them by scalars, and multiplying pairs of matrices), study Linear Transformations by Grant Sanderson. It provides visualizations of what the matrices we are going to be working with represent, and it provides context for the computations done in the rest of the section.
2.1.3 Element-by-Element Matrix Arithmetic¶
We want a way to represent a matrix as compactly as possible so that we can represent matrix and vector algebra using patterns instead of writing out all of the operations on every element of the matrix. For example, if we multiply a matrix or vector by a scalar, we mean for every element of the matrix or vector to be multiplied by the scalar value. If we want to multiply matrix $A$ by scalar $\gamma$ we have
$$ \gamma\cdot\begin{bmatrix} \ a_{11} & \ a_{12} & \ \cdots & \ a_{1n} \\ \ a_{21} & \ a_{22} & \ \cdots &\ a_{2n} \\ \ \vdots & \ \vdots & \ \ddots & \ \vdots \\ \ a_{m1} & \ a_{m2} & \ \cdots & \ a_{mn} \end{bmatrix} = \begin{bmatrix} \ \gamma a_{11} & \ \gamma a_{12} & \ \cdots & \ \gamma a_{1n} \\ \ \gamma a_{21} & \ \gamma a_{22} & \ \cdots &\ \gamma a_{2n} \\ \ \vdots & \ \vdots & \ \ddots & \ \vdots \\ \ \gamma a_{m1} & \ \gamma a_{m2} & \ \cdots & \ \gamma a_{mn} \end{bmatrix}. $$
Understanding the multiplication operation requires pattern matching. However, we want to simply write using mathematics, "Multiply every element of matrix $A$ by the scalar $\gamma$". When we want to denote an algebraic operation that should be exactly the same for every element of the matrix we denote the matrix $A$ using the lower case letter for every element of the matrix and its subscripts
$$ A = \left[ a_{jk} \right],\qquad 1\le j\le m,\ 1\le k\le n. $$
We use the normal font because each element of the matrix is a scalar. The index tells where each scalar appears in the rows and columns of the matrix. In this way we denote matrix-scalar multiplication of matrix $A$ by scalar $\gamma$ by
$$ \gamma A = \gamma\left[ a_{jk}\right] = \left[\gamma a_{jk}\right]. $$
We can denote many of the matrix algebra operations using this notation. Matrix addition of $m\times n$ matrices $A$ and $B$ can be denoted
$$ A + B = \left[a_{ij}\right] + \left[b_{ij}\right] = \left[ a_{ij} + b_{ij} \right]. $$
Here we mean that to add two matrices that are both $m\times n$, you add each corresponding pair of elements in the matrices. Subtraction of $m\times n$ matrices $C$ and $D$ can be denoted
$$ C - D = \left[c_{ij}\right] - \left[d_{ij}\right] = \left[ c_{ij} - d_{ij}\right]. $$
When you complete your homework, quiz, and exam questions, use this notation to simplify writing the answers when you need to show that a simple algebraic operation follows the rules of algebra.
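The element-by-element rules above translate directly into code. The following is a minimal sketch, assuming matrices are stored as Python lists of rows; the helper names `scalar_multiply`, `add`, and `subtract` are illustrative and do not come from the course material.

```python
# Matrices are stored as lists of rows. Helper names are hypothetical.

def scalar_multiply(gamma, A):
    """Return [gamma * a_jk]: every element of A multiplied by gamma."""
    return [[gamma * a for a in row] for row in A]

def add(A, B):
    """Return [a_ij + b_ij] for two matrices of the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def subtract(C, D):
    """Return [c_ij - d_ij], computed as C + (-1) D."""
    return add(C, scalar_multiply(-1, D))

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(scalar_multiply(3, A))  # [[3, 6], [9, 12]]
print(add(A, B))              # [[6, 8], [10, 12]]
print(subtract(A, B))         # [[-4, -4], [-4, -4]]
```

Each function applies one scalar operation to every entry, which is exactly what the bracket notation $\left[\,\cdot\,\right]$ expresses.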
2.1.4 Scalar Multiplication¶
We begin our understanding of matrix multiplication with scalar multiplication. When we see a column vector or $n\times 1$ matrix $\mathbf{x}$, we usually think of it as a representation of a vector in the $n$-dimensional space of vectors $\mathbb{R}^n$. For example, consider the vector $\mathbf{x}\in\mathbb{R}^3$ given by
$$ \mathbf{x} = \begin{bmatrix}\ \ 2\ \\ \ \ 2\ \\ -1\ \end{bmatrix} $$
This can also be thought of as a representation of a linear transformation or function from $\mathbb{R}^1$ to $\mathbb{R}^3$. If you plug a real number $t$ into the function $\mathbf{x}$, you get out a vector in $\mathbb{R}^3$.
$$ \mathbf{x}(t) = t\mathbf{x} = \begin{bmatrix}\ \ 2t\ \\ \ \ 2t\ \\ -t\ \end{bmatrix} $$
The set of all possible outputs, that is, all scalar multiples of the vector $\mathbf{x}$, is the span of the set $\left\{\,\mathbf{x}\,\right\}$, which is a line through the origin.
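As a small illustration of viewing a vector as a function of a scalar, the sketch below evaluates $\mathbf{x}(t) = t\mathbf{x}$ for a few values of $t$. The function name `scale` is an assumption made for this example.

```python
def scale(t, x):
    """The map t -> t*x from R^1 to R^3 for a fixed vector x (hypothetical helper)."""
    return [t * xi for xi in x]

x = [2, 2, -1]

# Every output is a scalar multiple of x, so all outputs lie on one line.
for t in (-1, 0, 3):
    print(t, scale(t, x))
```

The three printed vectors are $-\mathbf{x}$, the zero vector, and $3\mathbf{x}$, all points of the span of $\left\{\,\mathbf{x}\,\right\}$.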
2.1.5 Matrix-Vector Multiplication¶
Definition of Matrix-Vector multiplication¶
Multiplying an $m\times n$ matrix $A$ on the right by an $n\times 1$ vector $\mathbf{x}$ yields an $m\times 1$ vector $\mathbf{y}$. The product
$$ \mathbf{y}=A\mathbf{x} $$
is the linear combination of the columns of matrix $A$ given by
$$ A\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \dots & \mathbf{a}_n \end{bmatrix}\,\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n. $$
This should remind you of the dot product because it has the same form. If we multiply a row vector $\mathbf{y}^T$ times a column vector $\mathbf{x}$ as matrices, then the columns of $\mathbf{y}^T$ are just scalars and we obtain
$$ \mathbf{x}\cdot\mathbf{y} = \mathbf{y}^T\mathbf{x} = \begin{bmatrix} y_1 & y_2 & \dots & y_n \end{bmatrix}\,\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1y_1 + x_2y_2 + \dots + x_ny_n = \displaystyle\sum_{k=1}^n x_ky_k $$
How does this work?
Example 2.1.1¶
$$ \begin{align*} A\mathbf{x} &= \begin{bmatrix}\ \ 7\ &\ \ 4\ & -6\ \\ \ \ 6\ &\ \ 0\ &\ \ 3\ \\ \ \ 7\ &\ \ 1\ &\ \ 1\ \end{bmatrix}\,\begin{bmatrix}\ \ 1\ \\ -3\ \\ \ \ 2\ \end{bmatrix} = 1\begin{bmatrix}\ \ 7\ \\ \ \ 6\ \\ \ \ 7\ \end{bmatrix} + (-3)\begin{bmatrix}\ \ 4\ \\ \ \ 0\ \\ \ \ 1\ \end{bmatrix} + 2\begin{bmatrix} -6\ \\ \ \ 3\ \\ \ \ 1\ \end{bmatrix} \\ \\ &= \begin{bmatrix}\ \ 7\ \\ \ \ 6\ \\ \ \ 7\ \end{bmatrix} + \begin{bmatrix} -12\ \\ \ \ \ 0\ \\ \,-3\ \end{bmatrix} + \begin{bmatrix} -12\ \\ \ \ \ 6\ \\ \ \ \ 2\ \end{bmatrix} = \begin{bmatrix} -17\ \\ \ \ 12\ \\ \ \ 6\ \end{bmatrix} \end{align*} $$
This is probably not the way you were first taught to compute a matrix-vector product. The usual method computes the product as a column of dot products, one for each row of the matrix.
Example 2.1.2¶
$$ \begin{align*} B\mathbf{y} &= \begin{bmatrix}\ \ 2\ & -7\ &\ \ 4\ \\ \ \ 1\ &\ \ 6\ &\ \ 7\ \\ \ \ 8\ & -6\ & -8\ \end{bmatrix}\,\begin{bmatrix} -2\ \\ \ \ \ 2\ \\ -1\ \end{bmatrix} = \begin{bmatrix} \mathbf{b}^1\mathbf{y} \\ \mathbf{b}^2\mathbf{y} \\ \mathbf{b}^3\mathbf{y} \end{bmatrix} \\ \\ &= \begin{bmatrix} 2(-2) + (-7)(2) + 4(-1) \\ 1(-2) + 6(2) + 7(-1) \\ 8(-2) + (-6)(2) + (-8)(-1) \end{bmatrix} = \begin{bmatrix} -22\ \\ \ \ \ 3\ \\ -20\ \end{bmatrix} \end{align*} $$
However, the linear combination of columns gives the same result in a way that helps us understand matrix-vector multiplication geometrically
$$ \begin{align*} B\mathbf{y} &= y_1\begin{bmatrix}\ \ 2\ \\ \ \ 1\ \\ \ \ 8\ \end{bmatrix} + y_2\begin{bmatrix}-7\ \\ \ \ 6\ \\ -6\ \end{bmatrix} + y_3\begin{bmatrix}\ \ 4\ \\ \ \ 7\ \\ -8\ \end{bmatrix} \end{align*} $$
This geometric understanding comes from the following matrix-vector multiplications
$$ \begin{align*} A\ihat &= \begin{bmatrix}\ \ 7\ &\ \ 4\ & -6\ \\ \ \ 6\ &\ \ 0\ &\ \ 3\ \\ \ \ 7\ &\ \ 1\ &\ \ 1\ \end{bmatrix}\,\begin{bmatrix}\ \ 1\ \\ \ \ 0\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix}\ \ 7\ \\ \ \ 6\ \\ \ \ 7\ \end{bmatrix} = \mathbf{a}_1 \\ \\ A\jhat &= \begin{bmatrix}\ \ 7\ &\ \ 4\ & -6\ \\ \ \ 6\ &\ \ 0\ &\ \ 3\ \\ \ \ 7\ &\ \ 1\ &\ \ 1\ \end{bmatrix}\,\begin{bmatrix}\ \ 0\ \\ \ \ 1\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix}\ \ 4\ \\ \ \ 0\ \\ \ \ 1\ \end{bmatrix} = \mathbf{a}_2 \\ \\ A\khat &= \begin{bmatrix}\ \ 7\ &\ \ 4\ & -6\ \\ \ \ 6\ &\ \ 0\ &\ \ 3\ \\ \ \ 7\ &\ \ 1\ &\ \ 1\ \end{bmatrix}\,\begin{bmatrix}\ \ 0\ \\ \ \ 0\ \\ \ \ 1\ \end{bmatrix} = \begin{bmatrix} -6\ \\ \ \ 3\ \\ \ \ 1\ \end{bmatrix} = \mathbf{a}_3 \end{align*} $$
Hence when we multiply matrix $A$ on the right by a column vector $\mathbf{x}$ we obtain
$$ A\mathbf{x} = A\left( x_1\ihat + x_2\jhat + x_3\khat \right) = x_1A\ihat + x_2A\jhat + x_3A\khat = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 $$
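The two ways of computing $A\mathbf{x}$, as a linear combination of columns and as a column of dot products, can be compared numerically. This is a minimal sketch using the matrices from Examples 2.1.1 and 2.1.2; the function names are assumptions for illustration, and matrices are stored as lists of rows.

```python
def matvec_by_columns(A, x):
    """A x as the linear combination x_1 a_1 + ... + x_n a_n of the columns of A."""
    m, n = len(A), len(x)
    y = [0] * m
    for k in range(n):              # add x_k times column k of A
        for i in range(m):
            y[i] += x[k] * A[i][k]
    return y

def matvec_by_rows(A, x):
    """A x as a column of dot products a^i x, one per row of A."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

A = [[7, 4, -6], [6, 0, 3], [7, 1, 1]]     # Example 2.1.1
x = [1, -3, 2]
B = [[2, -7, 4], [1, 6, 7], [8, -6, -8]]   # Example 2.1.2
y = [-2, 2, -1]

print(matvec_by_columns(A, x))  # [-17, 12, 6]
print(matvec_by_rows(A, x))     # [-17, 12, 6]
print(matvec_by_columns(B, y))  # [-22, 3, -20]
print(matvec_by_rows(B, y))     # [-22, 3, -20]
```

Both functions perform the same multiplications and additions; only the order of the loops differs.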
By the way notice that
$\mathbf{a}_i$ is a column of matrix $A$. Column coordinates are in the cellar.
$\mathbf{a}^j$ is a row of matrix $A$. Row coordinates are in the roof.
Mathematicians call superscripts contravariant coordinates and subscripts covariant coordinates, so we will stick with the physics student perspective and call them roof and cellar.
2.1.6 Understanding Matrix Multiplication¶
Matrix Multiplication by Grant Sanderson shows us the correct way to think about matrix multiplication and helps us understand two basic facts about matrices.
Matrices¶
An $m\times n$ matrix $A$ is an algebraic representation of a linear transformation or function $A:\mathbb{R}^n\rightarrow\mathbb{R}^m$.
The product of two matrices is function composition.
Matrix Multiplication by Dr. Strang shows us four different mathematical visualizations of matrix multiplication. You should view this video only until 21:16 on the timeline of the video. We will study the rest of the video in the next section.
If $A$ is an $m\times n$ matrix and $B$ is an $n\times p$ matrix, then we can perform $p$ matrix-vector multiplications to obtain $p$ column vectors, each $m\times 1$. Remember that $B$ is a row of $p$ column vectors, each $n\times 1$,
$$ B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{bmatrix} $$
Definition of Matrix Multiplication¶
For $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{n\times p}$ the matrix product of $A$ and $B$ is the matrix $C\in\mathbb{R}^{m\times p}$ so that
$$ C = AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{bmatrix} $$
We simply repeat a matrix-vector multiplication $p$ times.
Example 2.1.3¶
Let
$$ A = \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\text{ and }B = \begin{bmatrix}\ \ 4\ & -2\ & -2\ \\ \ \ 4\ &\ \ 2\ &\ \ 2\ \\ \ \ 2\ &\ \ 3\ & -2\ \end{bmatrix} $$
The matrix product $C = AB$ is computed
$$ \begin{align*} C &= AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \mathbf{b}_3 \end{bmatrix} \\ \\ A\mathbf{b}_1 &= \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix}\ \ 4\ \\ \ \ 4\ \\ \ 2\ \end{bmatrix} = \begin{bmatrix}\ \ 0\ \\ \ \ 4\ \\ -6\ \\ -8\ \end{bmatrix} \\ \\ A\mathbf{b}_2 &= \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix} -2\ \\ \ \ 2\ \\ \ 3\ \end{bmatrix} = \begin{bmatrix}\ \ 4\ \\ -2\ \\ -9\ \\ \ \ 8\ \end{bmatrix} \\ \\ A\mathbf{b}_3 &= \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix} -2\ \\ \ \ 2\ \\ -2\ \end{bmatrix} = \begin{bmatrix} -6\ \\ -2\ \\ -4\ \\ -2\ \end{bmatrix} \\ \\ C &= \begin{bmatrix}\ \ 0\ &\ \ 4\ & -6\ \\ \ \ 4\ & -2\ & -2\ \\ -6\ & -9\ & -4\ \\ -8\ &\ \ 8\ & -2\ \end{bmatrix} \end{align*} $$
Example 2.1.4¶
This version of matrix multiplication shows its usefulness when the matrix on the right is simple.
Let
$$ A = \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\text{ and }B = \begin{bmatrix}\ \ 1\ &\ \ 0\ &\ \ 1\ \\ \ \ 0\ &\ \ 1\ &\ \ 2\ \\ \ \ 0\ &\ \ 0\ &\ \ 0\ \end{bmatrix} $$
Now the matrix product can be computed using the definition. The first column of the product is just the first column of $A$.
$$ \mathbf{c}_1 = A\mathbf{b}_1 = \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix}\ \ 1\ \\ \ \ 0\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix}\ \ 0\ \\ \ \ 1\ \\ \ \ 1\ \\ -2\ \end{bmatrix} $$
The second column of the product is the second column of $A$.
$$ \mathbf{c}_2 = A\mathbf{b}_2 = \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix}\ \ 0\ \\ \ \ 1\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix} -1\ \\ \ \ 0\ \\ -2\ \\ -1\ \end{bmatrix} $$
The third column of the product is $\mathbf{a}_1 + 2\mathbf{a}_2$.
$$ \mathbf{c}_3 = A\mathbf{b}_3 = \begin{bmatrix}\ \ 0\ & -1\ &\ \ 2\ \\ \ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 1\ & -2\ & -1\ \\ -2\ & -1\ &\ \ 2\ \end{bmatrix}\begin{bmatrix}\ \ 1\ \\ \ \ 2\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix} -2\ \\ \ \ 1\ \\ -3\ \\ -4\ \end{bmatrix} $$
Thus
$$ C = AB = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & A\mathbf{b}_3 \end{bmatrix} = \begin{bmatrix}\ \ 0\ & -1\ & -2 \\ \ \ 1\ &\ \ \ 0\ &\ \ \ 1\ \\ \ \ 1\ & -2\ & -3\ \\ -2\ & -1\ & -4\ \end{bmatrix} $$
Each column of $B$ needs to be an $n\times 1$ column vector because matrix $A$ has $n$ columns. The result $C$ is an $m\times p$ matrix because matrix $A$ is made up of $m\times 1$ column vectors ($A$ has $m$ rows), and matrix $B$ has $p$ columns.
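The definition $AB = \begin{bmatrix} A\mathbf{b}_1 & \cdots & A\mathbf{b}_p \end{bmatrix}$ can be sketched in a few lines. The code below is an illustration under the assumption that matrices are lists of rows; the names `matvec`, `column`, and `matmul_by_columns` are hypothetical. It reproduces Example 2.1.3 and also checks, for one arbitrarily chosen vector, that the product behaves like function composition.

```python
def matvec(A, x):
    """Matrix-vector product A x (one dot product per row of A)."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def column(B, j):
    """Column j of B as a flat list."""
    return [row[j] for row in B]

def matmul_by_columns(A, B):
    """Build AB one column at a time: column j of AB is A b_j."""
    m, p = len(A), len(B[0])
    cols = [matvec(A, column(B, j)) for j in range(p)]
    # cols[j][i] is entry (i, j); rearrange the list of columns into rows
    return [[cols[j][i] for j in range(p)] for i in range(m)]

A = [[0, -1, 2], [1, 0, 0], [1, -2, -1], [-2, -1, 2]]   # 4 x 3
B = [[4, -2, -2], [4, 2, 2], [2, 3, -2]]                # 3 x 3
C = matmul_by_columns(A, B)
print(C)  # [[0, 4, -6], [4, -2, -2], [-6, -9, -4], [-8, 8, -2]]

x = [1, -1, 2]  # arbitrary test vector, not from the text
print(matvec(C, x) == matvec(A, matvec(B, x)))  # True
```

The final check compares $(AB)\mathbf{x}$ with $A(B\mathbf{x})$: applying $B$ and then $A$ gives the same vector as applying the single matrix $AB$.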
2.1.7 Alternate Form of Matrix Multiplication¶
What if matrix $A$ is a $1\times n$ row vector? What does matrix-vector multiplication look like? If matrix $A$ has only one row, then each column of matrix $A$ is a $1\times 1$ column vector, that is, a scalar.
$$ A\mathbf{x} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = a_1x_1 + a_2x_2 + \cdots + a_nx_n. $$
Two subscripts for matrix $A$ are unnecessary when there is only one row. This gives the definition of vector-vector multiplication when the vector on the left is a $1\times n$ vector and the vector on the right is an $n\times 1$ vector. Notice the result is a $1\times 1$ matrix, a scalar. This gives us the dot product again,
$$\mathbf{x}\cdot\mathbf{y} = \mathbf{x}^T\mathbf{y} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = x_1y_1 + x_2y_2 + \cdots + x_ny_n. $$
Vector-Matrix Multiplication¶
If we multiply a $1\times n$ matrix $A$ by an $n\times p$ matrix $B$ we have
$$\begin{align*}
AB &= \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{bmatrix} \\
\\
&= \begin{bmatrix} \mathbf{a}^1 \end{bmatrix}\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{bmatrix} \\
\\
&= \begin{bmatrix} \mathbf{a}^1\mathbf{b}_1 & \mathbf{a}^1\mathbf{b}_2 & \cdots & \mathbf{a}^1\mathbf{b}_p \end{bmatrix} \\
\\
&= \left[ a_1b_{11} + a_2b_{21} + \cdots + a_nb_{n1} \quad a_1b_{12} + a_2b_{22} + \cdots + a_nb_{n2} \ \cdots\ a_1b_{1p} + a_2b_{2p} + \cdots + a_nb_{np} \right] \\
\\
&= \left[ a_1\left[\begin{array}{cccc} b_{11} & b_{12} & \cdots & b_{1p} \end{array}\right] + a_2\left[\begin{array}{cccc} b_{21} & b_{22} & \cdots & b_{2p} \end{array}\right] +\cdots+ a_n\left[\begin{array}{cccc} b_{n1} & b_{n2} & \cdots & b_{np} \end{array}\right] \right] \\
\\
&= \left[ a_1\mathbf{b}^1 + a_2\mathbf{b}^2 + \cdots + a_n\mathbf{b}^n \right]
\end{align*}$$
Multiplying a matrix on the left by a row vector gives a linear combination of the rows of the matrix!
Example 2.1.5¶
$$ \begin{bmatrix} -2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & 0 \\ 2 & -1 & 7 \\ 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} -2 & 0 & 1 \end{bmatrix}\begin{bmatrix} \mathbf{b}^1 \\ \mathbf{b}^2 \\ \mathbf{b}^3 \end{bmatrix} = -2\mathbf{b}^1 + 0\mathbf{b}^2 + \mathbf{b}^3 = \begin{bmatrix} 0 & -5 & 0 \end{bmatrix}. $$
That is
$$ -2\begin{bmatrix} 1 & 3 & 0 \end{bmatrix} + 0\begin{bmatrix} 2 & -1 & 7 \end{bmatrix} + 1\begin{bmatrix} 2 & 1 & 0 \end{bmatrix} $$
$$= \begin{bmatrix} -2 & -6 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -5 & 0 \end{bmatrix}.$$
We can describe this as an elementary row operation.
We can think of a matrix as a column of row vectors.
$$ A = \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{bmatrix} $$
Alternate Form of Matrix Multiplication¶
If $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{n\times p}$, then the product is the matrix $C\in\mathbb{R}^{m\times p}$ given by
$$ C = AB = \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{bmatrix}B = \begin{bmatrix} \mathbf{a}^1B \\ \mathbf{a}^2B \\ \vdots \\ \mathbf{a}^mB \end{bmatrix}, $$
where the $k^{\text{th}}$ row of the product is the product of a row vector $\mathbf{a}^k$ times matrix $B$. When something like this happens in mathematics, we call it duality.
Example 2.1.6¶
Let $A = \begin{bmatrix}\ \ 1\ &\ \ 0\ &\ \ 0\ \\ \ \ 4\ &\ \ 1\ &\ \ 0\ \\ \ \ 3\ & -2\ &\ \ 1\ \end{bmatrix}$ and $B = \begin{bmatrix} -4\ &\ \ 4\ & -1\ \\ \ \ \ 3\ &\ \ 0\ &\ \ 4\ \\ \ \ \ 0\ &\ \ 0\ & -3\ \end{bmatrix}$.
$$ \begin{align*} C &= AB = \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \mathbf{a}^3 \end{bmatrix}B \\ \\ \mathbf{a}^1 B &= \begin{bmatrix}\ \ 1\ &\ \ 0\ &\ \ 0\ \end{bmatrix}B = \begin{bmatrix} -4\ &\ \ 4\ & -1\ \end{bmatrix} \\ \\ \mathbf{a}^2 B &= \begin{bmatrix}\ \ 4\ &\ \ 1\ &\ \ 0\ \end{bmatrix}B = \begin{bmatrix} -13\ &\ 16\ &\ \ 0\ \end{bmatrix} \\ \\ \mathbf{a}^3B &= \begin{bmatrix}\ \ 3\ & -2\ &\ \ 1\ \end{bmatrix}B = \begin{bmatrix} -18\ &\ 12\ & -14\ \end{bmatrix} \\ \\ C &= \begin{bmatrix} -4\ &\ \ 4\ & -1\ \\ -13\ &\ 16\ &\ \ \ 0\ \\ -18\ &\ 12\ & -14\ \end{bmatrix} \end{align*} $$
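The row view of the product can be sketched the same way. In the code below, which assumes matrices stored as lists of rows and uses the hypothetical names `vecmat` and `matmul_by_rows`, a row vector times a matrix is accumulated as a linear combination of the rows of that matrix, and the full product is built one row at a time. The inputs are taken from Examples 2.1.5 and 2.1.6.

```python
def vecmat(a, B):
    """Row vector a times B: the combination a_1 b^1 + ... + a_n b^n of the rows of B."""
    result = [0] * len(B[0])
    for a_k, row in zip(a, B):
        # add a_k times row k of B to the running total
        result = [r + a_k * b for r, b in zip(result, row)]
    return result

def matmul_by_rows(A, B):
    """Build AB one row at a time: row k of AB is a^k B."""
    return [vecmat(row, B) for row in A]

# Example 2.1.5
print(vecmat([-2, 0, 1], [[1, 3, 0], [2, -1, 7], [2, 1, 0]]))  # [0, -5, 0]

# Example 2.1.6
A = [[1, 0, 0], [4, 1, 0], [3, -2, 1]]
B = [[-4, 4, -1], [3, 0, 4], [0, 0, -3]]
print(matmul_by_rows(A, B))  # [[-4, 4, -1], [-13, 16, 0], [-18, 12, -14]]
```

This form is convenient when the left factor has many zeros or ones, because each output row is then a short combination of rows of $B$.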
2.1.8 Naive Form of Matrix Multiplication¶
Whether we view matrix multiplication as a row of matrix-vector multiplications or a column of row vector-matrix multiplications, we can look at each element of the product matrix.
Dot Product Form of Matrix Multiplication¶
$$ \begin{align*} C &= AB \\ \\ &= \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{bmatrix}\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \end{bmatrix} \\ \\ &= \begin{bmatrix} \mathbf{a}^i\mathbf{b}_j \end{bmatrix} = \begin{bmatrix} c_{ij} \end{bmatrix} \\ \end{align*} $$
Each element $c_{ij}$ of the product matrix $C$ is the matrix product of a $1\times n$ row vector $\mathbf{a}^i$ and an $n\times 1$ column vector $\mathbf{b}_j$.
Example 2.1.7¶
Let $A = \begin{bmatrix}\ \ 2\ & -1\ \\ -1\ &\ \ 4\ \end{bmatrix}$ and $B = \begin{bmatrix} -1\ &\ \ 0\ \\ \ 11\ & -14\ \end{bmatrix}$
Then
$$ \begin{align*} c_{11} &= \mathbf{a}^1\mathbf{b}_1= \begin{bmatrix}\ \ 2\ & -1 \end{bmatrix}\begin{bmatrix} -1\ \\ \ \ 11\ \end{bmatrix} = -13 \\ \\ c_{21} &= \mathbf{a}^2\mathbf{b}_1 = \begin{bmatrix} -1\ &\ \ 4\ \end{bmatrix}\begin{bmatrix} -1\ \\ \ \ 11\ \end{bmatrix} = 45 \\ \\ c_{12} &= \mathbf{a}^1\mathbf{b}_2 = \begin{bmatrix}\ \ 2\ & -1 \end{bmatrix}\begin{bmatrix}\ \ 0\ \\ -14\ \end{bmatrix} = 14 \\ \\ c_{22} &= \mathbf{a}^2\mathbf{b}_2 = \begin{bmatrix} -1\ &\ \ 4\ \end{bmatrix}\begin{bmatrix}\ \ 0\ \\ -14\ \end{bmatrix} = -56 \\ \\ C &= \begin{bmatrix} -13\ &\ \ 14\ \\ \ \ 45\ & -56\ \end{bmatrix} \end{align*} $$
Finally we get to the version of multiplying two matrices that students are usually taught before their education in linear algebra.
The Element Form of Matrix Multiplication¶
$$ \begin{align*} C &= AB \\ \\ \begin{bmatrix} c_{ij} \end{bmatrix} &= \begin{bmatrix} \mathbf{a}^i\mathbf{b}_j \end{bmatrix} = \begin{bmatrix} \displaystyle\sum_{k=1}^n a_{ik}b_{kj} \end{bmatrix} \\ \\ c_{ij} &= \displaystyle\sum_{k=1}^n a_{ik}b_{kj} \end{align*} $$
Everyone must learn, understand and utilize all five visualizations of matrix multiplication in this course and in their careers. Each can be used in every problem. However, like a mechanic's tool set, even though one can use pliers for any connector, there are advantages to each tool that make the solution of the problem at hand easier to understand and compute.
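The element form $c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}$ corresponds to three nested loops. The sketch below assumes matrices are stored as lists of rows; the name `matmul_elementwise` is hypothetical. It is applied to the matrices of Example 2.1.7.

```python
def matmul_elementwise(A, B):
    """Compute C = AB entry by entry using c_ij = sum over k of a_ik * b_kj."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    C = [[0] * p for _ in range(m)]
    for i in range(m):            # row of A, row of C
        for j in range(p):        # column of B, column of C
            for k in range(n):    # summation index
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[2, -1], [-1, 4]]
B = [[-1, 0], [11, -14]]
print(matmul_elementwise(A, B))  # [[-13, 14], [45, -56]]
```

Reordering the three loops gives the column and row forms of the product; the arithmetic performed is identical.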
2.1.9 Higher Dimensional Linear Transformations¶
If matrix $A$ is an $m\times n$ matrix, then $A$ is an element of the vector space of all $m\times n$ matrices of real numbers
$$ A\in\mathbb{R}^{m\times n} $$
We see that an $m\times n$ matrix $A$ represents a linear transformation from $n$-dimensional space to $m$-dimensional space.
$$ A:\mathbb{R}^n\rightarrow\mathbb{R}^m $$
where for every input $\mathbf{x}\in\mathbb{R}^n$ we get an output $\mathbf{y}\in\mathbb{R}^m$ such that
$$ \mathbf{y} = A\mathbf{x} $$
This allows FIVE different views of matrix multiplication:
Function Composition
The definition of matrix multiplication, a row of matrix-vector products $AB = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{bmatrix}$
A column of vector-matrix products $AB = \begin{bmatrix} \mathbf{a}^1B \\ \mathbf{a}^2B \\ \vdots \\ \mathbf{a}^mB \end{bmatrix}$
A matrix of products $c_{ij} = \mathbf{a}^i\mathbf{b}_j$
The element-wise definition $c_{ij} = \displaystyle\sum_{k=1}^n a_{ik}b_{kj} = \displaystyle\sum_{k=1}^n a_k^ib_j^k = a_k^ib_j^k$
The last expression is written in Einstein notation. In Einstein notation, an index that appears in a product once as a contravariant index (superscript, roof) and once as a covariant index (subscript, cellar) implies summation over that index
$$ a_k^ib_j^k := \displaystyle\sum_{k=1}^n a_k^ib_j^k = \displaystyle\sum_{k=1}^n a_{ik}b_{kj} $$
This notation is motivated by the fact that the index $k$ is a dummy index: it may be replaced by any symbol not already in use without changing the meaning of the expression. For this reason the summed index is also called a bound index, while the indices $i$ and $j$, which are not summed, are called free indices.
2.1.10 Exercises¶
Exercise 1¶
Compute the matrix $C = AB$ where $A = \begin{bmatrix}\ \ 3\ & -4\ \\ \ \ 0\ &\ \ 2\ \end{bmatrix}$ and $B = \begin{bmatrix}\ \ 2\ &\ \ 0\ \\ \ \ 0\ &\ \ 4\ \end{bmatrix}$.
View Solution
As matrix $B$ is the simpler matrix, let us multiply matrix $A$ on the right by the two columns of matrix $B$ to obtain the two columns of the product matrix $C$.
$$ \begin{align*} C &= \begin{bmatrix}\ \ {\color{darkblue} 3}\ & {\color{softmagenta} -4}\ \\ \ \ {\color{darkblue} 0}\ &\ \ {\color{softmagenta} 2}\ \end{bmatrix}\begin{bmatrix}\ \ {\color{purple} 2}\ &\ \ {\color{red} 0}\ \\ \ \ {\color{purple} 0}\ &\ \ {\color{red} 4}\ \end{bmatrix} \\ \\ \mathbf{c}_1 &= {\color{purple} 2}{\color{darkblue} \mathbf{a}_1} + {\color{purple} 0}{\color{softmagenta}\mathbf{a}_2} = \begin{bmatrix}\ \ {\color{darkblue} 6}\ \\ \ \ {\color{darkblue} 0}\ \end{bmatrix}\qquad &\begin{array}{c} \text{two times column one} \\ \text{plus none of column two} \end{array} \\ \\ \mathbf{c}_2 &= {\color{red} 0}{\color{darkblue} \mathbf{a}_1} + {\color{red} 4}{\color{softmagenta} \mathbf{a}_2} = \begin{bmatrix}{\color{softmagenta} -16} \\ \ \ {\color{softmagenta} 8}\ \end{bmatrix}\qquad &\begin{array}{c} \text{none of column one plus} \\ \text{four times column two} \end{array} \\ \\ C &= \begin{bmatrix}\ \ {\color{darkblue} 6}\ &{\color{softmagenta} -16} \\ \ \ {\color{darkblue} 0}\ &\ \ \ {\color{softmagenta} 8}\ \end{bmatrix} \end{align*} $$
Exercise 2¶
Compute the matrix $G = EF$ where $E = \begin{bmatrix}\ \ 0\ & -1\ \\ \ \ 1\ &\ \ 1\ \\ \ \ 1\ &\ \ 1\ \end{bmatrix}$ and $F = \begin{bmatrix}\ \ 1\ & -4\ & -1 \\ -3\ &\ \ 2\ & -2\ \end{bmatrix}$
View Solution
As matrix $E$ is the simpler matrix, let us multiply matrix $F$ on the left by the three rows of matrix $E$ to obtain the three rows of the product matrix $G$.
$$ \begin{align*} G &= EF = \begin{bmatrix}\ \ 0\ & -1\ \\ \ \ 1\ &\ \ 1\ \\ \ \ 1\ &\ \ 1\ \end{bmatrix}\begin{bmatrix}\ \ 1\ & -4\ & -1 \\ -3\ &\ \ 2\ & -2\ \end{bmatrix} \\ \\ \mathbf{g}^1 &= 0\mathbf{f}^1 - \mathbf{f}^2 = \begin{bmatrix}\ \ 3\ & -2\ &\ \ 2\ \end{bmatrix} \qquad &\begin{array}{c} \text{none of row one} \\ \text{minus row two} \end{array} \\ \\ \mathbf{g}^2 &= \mathbf{f}^1 + \mathbf{f}^2 = \begin{bmatrix} -2\ & -2\ & -3\ \end{bmatrix} \qquad &\begin{array}{c} \text{row one plus} \\ \text{row two} \end{array} \\ \\ \mathbf{g}^3 &= \mathbf{g}^2 = \begin{bmatrix} -2\ & -2\ & -3\ \end{bmatrix} \\ \\ G &= \begin{bmatrix}\ \ 3\ & -2\ &\ \ 2\ \\ -2\ & -2\ & -3\ \\ -2\ & -2\ & -3\ \end{bmatrix} \end{align*} $$
Exercise 3¶
Compute the matrix product $C = AB$ where
$$
A = \begin{bmatrix}
\ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\
\ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\
\ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \\
\ \ 2\ &\ \ 1\ & -3\ &\ \ 1\ \end{bmatrix},\qquad\qquad
B = \begin{bmatrix}
\ \ 0\ &\ \ 1\ &\ \ 0\ \\
-1\ &\ \ 1\ & -1\ \\
-1\ & -1\ &\ \ 0\ \\
\ \ 0\ &\ \ 0\ & -1\ \end{bmatrix}
$$
View Solution
As matrix $B$ is the simpler one, let us multiply matrix $A$ on the right by the three columns of matrix $B$ to obtain the three columns of the product matrix $C$.
$$ \begin{align*} A\mathbf{b}_1 &= \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \\ \ \ 2\ &\ \ 1\ & -3\ &\ \ 1\ \end{bmatrix} \begin{bmatrix}\ \ 0\ \\ -1\ \\ -1\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix} -2\ \\ \ \ 2 \\ -8\ \\ \ \ 2\ \end{bmatrix} \qquad &\begin{array}{c} \text{the negative of the} \\ \text{sum of column 2 and column 3} \end{array}\\ \\ A\mathbf{b}_2 &= \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \\ \ \ 2\ &\ \ 1\ & -3\ &\ \ 1\ \end{bmatrix} \begin{bmatrix}\ \ 1\ \\ \ \ 1\ \\ -1\ \\ \ \ 0\ \end{bmatrix} = \begin{bmatrix}\ \ 3\ \\ \ 15\ \\ \ \ 0\ \\ \ \ 6\ \end{bmatrix} \qquad &\begin{array}{c} \text{the sum of column 1 and} \\ \text{column 2 minus column 3} \end{array} \\ \\ A\mathbf{b}_3 &= \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \\ \ \ 2\ &\ \ 1\ & -3\ &\ \ 1\ \end{bmatrix} \begin{bmatrix}\ \ 0\ \\ -1\ \\ \ \ 0\ \\ -1\ \end{bmatrix} = \begin{bmatrix} -1\ \\ -5\ \\ \ \ 0\ \\ -2\ \end{bmatrix} \qquad &\begin{array}{c} \text{the negative of the} \\ \text{sum of column 2 and column 4} \end{array} \\ \\ C &= AB = \begin{bmatrix} -2\ &\ \ 3\ & -1\ \\ \ \ 2\ &\ 15\ & -5\ \\ -8\ &\ \ 0\ &\ \ 0\ \\ \ \ 2\ &\ \ 6\ & -2\ \end{bmatrix}\\ \\ \end{align*} $$
Exercise 4¶
Compute the matrix product $D = BA$ where
$$
A = \begin{bmatrix}
\ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\
\ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\
\ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \end{bmatrix},\qquad\qquad
B = \begin{bmatrix}
\ \ 0\ &\ \ 1\ &\ \ 0\ \\
-1\ &\ \ 1\ & -1\ \\
-1\ & -1\ &\ \ 0\ \\
\ \ 0\ &\ \ 0\ & -1\ \end{bmatrix}
$$
View Solution
As the matrix on the left of the matrix product is simpler, we multiply matrix $A$ on the left by the rows of matrix $B$.
$$ \begin{align*} \mathbf{b}^1A &= \begin{bmatrix}\ \ 0\ &\ \ 1\ &\ \ 0\ \end{bmatrix} \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \end{bmatrix} = \begin{bmatrix}\ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \end{bmatrix} \qquad \begin{array}{l} \text{only row 2} \end{array} \\ \\ \mathbf{b}^2A &= \begin{bmatrix} -1\ &\ \ 1\ & -1\ \end{bmatrix} \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \end{bmatrix} = \begin{bmatrix} \ \ 4\ & -2\ & -10\ &\ \ 6\ \end{bmatrix} \qquad \begin{array}{l} \text{row 2 minus} \\ \text{row 1 minus} \\ \text{row 3} \end{array} \\ \\ \mathbf{b}^3A &= \begin{bmatrix} -1\ & -1\ &\ \ 0\ \end{bmatrix} \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \end{bmatrix} = \begin{bmatrix} -6\ & -6\ &\ \ 6\ &\ \ 0\ \end{bmatrix} \qquad \begin{array}{l} \text{the negative of} \\ \text{the sum of} \\ \text{row 1 and row 2} \end{array} \\ \\ \mathbf{b}^4A &= \begin{bmatrix}\ \ 0\ &\ \ 0\ & -1\ \end{bmatrix} \begin{bmatrix} \ \ 1\ &\ \ 2\ &\ \ 0\ & -1\ \\ \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 0\ &\ \ 4\ &\ \ 4\ & -4\ \end{bmatrix} = \begin{bmatrix}\ \ 0\ & -4\ & -4\ &\ \ 4\ \end{bmatrix} \qquad \begin{array}{l} \text{the negative of} \\ \text{row 3} \end{array} \\ \\ D &= BA = \begin{bmatrix} \ \ 5\ &\ \ 4\ & -6\ &\ \ 1\ \\ \ \ 4\ & -2\ & -10\ &\ \ 6\ \\ -6\ & -6\ &\ \ 6\ &\ \ 0\ \\ \ \ 0\ & -4\ & -4\ &\ \ 4\ \end{bmatrix} \\ \end{align*} $$
Your use of this self-initiated mediated course material is subject to our Creative Commons License 4.0