$\require{color}$ $\definecolor{brightblue}{rgb}{.267, .298, .812}$ $\definecolor{darkblue}{rgb}{0.0, 0.0, 1.0}$ $\definecolor{palepink}{rgb}{1, .73, .8}$ $\definecolor{softmagenta}{rgb}{.99,.34,.86}$ $\def\ihat{\mathbf{\hat{\unicode{x0131}}}}$ $\def\jhat{\mathbf{\hat{\unicode{x0237}}}}$ $\def\khat{\mathbf{\hat{k}}}$ $\def\tombstone{\unicode{x220E}}$
Up until this point, we have primarily discussed vectors and matrices in $\mathbb{R}^n$, with occasional reference to the possibility of working with complex values. It is now time to address complex values directly. For vectors, we will consider $\mathbb{C}^n$ where vectors $\mathbf{z}\in\mathbb{C}^n$ have the form
$$ \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 i \\ x_2 + y_2 i \\ \vdots \\ x_n + y_n i \end{bmatrix} $$
with $x_j,y_j\in\mathbb{R}$ for all $j=1,2,\ldots,n$, where $i = \sqrt{-1}$ is the imaginary unit. The scalar field for complex-valued vectors will be $\mathbb{C}$.
If $\alpha = a + bi$ is a complex-valued scalar, its modulus, or length, is given by
$$ \left|\alpha\right| = \sqrt{\bar{\alpha}\alpha} = \sqrt{a^2 + b^2} $$
where $\bar{\alpha} = a - bi$. The length of a vector
$$ \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} $$
is given by
$$ \begin{align*}
\left\| \mathbf{z} \right\| &= \left(|z_1|^2 + |z_2|^2 + \ldots + |z_n|^2\right)^\frac{1}{2} \\
\\
&= \left(\bar{z}_1z_1 + \bar{z}_2 z_2 + \ldots + \bar{z}_n z_n\right)^\frac{1}{2} \\
\\
&= \left(\bar{\mathbf{z}}^T \mathbf{z}\right)^\frac{1}{2}
\end{align*} $$
For convenience, we define the symbol $\mathbf{z}^H$ to be the conjugate transpose, or Hermitian transpose, of the vector $\mathbf{z}$, so
$$ \bar{\mathbf{z}}^T = \mathbf{z}^H \qquad\qquad\qquad \|\mathbf{z}\| = \left(\mathbf{z}^H\mathbf{z}\right)^\frac{1}{2} $$
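As a quick numerical check, here is a minimal sketch using NumPy. Note that `np.vdot` conjugates its first argument, so `np.vdot(z, z)` computes exactly $\mathbf{z}^H\mathbf{z}$; the vector is an arbitrary choice for illustration.

```python
import numpy as np

z = np.array([1 + 3j, -2 - 1j])

# np.vdot conjugates its first argument, so this is z^H z.
norm_squared = np.vdot(z, z).real

print(np.sqrt(norm_squared))   # 3.8729... = sqrt(15)
print(np.linalg.norm(z))       # same value, computed directly by NumPy
```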
Definition of Complex Inner Product ¶
Let $V$ be a vector space over the complex numbers. An inner product on $V$ is an operation that assigns to each pair of vectors $\mathbf{z},\mathbf{w}\in V$ a complex number $\langle \mathbf{z},\mathbf{w}\rangle$ satisfying the conditions
I. $\langle \mathbf{z},\mathbf{z} \rangle \geq 0$, and $\langle \mathbf{z},\mathbf{z} \rangle = 0$ if and only if $\,\mathbf{z} = \mathbf{0}$.
II. $\langle \mathbf{z},\mathbf{w} \rangle = \overline{\langle \mathbf{w},\mathbf{z} \rangle}$ for all $\,\mathbf{z},\mathbf{w}\in V$.
III. $\langle \alpha\mathbf{z} + \beta\mathbf{w},\mathbf{u} \rangle = \alpha\langle\mathbf{z},\mathbf{u}\rangle + \beta\langle\mathbf{w},\mathbf{u}\rangle$ for all $\,\alpha,\beta\in\mathbb{C}$ and all $\,\mathbf{z},\mathbf{w},\mathbf{u}\in V$.
Note that condition II states that a complex inner product is conjugate symmetric instead of merely symmetric. Taking this fact into account requires some simple changes, but it allows us to use our previous theorems on real inner product spaces for complex inner product spaces. It is particularly useful for us to recover the theorems leading to Parseval's Identity, so let's restate those with a complex-valued inner product:
Theorem 5.3.2 ¶
The Scalar Projection of a Coordinate Vector onto an Orthonormal Basis Vector is the Coordinate
Let $\left\{\mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\right\}$ be an orthonormal basis for a complex inner product space $V$. If
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \ldots + c_n\mathbf{w}_n $$
then
$$ c_i = \langle \mathbf{z},\mathbf{w}_i \rangle \qquad\qquad\qquad \bar{c}_i = \langle \mathbf{w}_i,\mathbf{z} \rangle $$
Corollary 5.3.3 ¶
The Inner Product of Coordinate Vectors in an Orthonormal Basis is the Sum of the Component-wise Product
Let $\left\{\mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\right\}$ be an orthonormal basis for a complex inner product space $V$. If
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i \qquad\qquad \mathbf{w} = \sum_{i=1}^n w_i\mathbf{w}_i $$
then
$$ \langle \mathbf{z},\mathbf{w} \rangle = \sum_{i=1}^n c_i \bar{w}_i \qquad\qquad \langle \mathbf{w},\mathbf{z} \rangle = \sum_{i=1}^n w_i \bar{c}_i $$
Corollary 5.3.4 ¶
Parseval's Identity
If $\left\{\mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\right\}$ is an orthonormal basis for a complex inner product space $V$ and
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i $$
then
$$ \|\mathbf{z}\|^2 = \langle \mathbf{z},\mathbf{z} \rangle = \sum_{i=1}^n c_i\bar{c}_i $$
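Parseval's Identity is easy to check numerically. The sketch below (assuming NumPy) takes the columns of a unitary matrix as the orthonormal basis; the random matrix and the vector $\mathbf{z}$ are arbitrary choices for illustration.

```python
import numpy as np

# The columns of a unitary matrix U form an orthonormal basis of C^n;
# QR-factoring a random complex matrix is a convenient way to build one.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(M)

z = np.array([1 + 1j, 2 - 1j, -1j])
c = U.conj().T @ z               # coordinates of z in the basis of columns of U

print(np.linalg.norm(z) ** 2)    # ||z||^2 = 8.0
print(np.sum(np.abs(c) ** 2))    # sum of c_i * conj(c_i), matches Parseval
```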
These theorems all use the inner product defined by
$$ \langle \mathbf{z},\mathbf{w} \rangle = \mathbf{w}^H \mathbf{z} $$
for all $\mathbf{z},\mathbf{w}\in\mathbb{C}^n$. Most of what we know about inner product spaces, including the modified theorems above, comes with little effort from our knowledge of real inner product spaces if we mind the following adjustments:
| $\mathbb{R}^n$ | $\mathbb{C}^n$ |
|---|---|
| $\langle \mathbf{x},\mathbf{y} \rangle = \mathbf{y}^T\mathbf{x}$ | $\langle \mathbf{z},\mathbf{w} \rangle = \mathbf{w}^H\mathbf{z}$ |
| $\mathbf{x}^T\mathbf{y} = \mathbf{y}^T\mathbf{x}$ | $\mathbf{z}^H\mathbf{w} = \overline{\mathbf{w}^H\mathbf{z}}$ |
| $\|\mathbf{x}\|^2 = \mathbf{x}^T\mathbf{x}$ | $\|\mathbf{z}\|^2 = \mathbf{z}^H\mathbf{z}$ |
Given
$$ \mathbf{z} = \begin{bmatrix} 1 + 3i \\ -2 -i \end{bmatrix} \qquad\qquad\qquad \mathbf{w} = \begin{bmatrix} 3 + 2i \\ 5 - i \end{bmatrix} $$
compute $\langle \mathbf{z},\mathbf{w} \rangle$, $\langle \mathbf{w},\mathbf{z} \rangle$, $\|\mathbf{z}\|$, and $\|\mathbf{w}\|$.
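The computations can be verified with a short NumPy sketch; recall that `np.vdot(w, z)` computes $\mathbf{w}^H\mathbf{z} = \langle\mathbf{z},\mathbf{w}\rangle$.

```python
import numpy as np

z = np.array([1 + 3j, -2 - 1j])
w = np.array([3 + 2j, 5 - 1j])

print(np.vdot(w, z))             # <z, w> = w^H z = 0, so z and w are orthogonal
print(np.vdot(z, w))             # <w, z> = conj(<z, w>) = 0
print(np.linalg.norm(z) ** 2)    # ||z||^2 = 15
print(np.linalg.norm(w) ** 2)    # ||w||^2 = 39
```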
Let $Z\in\mathbb{C}^{m\times n}$ be an $m\times n$ matrix comprised of complex numbers. We can write each entry of $Z$ in component form $[z_{ij}]$, where each $z_{ij} = a_{ij} + ib_{ij}$ for $a_{ij},b_{ij}\in\mathbb{R}$.
Note: the subscript $i$ in $z_{ij}$ is the row index, while the unsubscripted $i$ is the imaginary unit $i=\sqrt{-1}$.
Using the definition of matrix addition, it is possible for us to write $Z$ as the sum
$$ Z = [z_{ij}] = [a_{ij} + ib_{ij}] = [a_{ij}] + i[b_{ij}] = A + iB $$
where $A = [a_{ij}]$ and $B = [b_{ij}]$ are both in $\mathbb{R}^{m\times n}$. This allows us to define the conjugate of the matrix $Z$ as
$$ \overline{Z} = A - iB $$
Simply put, this works the way we are accustomed to conjugates operating: just negate all of the imaginary terms. For complex-valued matrices, we also need to make sure that we use the conjugate transpose, so $Z^H = \overline{Z}^T$, just as with complex vectors. The conjugate transpose follows rules similar to those for the ordinary transpose:
$$ (Z + W)^H = Z^H + W^H \qquad (ZW)^H = W^H Z^H \qquad \left(Z^H\right)^H = Z \qquad (\alpha Z)^H = \bar{\alpha}\, Z^H $$
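In NumPy, the conjugate of a matrix is `Z.conj()` and the conjugate transpose is `Z.conj().T`; the matrices below are arbitrary choices for illustration.

```python
import numpy as np

# Build Z = A + iB from its real and imaginary parts.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, -1.0], [1.0, 0.0]])
Z = A + 1j * B

print(Z.conj())       # the conjugate: A - iB
print(Z.conj().T)     # the conjugate (Hermitian) transpose Z^H

# One of the familiar transpose rules carried over: (ZW)^H = W^H Z^H.
W = np.array([[1j, 2], [0, 1 - 1j]])
print(np.allclose((Z @ W).conj().T, W.conj().T @ Z.conj().T))   # True
```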
Definition of Hermitian matrix ¶
A matrix $Z$ is called Hermitian if $Z = Z^H$.
Both
$$ Z = \begin{bmatrix} -1 & 2 - 2i \\ 2 + 2i & 3 \end{bmatrix} \qquad\qquad\qquad
W = \begin{bmatrix} 3 & 5 + i & 4 - 2i \\ 5 - i & -7 & -3 + 2i \\ 4 + 2i & -3 - 2i & 2 \end{bmatrix} $$
are Hermitian since
$$ \begin{align*}
Z^H = \begin{bmatrix} \overline{-1} & \overline{2 + 2i} \\ \overline{2 - 2i} & \overline{3} \end{bmatrix} &= \begin{bmatrix} -1 & 2 - 2i \\ 2 + 2i & 3 \end{bmatrix} = Z \\
\\
W^H = \begin{bmatrix} \overline{3} & \overline{5 - i} & \overline{4 + 2i} \\ \overline{5 + i} & \overline{-7} & \overline{-3 - 2i} \\ \overline{4 - 2i} & \overline{-3 + 2i} & \overline{2} \end{bmatrix} &= \begin{bmatrix} 3 & 5 + i & 4 - 2i \\ 5 - i & -7 & -3 + 2i \\ 4 + 2i & -3 - 2i & 2 \end{bmatrix} = W
\end{align*} $$
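Checking the Hermitian property numerically is a one-line comparison of a matrix with its conjugate transpose:

```python
import numpy as np

Z = np.array([[-1, 2 - 2j], [2 + 2j, 3]])
W = np.array([[3, 5 + 1j, 4 - 2j],
              [5 - 1j, -7, -3 + 2j],
              [4 + 2j, -3 - 2j, 2]])

print(np.allclose(Z, Z.conj().T))   # True: Z is Hermitian
print(np.allclose(W, W.conj().T))   # True: W is Hermitian
```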
Theorem 7.3.1 ¶
Hermitian matrices have real eigenvalues, and eigenvectors corresponding to distinct eigenvalues are orthogonal.
Suppose $A$ is a Hermitian matrix, and let $\lambda$ be an eigenvalue of $A$ with corresponding eigenvector $\mathbf{x}$. Set the scalar $\alpha = \mathbf{x}^H\! A\mathbf{x}$, then
$$ \bar{\alpha} = \alpha^H = \left(\mathbf{x}^H\! A\mathbf{x}\right)^H = \mathbf{x}^H A^H \left(\mathbf{x}^H\right)^H = \mathbf{x}^H\! A\mathbf{x} = \alpha $$
So $\alpha\in\mathbb{R}$. Also,
$$ \alpha = \mathbf{x}^H\! A\mathbf{x} = \mathbf{x}^H\! \lambda\mathbf{x} = \lambda\mathbf{x}^H \mathbf{x} = \lambda\|\mathbf{x}\|^2 $$
which implies that $\lambda\in\mathbb{R}$ since
$$ \lambda = \dfrac{\alpha}{\|\mathbf{x}\|^2} $$
Next, we want to show that eigenvectors for distinct eigenvalues are orthogonal, so let $\mathbf{x}_1$ and $\mathbf{x}_2$ be eigenvectors for the distinct eigenvalues $\lambda_1$ and $\lambda_2$, respectively. The expressions
$$ \begin{align*}
\left(A\mathbf{x}_1\right)^H \mathbf{x}_2 = \mathbf{x}_1^H A^H \mathbf{x}_2 &= \mathbf{x}_1^H A \mathbf{x}_2 = \lambda_2 \mathbf{x}_1^H \mathbf{x}_2 \\
\\
\left(A\mathbf{x}_1\right)^H \mathbf{x}_2 = \left(\mathbf{x}_2^H A\mathbf{x}_1\right)^H &= \left(\lambda_1\mathbf{x}_2^H \mathbf{x}_1\right)^H = \lambda_1 \mathbf{x}_1^H \mathbf{x}_2
\end{align*} $$
show that (note that the last equality in the second expression used $\bar{\lambda}_1 = \lambda_1$, since we just proved the eigenvalues of a Hermitian matrix are real)
$$
\begin{align*}
\lambda_1 \mathbf{x}_1^H \mathbf{x}_2 &= \lambda_2 \mathbf{x}_1^H \mathbf{x}_2 \\
\\
\lambda_1 \mathbf{x}_1^H \mathbf{x}_2 - \lambda_2 \mathbf{x}_1^H \mathbf{x}_2 &= 0 \\
\\
\left(\lambda_1 - \lambda_2\right)\, \langle \mathbf{x}_2, \mathbf{x}_1 \rangle &= 0 \\
\end{align*}
$$
which can only mean that $\langle \mathbf{x}_2, \mathbf{x}_1 \rangle = 0$, since $\lambda_1 \neq \lambda_2$. Therefore $\mathbf{x}_1$ and $\mathbf{x}_2$ are orthogonal.
$\tombstone$
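The theorem is easy to observe numerically. The sketch below uses the Hermitian matrix $Z$ from the example above; `np.linalg.eigh`, which is designed for Hermitian matrices, returns real eigenvalues and orthonormal eigenvectors.

```python
import numpy as np

Z = np.array([[-1, 2 - 2j], [2 + 2j, 3]])

# eigh assumes a Hermitian matrix: eigenvalues come back real, in ascending order.
eigenvalues, X = np.linalg.eigh(Z)

print(eigenvalues)                 # approximately [-2.4641  4.4641], both real
print(np.vdot(X[:, 0], X[:, 1]))   # x_1^H x_2 = 0: the eigenvectors are orthogonal
```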
Orthogonal matrices have a complex-valued counterpart called unitary matrices.
Definition of Unitary Matrix ¶
A matrix $U\in\mathbb{C}^{n\times n}$ is unitary if its columns form an orthonormal set in $\mathbb{C}^n$.
An equivalent statement is that a matrix $U$ is unitary if and only if $U^H U = I$. Since $U$ is square with orthonormal columns, it has rank $n$ and hence is invertible. These facts also imply that $U^H = U^{-1}$.
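A quick numerical illustration of these facts (a sketch assuming NumPy; QR-factoring a random complex matrix is one convenient way to produce a unitary matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(M)   # U has orthonormal columns, hence is unitary

print(np.allclose(U.conj().T @ U, np.eye(3)))      # U^H U = I
print(np.allclose(U.conj().T, np.linalg.inv(U)))   # U^H = U^{-1}
```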
Corollary 7.3.2 ¶
If a Hermitian matrix $A$ has distinct eigenvalues, then there exists a unitary matrix $U$ which diagonalizes $A$.
Let $\mathbf{x}_i$ be an eigenvector corresponding to eigenvalue $\lambda_i$ of $A$. For each $i=1,2,\ldots,n$, set
$$ \mathbf{u}_i = \dfrac{\mathbf{x}_i}{\|\mathbf{x}_i\|} $$
so $\mathbf{u}_i$ is a unit eigenvector for $\lambda_i$. Since the eigenvalues are distinct, Theorem 7.3.1 guarantees that these eigenvectors are mutually orthogonal, so $\left\{\mathbf{u}_1,\mathbf{u}_2,\ldots,\mathbf{u}_n\right\}$ is an orthonormal set in $\mathbb{C}^n$. Let
$$ U = \begin{bmatrix}\ \mathbf{u}_1 & \mathbf{u}_2 & \ldots & \mathbf{u}_n\ \end{bmatrix} $$
so $U$ is unitary and diagonalizes $A$.
$\tombstone$
Let
$$ A = \begin{bmatrix} -1 & 2 + i \\ 2 - i & 3 \end{bmatrix} $$
Find a unitary matrix $U$ that diagonalizes $A$.
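A numerical solution sketch: since $A$ is Hermitian, `np.linalg.eigh` returns orthonormal eigenvectors, so the matrix of eigenvectors is itself the unitary diagonalizer we are after.

```python
import numpy as np

A = np.array([[-1, 2 + 1j], [2 - 1j, 3]])

eigenvalues, U = np.linalg.eigh(A)

print(eigenvalues)                        # [-2.  4.]
print(np.round(U.conj().T @ A @ U, 10))   # diag(-2, 4), up to roundoff
```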
Theorem 7.3.3 ¶
Schur Decomposition
For each $n\times n$ matrix $A$, there is a unitary matrix $\,U$ such that $\,U^H\! AU$ is an upper triangular matrix.
We prove the theorem by mathematical induction. For the case $n=1$, no work is required and the result clearly holds. Now, suppose that the result holds for $(n-1)\times (n-1)$ matrices. We need to show that this implies the result holds for $n\times n$ matrices.
Let $A$ be an $n\times n$ matrix with eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_n$, including multiplicities. Let $\mathbf{v}_1$ be an eigenvector corresponding to $\lambda_1$ with $\|\mathbf{v}_1\| = 1$. Using the Gram-Schmidt process, we extend $\mathbf{v}_1$ to an orthonormal basis $\left\{\mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_n\right\}$ for $\mathbb{C}^n$. Define $V$ such that
$$ V = \begin{bmatrix} \ \mathbf{v}_1 & \mathbf{v}_2 & \ldots & \mathbf{v}_n \end{bmatrix} $$
so $V$ is a unitary matrix.
The first column of the matrix $V^H\! AV$ is given by $V^H\! A\mathbf{v}_1$, so we know that
$$ V^H\! A\mathbf{v}_1 = \lambda_1 V^H \mathbf{v}_1 = \lambda_1\mathbf{e}_1 $$
and therefore
$$ V^H\! AV = \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] $$
where $\overset{\sim}{A}$ is an $(n-1)\times (n-1)$ matrix. Since $V$ is unitary, $V^H\!AV$ is similar to $A$, so the two matrices have the same eigenvalues. Hence, $p_A(\lambda) = (\lambda_1 - \lambda)\,p_{\overset{\sim}{A}}(\lambda)$, and the eigenvalues of $\overset{\sim}{A}$ (including multiplicities) are $\lambda_2,\ldots,\lambda_n$.
By the induction hypothesis, it is possible to write $\overset{\sim}{A}$ as
$$ \overset{\sim}{A} = \overset{\sim}{W}\begin{bmatrix} \lambda_2 & x & \ldots & x \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & x \\ 0 & \ldots & 0 & \lambda_n \end{bmatrix} \overset{\sim}{W}^H $$
so
$$ \overset{\sim}{W}^H\! \overset{\sim}{A}\overset{\sim}{W} = \begin{bmatrix} \lambda_2 & x & \ldots & x \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & x \\ 0 & \ldots & 0 & \lambda_n \end{bmatrix} $$
We may now write
$$ \begin{align*}
\left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W} \\ 0 \end{array}\right]^H & \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right]\left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W} \\ 0 \end{array}\right] \\
\\
&= \left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W}^H \\ 0 \end{array}\right]\left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A}\overset{\sim}{W} \\ 0 \end{array}\right] \\
\\
&= \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{W}^H\!\overset{\sim}{A}\overset{\sim}{W} \\ 0 \end{array}\right] \\
\\
&= \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right]
\end{align*} $$
Defining the matrix $W = \left[\begin{array}{c|c} 1 & 0 \\ \hline 0 & \overset{\sim}{W} \end{array}\right]$, which is unitary, we have
$$ \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] = W \left[\begin{array}{cccc} \lambda_1 & x & \ldots & x \\ 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right] W^H $$
which proves the theorem since
$$ A = V\left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] V^H = VW \left[\begin{array}{cccc} \lambda_1 & x & \ldots & x \\ 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right] (VW)^H $$
and the product of unitary matrices is unitary.
$\tombstone$
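SciPy computes Schur decompositions directly. Below is a minimal sketch; the matrix is an arbitrary non-Hermitian choice, so the triangular factor is genuinely triangular rather than diagonal.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1, 2 + 1j], [0.5j, -1]])

# output='complex' requests the complex Schur form A = Z T Z^H.
T, Z = schur(A, output='complex')

print(np.round(T, 4))                        # upper triangular; eigenvalues on the diagonal
print(np.allclose(Z @ T @ Z.conj().T, A))    # True: A = Z T Z^H
```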
In the special case that $A$ is a Hermitian matrix, the triangular matrix from a Schur decomposition will be diagonal.
Theorem 7.3.4 ¶
Spectral Theorem
If $A$ is a Hermitian matrix, then there exists a unitary matrix $U$ that diagonalizes $A$.
By Theorem 7.3.3 (the Schur Decomposition), we know that there is a unitary matrix $U$ such that $T = U^H\! A U$ is upper triangular. In addition,
$$ T^H = (U^H\! A U)^H = U^H\! A^H U = U^H\! A U = T $$
so $T$ is Hermitian. An upper triangular Hermitian matrix must be diagonal, since $t_{ij} = \bar{t}_{ji} = 0$ for $i < j$, and therefore $U$ diagonalizes $A$.
$\tombstone$
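We can watch the Spectral Theorem at work numerically: for the Hermitian matrix from the earlier example, the triangular factor of the Schur decomposition comes out diagonal, up to floating-point roundoff.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[-1, 2 + 1j], [2 - 1j, 3]])   # Hermitian

T, U = schur(A, output='complex')

print(np.round(T, 10))                       # diagonal, with real eigenvalues -2 and 4
print(np.allclose(U @ T @ U.conj().T, A))    # True
```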