Math 511: Linear Algebra
7.3 Hermitian Matrices
7.3.1 Complex Valued Vectors and Matrices¶
$$ \require{color} \definecolor{brightblue}{rgb}{.267, .298, .812} \definecolor{darkblue}{rgb}{0.0, 0.0, 1.0} \definecolor{palepink}{rgb}{1, .73, .8} \definecolor{softmagenta}{rgb}{.99,.34,.86} \definecolor{blueviolet}{rgb}{.537,.192,.937} \definecolor{jonquil}{rgb}{.949,.792,.098} \definecolor{shockingpink}{rgb}{1, 0, .741} \definecolor{royalblue}{rgb}{0, .341, .914} \definecolor{alien}{rgb}{.529,.914,.067} \definecolor{crimson}{rgb}{1, .094, .271} \def\ihat{\mathbf{\hat{\unicode{x0131}}}} \def\jhat{\mathbf{\hat{\unicode{x0237}}}} \def\khat{\mathrm{\hat{k}}} \def\tombstone{\unicode{x220E}} \def\contradiction{\unicode{x2A33}} $$
Up until this point, we have primarily discussed vectors and matrices in $\mathbb{R}^n$, with occasional reference to the possibility of working with complex values. It is now time to address complex values directly. For vectors, we will consider $\mathbb{C}^n$ where vectors $\mathbf{z}\in\mathbb{C}^n$ have the form
$$ \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 i \\ x_2 + y_2 i \\ \vdots \\ x_n + y_n i \end{bmatrix} $$
with $x_j,y_j\in\mathbb{R}$ for all $j=1,2,\ldots,n$ and $i = \sqrt{-1}$ is the imaginary unit. The scalar field for complex-valued vectors will be $\mathbb{C}$.
Complex Inner Products¶
If $\alpha = a + bi$ is a complex-valued scalar, its length is given by
$$ \left|\alpha\right| = \sqrt{\alpha^*\alpha} = \sqrt{a^2 + b^2} $$
where $\alpha^* = a - bi$, the complex conjugate of $\alpha$. The length of a vector
$$ \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} $$
is given by
$$ \begin{align*} \left\| \mathbf{z} \right\| &= \left(|z_1|^2 + |z_2|^2 + \ldots + |z_n|^2\right)^\frac{1}{2} \\ \\ &= \left(z_1^*z_1 + z^*_2 z_2 + \ldots + z_n^* z_n\right)^\frac{1}{2} \\ \\ &= \left((\mathbf{z}^*)^T \mathbf{z}\right)^\frac{1}{2} \end{align*} $$
For convenience, we define the symbol $\mathbf{z}^H$ to be the conjugate transpose or Hermitian transpose of the vector $\mathbf{z}$, so
$$ (\mathbf{z}^*)^T = \mathbf{z}^H \qquad\qquad\qquad \|\mathbf{z}\| = \left(\mathbf{z}^H\mathbf{z}\right)^\frac{1}{2} $$
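If you would like to check this numerically, here is a small NumPy sketch (the vector values below are arbitrary) comparing $\left(\mathbf{z}^H\mathbf{z}\right)^{1/2}$ with NumPy's built-in norm:

```python
import numpy as np

# An arbitrary complex vector (values chosen only for illustration)
z = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])

# ||z|| = (z^H z)^(1/2); for a 1-D array, z.conj() @ z computes z^H z
norm_via_H = np.sqrt((z.conj() @ z).real)

print(norm_via_H, np.linalg.norm(z))   # the two values agree
```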
Definition¶
Complex Inner Product¶
Let $V$ be a vector space over the complex numbers. An inner product on $V$ is an operation that assigns to each pair of vectors $\mathbf{z},\mathbf{w}\in V$ a complex number $\langle \mathbf{z},\mathbf{w}\rangle$ satisfying the conditions
I. $\langle \mathbf{z},\mathbf{z} \rangle \geq 0$, and $\langle \mathbf{z},\mathbf{z} \rangle = 0$ if and only if $\,\mathbf{z} = \mathbf{0}$.
II. $\langle \mathbf{z},\mathbf{w} \rangle = \langle \mathbf{w},\mathbf{z} \rangle^*$ for all $\,\mathbf{z},\mathbf{w}\in V$.
III. $\langle \alpha\mathbf{z} + \beta\mathbf{w},\mathbf{u} \rangle = \alpha\langle\mathbf{z},\mathbf{u}\rangle + \beta\langle\mathbf{w},\mathbf{u}\rangle$ for all $\,\alpha,\beta\in\mathbb{C}$ and $\,\mathbf{z},\mathbf{w},\mathbf{u}\in V$.
Note that condition II states that a complex inner product is conjugate symmetric rather than merely symmetric. Taking this fact into account requires only simple changes, and it allows us to carry our previous theorems on real inner product spaces over to complex inner product spaces. It is particularly useful for us to recover the theorems leading to Parseval's Identity, so let's restate those with a complex-valued inner product.
We absolutely must learn the vocabulary and important properties of complex-valued vectors in $\mathbb{C}^n$ and complex-valued matrices in the vector spaces $\mathbb{C}^{m\times n}$. This is because the characteristic polynomial of a real-valued matrix may have complex-conjugate roots. We will also utilize complex-valued matrices to prove some very important properties of their real-valued cousins. Complex-valued vectors and matrices appear throughout STEM; they are used extensively in any field requiring periodic mathematical models or wave phenomena.
Most of what we define for complex inner product spaces comes from generalizing our knowledge of real inner product spaces, provided we mind the following adjustments:
$ \mathbb{R}^n $ | $\qquad$ | $ \mathbb{C}^n $ |
---|---|---|
$\langle \mathbf{x},\mathbf{y} \rangle = \mathbf{y}^T\mathbf{x}$ | | $\langle \mathbf{z},\mathbf{w} \rangle = \mathbf{w}^H\mathbf{z}$ |
$\mathbf{x}^T\mathbf{y} = \mathbf{y}^T\mathbf{x}$ | | $\mathbf{z}^H\mathbf{w} = \overline{\mathbf{w}^H\mathbf{z}}$ |
$\|\mathbf{x}\|^2 = \mathbf{x}^T\mathbf{x}$ | | $\|\mathbf{z}\|^2 = \mathbf{z}^H\mathbf{z}$ |
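A short NumPy sketch (with arbitrarily chosen vectors) illustrating these correspondences, in particular the conjugate symmetry $\mathbf{z}^H\mathbf{w} = \overline{\mathbf{w}^H\mathbf{z}}$:

```python
import numpy as np

# Arbitrary vectors in C^2, for illustration only
z = np.array([1 + 3j, -2 - 1j])
w = np.array([3 + 2j, 5 - 1j])

inner_zw = w.conj() @ z          # <z, w> = w^H z
inner_wz = z.conj() @ w          # <w, z> = z^H w

# Conjugate symmetry: <z, w> is the complex conjugate of <w, z>
print(np.isclose(inner_zw, np.conj(inner_wz)))   # True

# ||z||^2 = z^H z is real and nonnegative
print((z.conj() @ z).real)                        # 15.0
```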
7.3.2 Why Orthonormal Bases are Important¶
We want orthonormal bases for our inner product spaces because of how useful they are for performing computations. It will almost always be the case that when we require a basis for a vector space in STEM, we require an orthonormal basis.
Theorem 7.3.1¶
If $B = \left\{ \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_n \right\}$ is an orthonormal basis for an $n$-dimensional inner product space $V$, then the scalar projection of any vector onto a basis vector $\mathbf{w}_k$ is the $k^{\text{th}}$ coordinate of the vector with respect to the orthonormal basis $B$.
If $\mathbf{z}\in V$ and
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \ldots + c_n\mathbf{w}_n $$
then
$$ c_i = \langle \mathbf{z},\mathbf{w}_i \rangle $$
Proof:¶
To show that Theorem 7.3.1 is true, one need only compute the inner product of the vector $\mathbf{z}$ with an arbitrary basis vector $\mathbf{w}_k$,
$$ \begin{align*} \left\langle \mathbf{z},\mathbf{w}_k \right\rangle &= \left\langle \sum_{i=1}^n c_i\mathbf{w}_i, \mathbf{w}_k \right\rangle = \sum_{i=1}^n \langle c_i\mathbf{w}_i, \mathbf{w}_k \rangle \\ \\ &= \sum_{i=1}^n c_i\langle \mathbf{w}_i, \mathbf{w}_k \rangle = \sum_{i=1}^n c_i\delta_{ik} = c_k \\ \\ \end{align*} $$
Thus $c_k = \langle \mathbf{z}, \mathbf{w}_k \rangle$.$\tombstone$
We are comfortable with the fact that each coordinate of a vector is in fact the projection of the vector onto the appropriate coordinate axis when we use the canonical basis vectors
$$ \left\{ \mathbf{e}_1,\ \mathbf{e}_2,\ \dots,\ \mathbf{e}_n \right\} $$
We learn from this theorem that the same holds for any orthonormal basis: each coordinate with respect to this non-standard orthonormal basis is the inner product of the vector with the corresponding basis vector. In particular, the projection of $\mathbf{z}$ onto the axis $\text{Span}\left\{ \mathbf{w}_i \right\}$ is $c_i\mathbf{w}_i$.
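To see Theorem 7.3.1 in action numerically, the sketch below builds an orthonormal basis of $\mathbb{C}^3$ from the QR factorization of an arbitrary complex matrix and recovers the coordinates of a vector as inner products with the basis vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal basis {w_1, w_2, w_3} of C^3 from a QR factorization
# (the matrix M is arbitrary; any matrix with independent columns would do)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)                     # columns of Q are orthonormal
w1, w2, w3 = Q[:, 0], Q[:, 1], Q[:, 2]

z = np.array([1 + 1j, 2 - 1j, -1 + 2j])    # arbitrary vector

# Theorem 7.3.1: the k-th coordinate is c_k = <z, w_k> = w_k^H z
c = np.array([wk.conj() @ z for wk in (w1, w2, w3)])

# Rebuilding z from its coordinates recovers the original vector
z_rebuilt = c[0] * w1 + c[1] * w2 + c[2] * w3
print(np.allclose(z, z_rebuilt))           # True
```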
Likewise, the inner product of two vectors in a finite dimensional inner product space can be computed using only the coordinates of each vector.
Corollary 7.3.2¶
The Inner Product of Vectors with respect to an Orthonormal Basis in a finite dimensional inner product space $V$ is the Sum of the Component-wise Products
Let $\left\{\mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\right\}$ be an orthonormal basis for a complex inner product space $V$. If
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i \qquad\qquad \mathbf{u} = \sum_{i=1}^n d_i\mathbf{w}_i $$
then
$$ \langle \mathbf{z},\mathbf{u} \rangle = \sum_{i=1}^n c_i d_i^* \qquad\qquad \langle \mathbf{u},\mathbf{z} \rangle = \sum_{i=1}^n d_i c_i^* $$
Proof:¶
As in Theorem 7.3.1, one need only compute the inner product of the two vectors.
$$ \begin{align*} \langle \mathbf{z},\mathbf{u} \rangle &= \left\langle \displaystyle\sum_{i=1}^n c_i\mathbf{w}_i, \mathbf{u} \right\rangle = \displaystyle\sum_{i=1}^n \left\langle c_i\mathbf{w}_i, \mathbf{u} \right\rangle \\ \\ &= \displaystyle\sum_{i=1}^n c_i\left\langle \mathbf{w}_i, \mathbf{u} \right\rangle = \displaystyle\sum_{i=1}^n c_i\left\langle \mathbf{u},\mathbf{w}_i \right\rangle^* \\ \\ &= \displaystyle\sum_{i=1}^n c_i\left\langle \displaystyle\sum_{j=1}^n d_j\mathbf{w}_j,\mathbf{w}_i \right\rangle^* = \displaystyle\sum_{i=1}^n c_i \left( \displaystyle\sum_{j=1}^n \left\langle d_j\mathbf{w}_j,\mathbf{w}_i \right\rangle^* \right) \\ \\ &= \displaystyle\sum_{i=1}^n c_i \left( \displaystyle\sum_{j=1}^n d_j^*\left\langle \mathbf{w}_j,\mathbf{w}_i \right\rangle^* \right) = \displaystyle\sum_{i=1}^n \left( \displaystyle\sum_{j=1}^n c_id_j^*\left\langle \mathbf{w}_j,\mathbf{w}_i \right\rangle^* \right) \\ \\ &= \displaystyle\sum_{i=1}^n \left( \displaystyle\sum_{j=1}^n c_id_j^*\delta_{ji}^* \right) = \displaystyle\sum_{i=1}^n c_id_i^* \end{align*} $$
The proof for $\left\langle \mathbf{u},\mathbf{z} \right\rangle$ is very similar, so this completes the proof.$\tombstone$
This shows that our reasoning for creating the matrix representation of the inner product remains the same for any orthonormal basis.
Finally, we can prove another interesting property about the norm of a vector in an inner product space.
Corollary 7.3.3¶
Parseval's Identity¶
If $\left\{\mathbf{w}_1,\mathbf{w}_2,\ldots,\mathbf{w}_n\right\}$ is an orthonormal basis for a complex inner product space $V$ and
$$ \mathbf{z} = \sum_{i=1}^n c_i\mathbf{w}_i $$
then
$$ \|\mathbf{z}\|^2 = \langle \mathbf{z},\mathbf{z} \rangle = \sum_{i=1}^n c_i\bar{c}_i $$
In particular, Parseval's Identity holds in $\mathbb{C}^n$ with the standard inner product
$$ \langle \mathbf{z},\mathbf{w} \rangle = \mathbf{w}^H \mathbf{z} $$
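A quick numerical illustration of Parseval's Identity in $\mathbb{C}^3$ (the orthonormal basis comes from the QR factorization of an arbitrary complex matrix, and the vector is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# Orthonormal basis of C^3 (the columns of Q) via QR of an arbitrary complex matrix
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))

z = np.array([2 - 1j, 1 + 1j, -3j])         # arbitrary vector
c = Q.conj().T @ z                          # coordinates c_k = <z, w_k> = w_k^H z

# Parseval's Identity: ||z||^2 equals the sum of |c_k|^2
print(np.isclose(np.linalg.norm(z)**2, np.sum(c * c.conj()).real))   # True
```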
Exercise 1¶
Given
$$ \mathbf{z} = \begin{bmatrix} 1 + 3i \\ -2 -i \end{bmatrix} \qquad\qquad\qquad \mathbf{w} = \begin{bmatrix} 3 + 2i \\ 5 - i \end{bmatrix} $$
compute
- $\mathbf{w}^H\mathbf{z}$
- $\mathbf{z}^H\mathbf{z}$
- $\mathbf{w}^H\mathbf{w}$
Follow Along
$$ \begin{align*} \mathbf{w}^H\mathbf{z} &= \begin{bmatrix} 3 - 2i & 5 + i \end{bmatrix}\begin{bmatrix} 1 + 3i \\ -2 -i \end{bmatrix} \\ \\ &= (3-2i+9i-6i^2) + (-10-2i-5i-i^2) \\ \\ &= (9+7i) + (-9 - 7i) = 0 \\ \\ \mathbf{z}^H\mathbf{z} &= |1+3i|^2 + |-2-i|^2 = 15 \\ \\ \mathbf{w}^H\mathbf{w} &= |3+2i|^2 + |5-i|^2 = 39 \end{align*} $$
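These computations can be reproduced with a few lines of NumPy:

```python
import numpy as np

z = np.array([1 + 3j, -2 - 1j])
w = np.array([3 + 2j, 5 - 1j])

print(w.conj() @ z)            # w^H z = 0 (so z and w are orthogonal)
print((z.conj() @ z).real)     # z^H z = 15
print((w.conj() @ w).real)     # w^H w = 39
```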
7.3.3 Hermitian Matrices¶
Let $Z\in\mathbb{C}^{m\times n}$ denote an $m\times n$ matrix whose entries are complex numbers. We can write $Z$ in component form as $[z_{jk}]$, where each $z_{jk} = a_{jk} + ib_{jk}$ with $a_{jk},b_{jk}\in\mathbb{R}$.
Using the definition of matrix addition, it is possible for us to write $Z$ as the sum
$$ Z = [z_{jk}] = [a_{jk} + ib_{jk}] = [a_{jk}] + i[b_{jk}] = A + iB $$
where $A = [a_{jk}]$ and $B = [b_{jk}]$ are both in $\mathbb{R}^{m\times n}$. This allows us to define the conjugate of the matrix $Z$ as
$$ Z^* = A - iB $$
Simply put, this works the way we are accustomed to conjugates operating: just negate all of the imaginary parts. For complex-valued matrices, we also need to make sure that we use the conjugate transpose, so $Z^H = \left(Z^*\right)^T$ just as with complex vectors. The conjugate transpose follows rules similar to those we are familiar with for the ordinary transpose:
$\left(Z^H\right)^H = Z$
$\left( \alpha Z + \beta W \right)^H = \bar{\alpha}Z^H + \bar{\beta}W^H$
$\left( ZW \right)^H = W^H Z^H $
Definition of Hermitian Matrix¶
A square matrix $A\in\mathbb{C}^{n\times n}$ is Hermitian if $A^H = A$, that is, if it equals its own conjugate transpose.
Example 1¶
Both
$$ Z = \begin{bmatrix} -1 & 2 - 2i \\ 2 + 2i & 3 \end{bmatrix} \qquad\qquad\qquad
W = \begin{bmatrix} 3 & 5 + i & 4 - 2i \\ 5 - i & -7 & -3 + 2i \\ 4 + 2i & -3 - 2i & 2 \end{bmatrix} $$
are Hermitian since
$$ \begin{align*}
Z^H = \begin{bmatrix} \overline{-1} & \overline{2 + 2i} \\ \overline{2 - 2i} & \overline{3} \end{bmatrix} &= \begin{bmatrix} -1 & 2 - 2i \\ 2 + 2i & 3 \end{bmatrix} = Z \\
\\
W^H = \begin{bmatrix} \overline{3} & \overline{5 - i} & \overline{4 + 2i} \\ \overline{5 + i} & \overline{-7} & \overline{-3 - 2i} \\ \overline{4 - 2i} & \overline{-3 + 2i} & \overline{2} \end{bmatrix} &= \begin{bmatrix} 3 & 5 + i & 4 - 2i \\ 5 - i & -7 & -3 + 2i \\ 4 + 2i & -3 - 2i & 2 \end{bmatrix} = W
\end{align*} $$
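In NumPy, one can check the Hermitian property directly by comparing a matrix with its conjugate transpose:

```python
import numpy as np

Z = np.array([[-1, 2 - 2j],
              [2 + 2j, 3]])

W = np.array([[3, 5 + 1j, 4 - 2j],
              [5 - 1j, -7, -3 + 2j],
              [4 + 2j, -3 - 2j, 2]])

# A matrix is Hermitian exactly when it equals its conjugate transpose
for M in (Z, W):
    print(np.allclose(M, M.conj().T))   # True for both
```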
Theorem 7.3.4¶
Hermitian matrices have real eigenvalues, and eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof:¶
Suppose $A$ is a Hermitian matrix, and let $\lambda$ be an eigenvalue of $A$ with corresponding eigenvector $\mathbf{x}$. Set the scalar $\alpha = \mathbf{x}^H\! A\mathbf{x}$, then
$$ \bar{\alpha} = \alpha^H = \left(\mathbf{x}^H\! A\mathbf{x}\right)^H = \mathbf{x}^H A^H \left(\mathbf{x}^H\right)^H = \mathbf{x}^H\! A\mathbf{x} = \alpha $$
So $\alpha\in\mathbb{R}$. Also,
$$ \alpha = \mathbf{x}^H\! A\mathbf{x} = \mathbf{x}^H\! \lambda\mathbf{x} = \lambda\mathbf{x}^H \mathbf{x} = \lambda\|\mathbf{x}\|^2 $$
which implies that $\lambda\in\mathbb{R}$ since
$$ \lambda = \dfrac{\alpha}{\|\mathbf{x}\|^2} $$
Next, we want to show that eigenvectors for distinct eigenvalues are orthogonal, so let $\mathbf{x}_1$ and $\mathbf{x}_2$ be eigenvectors for the distinct eigenvalues $\lambda_1$ and $\lambda_2$, respectively. Recalling that $\lambda_1$ is real, the expressions
$$ \begin{align*} \left(A\mathbf{x}_1\right)^H \mathbf{x}_2 = \mathbf{x}_1^H A^H \mathbf{x}_2 &= \mathbf{x}_1^H A \mathbf{x}_2 = \lambda_2 \mathbf{x}_1^H \mathbf{x}_2 \\ \\ \left(A\mathbf{x}_1\right)^H \mathbf{x}_2 = \left(\mathbf{x}_2^H A\mathbf{x}_1\right)^H &= \left(\lambda_1\mathbf{x}_2^H \mathbf{x}_1\right)^H = \lambda_1 \mathbf{x}_1^H \mathbf{x}_2 \end{align*} $$
show that
$$ \lambda_1 \mathbf{x}_1^H \mathbf{x}_2 = \lambda_2 \mathbf{x}_1^H \mathbf{x}_2 $$
which can only mean that $\langle \mathbf{x}_1, \mathbf{x}_2 \rangle = 0$, since $\lambda_1 \neq \lambda_2$. $\tombstone$
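A numerical sanity check of Theorem 7.3.4 (the Hermitian matrix below is built from an arbitrary complex matrix $B$ as $B + B^H$, which is always Hermitian):

```python
import numpy as np

rng = np.random.default_rng(2)

# B + B^H is always Hermitian, so this gives an arbitrary Hermitian matrix
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T

# np.linalg.eigh is designed for Hermitian matrices: it returns real
# eigenvalues and an orthonormal set of eigenvectors (the columns of V)
eigvals, V = np.linalg.eigh(A)

print(eigvals.dtype)                             # float64: the eigenvalues are real
print(np.allclose(V.conj().T @ V, np.eye(4)))    # the eigenvectors are orthonormal
```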
7.3.4 Unitary Matrices¶
Orthogonal matrices have a complex-valued counterpart called unitary matrices.
Definition of Unitary Matrix¶
A matrix $U\in\mathbb{C}^{n\times n}$ is unitary if its columns form an orthonormal set in inner product space $\mathbb{C}^n$.
Lemma 7.3.5¶
If $U$ is a unitary matrix, then $U$ is invertible and $U^H = U^{-1}$.
Proof¶
Since a unitary matrix $U$ is square and its columns form an orthonormal set, just as with orthogonal matrices we have
$$ U^HU = \begin{bmatrix} \delta_{jk} \end{bmatrix} = I_n $$
Hence $UU^H = U^HU = I_n$, and $U^H = U^{-1}$.$\tombstone$
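Numerically, a unitary matrix can be obtained from the QR factorization of an arbitrary complex matrix, and Lemma 7.3.5 can then be verified directly:

```python
import numpy as np

rng = np.random.default_rng(5)

# A unitary matrix from the QR factorization of an arbitrary complex matrix
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))

# Lemma 7.3.5: U^H U = I and U^H is the inverse of U
print(np.allclose(U.conj().T @ U, np.eye(3)))       # True
print(np.allclose(U.conj().T, np.linalg.inv(U)))    # True
```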
Lemma 7.3.6¶
For every Hermitian matrix whose eigenvalues are all distinct, there is a unitary diagonalizing matrix.
Proof:¶
Let $A$ be a Hermitian $n\times n$ matrix with $n$ distinct eigenvalues $\lambda_j$, $1\le j\le n$. By Theorem 7.3.4, each eigenvalue is real, and eigenvectors of distinct eigenvalues are orthogonal.
Let $\mathbf{x}_i$ be an eigenvector corresponding to eigenvalue $\lambda_i$ of $A$. For each $i=1,2,\ldots,n$, set
$$ \mathbf{u}_i = \dfrac{\mathbf{x}_i}{\|\mathbf{x}_i\|} $$
so $\mathbf{u}_i$ is a unit eigenvector for $\lambda_i$. By Theorem 6.4.1, $\left\{\mathbf{u}_1,\mathbf{u}_2,\ldots,\mathbf{u}_n\right\}$ is an orthonormal set in $\mathbb{C}^n$. Let
$$ U = \begin{bmatrix}\ \mathbf{u}_1 & \mathbf{u}_2 & \ldots & \mathbf{u}_n\ \end{bmatrix} $$
so $U$ is unitary and diagonalizes $A$.∎
Exercise 2¶
Let
$$ A = \begin{bmatrix} -1 & 2 + i \\ 2 - i & 3 \end{bmatrix} $$
Find a unitary matrix $U$ that diagonalizes $A$.
Follow Along
We begin by finding the eigenvalues of $A$.
$$ \begin{align*} p(\lambda) = |A - \lambda I| &= \begin{vmatrix} -1 - \lambda & 2 + i \\ 2 - i & 3 - \lambda \end{vmatrix} \\ \\ &= (-1-\lambda)(3-\lambda) - (2-i)(2+i) \\ \\ &= \lambda^2 - 2\lambda - 8 \\ \\ &= (\lambda + 2)(\lambda - 4) \end{align*} $$
So the eigenvalues of $A$ are $\lambda_1 = -2$ and $\lambda_2 = 4$. We want to find the associated unit eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$, so we determine bases for $N(A + 2I)$ and $N(A - 4I)$ by determining the reduced row echelon form of each of these matrices.
$$ \begin{align*} A + 2I &= \begin{bmatrix} 1 & 2+i \\ 2-i & 5 \end{bmatrix}\xrightarrow[-(2-i)R_1 + R_2]{~} \begin{bmatrix} 1 & 2+i \\ 0 & 0 \end{bmatrix} \\ \\ A - 4I &= \begin{bmatrix} -5 & 2+i \\ 2-i & -1 \end{bmatrix}\xrightarrow[\frac{1}{2+i}R_1 + R_2]{~} \begin{bmatrix} -5 & 2+i \\ 0 & 0 \end{bmatrix} \end{align*} $$
From these relationships, we see that eigenvectors associated with $\lambda_1$ and $\lambda_2$ are
$$ \mathbf{x}_1 = \begin{bmatrix} 2 + i \\ -1 \end{bmatrix} \qquad\qquad\qquad \mathbf{x}_2 = \begin{bmatrix} 2 + i \\ 5 \end{bmatrix} $$
respectively. We can normalize each of these vectors to determine the unit eigenvectors
$$ \mathbf{u}_1 = \begin{bmatrix} \frac{2}{\sqrt{6}} + \frac{i}{\sqrt{6}} \\ -\frac{1}{\sqrt{6}} \end{bmatrix} \qquad\qquad\qquad \mathbf{u}_2 = \begin{bmatrix} \frac{2}{\sqrt{30}} + \frac{i}{\sqrt{30}} \\ \frac{5}{\sqrt{30}} \end{bmatrix} $$
Hence the unitary matrix $U$ that diagonalizes $A$ is given by
$$ U = \begin{bmatrix} \frac{2}{\sqrt{6}} + \frac{i}{\sqrt{6}} & \frac{2}{\sqrt{30}} + \frac{i}{\sqrt{30}} \\ -\frac{1}{\sqrt{6}} & \frac{5}{\sqrt{30}} \end{bmatrix} $$
and we have that
$$ A = UDU^H = \begin{bmatrix} \frac{2}{\sqrt{6}} + \frac{i}{\sqrt{6}} & \frac{2}{\sqrt{30}} + \frac{i}{\sqrt{30}} \\ -\frac{1}{\sqrt{6}} & \frac{5}{\sqrt{30}} \end{bmatrix} \begin{bmatrix} -2 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \frac{2}{\sqrt{6}} - \frac{i}{\sqrt{6}} & -\frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{30}} - \frac{i}{\sqrt{30}} & \frac{5}{\sqrt{30}} \end{bmatrix} $$
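We can verify this diagonalization numerically:

```python
import numpy as np

A = np.array([[-1, 2 + 1j],
              [2 - 1j, 3]])

U = np.array([[(2 + 1j) / np.sqrt(6), (2 + 1j) / np.sqrt(30)],
              [-1 / np.sqrt(6),        5 / np.sqrt(30)]])
D = np.diag([-2, 4])

print(np.allclose(U.conj().T @ U, np.eye(2)))   # U is unitary
print(np.allclose(U @ D @ U.conj().T, A))       # A = U D U^H
```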
7.3.5 Schur Decomposition and the Spectral Theorem¶
Theorem 7.3.7¶
Schur Decomposition Theorem¶
For each $n\times n$ matrix $A$, there is a unitary matrix $\,U$ such that $\,U^H\! AU$ is an upper triangular matrix.
Proof:¶
We prove the theorem by mathematical induction. For the case of $n=1$, no work is required and the result clearly holds. Now, suppose that the result holds for the $(n-1)^\text{st}$ case of $(n-1)\times (n-1)$ matrices. We need to show that this implies the result holds for $n\times n$ matrices.
Let $A$ be an $n\times n$ matrix with eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_n$ including multiplicities. Let $\mathbf{v}_1$ be an eigenvector corresponding to $\lambda_1$ with $\|\mathbf{v}_1\| = 1$. Using the Gram-Schmidt process, we can extend $\mathbf{v}_1$ to an orthonormal basis $\left\{\mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_n\right\}$ for $\mathbb{C}^n$. Define $V$ such that
$$ V = \begin{bmatrix} \ \mathbf{v}_1 & \mathbf{v}_2 & \ldots & \mathbf{v}_n \end{bmatrix} $$
so $V$ is a unitary matrix.
The first column of the matrix $V^H\! AV$ is given by $V^H\! A\mathbf{v}_1$, so we know that
$$ V^H\! A\mathbf{v}_1 = \lambda_1 V^H \mathbf{v}_1 = \lambda_1\mathbf{e}_1 $$
and therefore
$$ V^H\! AV = \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] $$
where $\overset{\sim}{A}$ is an $(n-1)\times (n-1)$ matrix. Since $V$ is a unitary matrix, $A$ and $V^H\! AV$ are similar and therefore have the same eigenvalues. Expanding the determinant along the first column of this block triangular matrix gives $p_A(\lambda) = (\lambda_1 - \lambda)p_{\overset{\sim}{A}}(\lambda)$, so the eigenvalues of $\overset{\sim}{A}$ (including multiplicities) are $\lambda_2,\ldots,\lambda_n$.
By the induction hypothesis, there is a unitary $(n-1)\times(n-1)$ matrix $\overset{\sim}{W}$ that allows us to write $\overset{\sim}{A}$ as
$$ \overset{\sim}{A} = \overset{\sim}{W}\begin{bmatrix} \lambda_2 & x & \ldots & x \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & x \\ 0 & \ldots & 0 & \lambda_n \end{bmatrix} \overset{\sim}{W}^H $$
so
$$ \overset{\sim}{W}^H\! \overset{\sim}{A}\overset{\sim}{W} = \begin{bmatrix} \lambda_2 & x & \ldots & x \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & x \\ 0 & \ldots & 0 & \lambda_n \end{bmatrix} $$
We may now write
$$ \begin{align*} \left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W} \\ 0 \end{array}\right]^H & \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right]\left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W} \\ 0 \end{array}\right] \\ \\ &= \left[\begin{array}{c|ccc} 1 & 0 & \ldots & 0 \\ \hline 0 \\ \vdots & & \overset{\sim}{W}^H \\ 0 \end{array}\right]\left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A}\overset{\sim}{W} \\ 0 \end{array}\right] \\ \\ &= \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{W}^H\!\overset{\sim}{A}\overset{\sim}{W} \\ 0 \end{array}\right] \\ \\ &= \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right] \end{align*} $$
Defining the matrix $W = \left[\begin{array}{c|c} 1 & 0 \\ \hline 0 & \overset{\sim}{W} \end{array}\right]$, which is unitary, we have
$$ \left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] = W \left[\begin{array}{cccc} \lambda_1 & x & \ldots & x \\ 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right] W^H $$
which proves the theorem since
$$ A = V\left[\begin{array}{c|ccc} \lambda_1 & x & \ldots & x \\ \hline 0 \\ \vdots & & \overset{\sim}{A} \\ 0 \end{array}\right] V^H = VW \left[\begin{array}{cccc} \lambda_1 & x & \ldots & x \\ 0 & \lambda_2 & \ldots & x \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right] (VW)^H $$
and the product of unitary matrices is unitary. ∎
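SciPy computes a complex Schur decomposition directly; the sketch below (with an arbitrary complex matrix) confirms that the factor $U$ is unitary, the factor $T$ is upper triangular, and $A = UTU^H$:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))   # arbitrary matrix

# Complex Schur decomposition: A = U T U^H with T upper triangular
T, U = schur(A, output='complex')

print(np.allclose(U.conj().T @ U, np.eye(4)))   # U is unitary
print(np.allclose(np.tril(T, -1), 0))           # T is upper triangular
print(np.allclose(U @ T @ U.conj().T, A))       # A = U T U^H
```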
In the special case that $A$ is a Hermitian matrix, the triangular matrix from a Schur decomposition will be diagonal.
Theorem 7.3.8¶
Spectral Theorem¶
If $A$ is a Hermitian matrix, then there exists a unitary matrix $U$ that diagonalizes $A$.
Proof¶
By Theorem 7.3.7, we know that there is a matrix $U$ such that $T = U^H\! A U$ is upper triangular. In addition,
$$ T^H = (U^H\! A U)^H = U^H\! A^H U = U^H\! A U = T $$
so $T$ is Hermitian and therefore must be diagonal.∎
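For a Hermitian matrix, `np.linalg.eigh` produces exactly this unitary diagonalization; the sketch below uses an arbitrary Hermitian matrix built as $B + B^H$:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = B + B.conj().T                      # an arbitrary Hermitian matrix

# For a Hermitian matrix, the triangular Schur factor is diagonal, and
# np.linalg.eigh returns the unitary diagonalization A = U D U^H directly
eigvals, U = np.linalg.eigh(A)

print(np.allclose(U @ np.diag(eigvals) @ U.conj().T, A))   # True
```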
Your use of this self-initiated mediated course material is subject to our Creative Commons License 4.0