A simple way to model weather is to assume tomorrow's weather depends only on today's — a "memoryless" assumption that makes it a Markov chain. The states are the possible weather conditions; here, Sunny, Cloudy, Rainy.
Building the matrix. Each column describes tomorrow's weather given today's. Suppose that after a sunny day there is a 70% chance of sun, 20% cloud, and 10% rain. Those three numbers form the "from Sunny" column:
$$ \text{column } 1 = \begin{bmatrix} 0.7 \\ 0.2 \\ 0.1 \end{bmatrix}. $$
Filling in the cloudy and rainy days the same way gives
$$P = \begin{bmatrix} .7 & .3 & .2 \\ .2 & .4 & .3 \\ .1 & .3 & .5 \end{bmatrix}$$
Each column sums to $1$ because, whatever today's weather, tomorrow is certainly one of the three conditions.
Long-run behavior. Every entry of $P$ is positive, so $P$ is regular and Theorem 2 guarantees a unique steady state, approached no matter what today's weather is.
Solve $(P-I)\overline{X}=\mathbf{0}$ with $x_1+x_2+x_3=1$ to find the steady state.
What fraction of days are sunny in the long run? Show all of your steps.
Compute the determinant $|P - tI|$. This will create a polynomial with variable $t$. You'll need to use the Laplace expansion. Show all of your steps. This is called the characteristic polynomial of $P$.
Imagine a person browsing the web who, on each page, clicks one of that page's outgoing links at random, then repeats — forever. This is a Markov chain: the "states" are the web pages, and the transition probabilities come from the links. The question PageRank answers is which pages are most important? The guiding idea is circular — a page is important if important pages link to it — and the steady state is exactly what resolves that circularity: it is the long-run fraction of time the random surfer spends on each page.
Consider a tiny web of four pages with these links:
- Page 1 links to page 2.
- Page 2 links to pages 1 and 3.
- Page 3 links to pages 1 and 4.
- Page 4 links to page 2.
Turning the links into a matrix. Each column describes where the surfer goes from one page, dividing probability equally among that page's outgoing links. For example, page 2 links to pages 1 and 3, so a surfer on page 2 moves to page 1 or page 3 with probability $\tfrac{1}{2}$ each, and to no other page — that is column 2:
$$ \text{column } 2 = \begin{bmatrix} \tfrac{1}{2} \\ 0 \\ \tfrac{1}{2} \\ 0 \end{bmatrix}. $$
Reading off every page the same way gives
$$ P=\begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 1 & 0 & 0 & 1 \\ 0 & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} & 0 \end{bmatrix}. $$
Finding the ranking. Every page can be reached from every other (the web is strongly connected) and $P^{6}$ has all positive entries, so $P$ is regular. By Theorem 2 the chain has a unique steady state $\overline{X}$, approached from any starting page — and that $\overline{X}$ is the importance score we want.
Solve $(P-I)\overline{X}=\mathbf{0}$ with $x_1+x_2+x_3+x_4=1$ to find the steady state. What is the page ranking for these 4 web pages?
Find the characteristic polynomial of $P$ by computing the determinant $|P - tI|$. Show all of your steps.
Factor the characteristic polynomial to find the roots of the characteristic polynomial. Show all of your steps.
Picture a token sitting on a node of a network. Each step, it moves along one of the edges leaving its current node, chosen at random. This is a Markov chain whose state diagram (§2.5.7) is the network. Take a triangle on nodes $1,2,3$ (every pair joined) with a fourth node attached to node $3$ by a single edge.
Building the matrix. From a node with $d$ neighbors, the token goes to each neighbor with probability $\tfrac{1}{d}$. Node $3$ touches nodes $1$, $2$, and $4$, so from node $3$ the token moves to each with probability $\tfrac{1}{3}$ — that is column $3$:
$$ \text{column } 3 = \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{3} \\ 0 \\ \tfrac{1}{3} \end{bmatrix}. $$
Doing this for all four nodes gives
$$ P=\begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{3} & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{3} & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 1 \\ 0 & 0 & \tfrac{1}{3} & 0 \end{bmatrix}. $$
Long-run behavior. The triangle makes the walk regular, so a unique steady state exists. For an equal-neighbor random walk, the steady-state probability of a node is proportional to its degree.
Solve $(P-I)\overline{X}=\mathbf{0}$ with $x_1+x_2+x_3+x_4=1$ to find the steady state. Use the steady state to determine the relative frequency that each node will be visited on a random walk. Show all of your steps.
Find the characteristic polynomial using MATLAB.
>> format rat >> P = [ 0 1/2 1/3 0; 1/2 0 1/3 0; 1/2 1/2 0 1; 0 0 1/3 0 ] >> poly = charpoly(P)Find the roots of the characteristic polynomial using MATLAB.
>> roots(poly)We will learn how to find the steady state using MATLAB in chapter 7.
