Math 511: Linear Algebra
2.5 Markov Chains
2.5.1 Markov Processes¶
$$ \require{color} \definecolor{brightblue}{rgb}{.267, .298, .812} \definecolor{darkblue}{rgb}{0.0, 0.0, 1.0} \definecolor{palepink}{rgb}{1, .73, .8} \definecolor{softmagenta}{rgb}{.99,.34,.86} \definecolor{blueviolet}{rgb}{.537,.192,.937} \definecolor{jonquil}{rgb}{.949,.792,.098} \definecolor{shockingpink}{rgb}{1, 0, .741} \definecolor{royalblue}{rgb}{0, .341, .914} \definecolor{alien}{rgb}{.529,.914,.067} \definecolor{crimson}{rgb}{1, .094, .271} \def\ihat{\mathbf{\hat{\unicode{x0131}}}} \def\jhat{\mathbf{\hat{\unicode{x0237}}}} \def\khat{\mathrm{\hat{k}}} \def\tombstone{\unicode{x220E}} \def\contradiction{\unicode{x2A33}} $$
Markov processes and stochastic processes are where linear algebra and statistics overlap. Markov processes can be found in many modern applications of matrices in the fields of thermodynamics, mechanics, physics, chemistry, economics, finance, signal processing, information theory, speech processing, and data science.
We will look at the basics of stochastic matrices in this course; however, you will find many applications of these matrices, including page-ranking algorithms.
2.5.2 States of a Population¶
Many applications involving populations or finance are represented by a finite number of conditions, associations, or labels. In mathematics, if we have a population of consumers, a population of voters, or a population of fish, then we can associate each member of the population with an attribute or state.
Technically the state $s$ of each element of a population $P$ is the result of a function $f:P\rightarrow S$, where each member of the population $x\in P$ is mapped to a state $s\in S$ in the state space. Generally we are not interested in the function that associates each member of the population with a state. Instead we are interested in the percentage of the total population that shares the same state.
For example,
- a population of voters may be described as Democratic, Republican, Independent, or affiliated with another political party.
- a population of consumers may purchase an Android$^{\text{TM}}$, Apple$^{\text{TM}}$, or other brand smart phone.
- a population of tuna may be described as Albacore, Allothunnus, Euthynnus, Frigate, or Skipjack.
In each case we want the relative size of each classification of the population. The relative size of the population is the fraction of the total population represented by the classification or state. This can also represent the probability that a member of the population fits into a given category. Let's consider an example.
Example 1¶
Freshman students' dating prospects can be modeled by a Markov process. Suppose that students describe their current dating status as single, in a relationship, or an ambiguous state variously termed "it's complicated", "hookup buddies", "friends with benefits", etc. Each month those percentages change according to a fixed set of probabilities: a fifth of the freshmen who are single form a relationship, a quarter of those in the "it's complicated" category become single, etc. We can organize those probabilities into a transition matrix, in which each entry gives the probability of a student transitioning from the relationship status described by their column to the relationship status described by their row:
$$ \begin{array}{lccc} & \text{Single} & \text{Relationship} & \text{Complicated} \\ \text{Single} & .30 & .30 & .25 \\ \text{Relationship} & .20 & .60 & .35 \\ \text{Complicated} & .50 & .10 & .40 \end{array} $$
Thus, after one month there is a 30% chance a single student will remain single, a 20% chance they will enter a relationship, and a 50% chance they will enter the "it's complicated" state. Notice that the sum of each column is 1, denoting that the entire population must end up in one of the finitely many states after each month or cycle.
Each probability that a freshman in one relationship category will end up in another category after a fixed amount of time (one month) is called a transition probability. Arranging the transition probabilities out of a given state into a column vector yields the transition vector for that state. Notice that the entries of each column vector must add up to 1, since we design the categories to cover the whole population.
$$ P = \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix} $$
Together these transition vectors form a transition matrix, Markov matrix, or matrix of transition probabilities, $P$. More generally any matrix whose entries are between 0 and 1, and whose columns add up to 1, is called a stochastic matrix. Thus our matrix $P$ is stochastic.
Suppose at the start of orientation 80% of freshmen are single, 10% are in a relationship, and 10% are "it's complicated". Our state is represented by a list of three numbers: the percentage of students in each category at the end of orientation. Thus our initial state is given by the state vector or state matrix
$$ \mathbf{x}_0 = \begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} $$
What percentage of students will be in a relationship in a month?
$$ \mathbf{x}_1 = P\mathbf{x}_0 = \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}\begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} = \begin{bmatrix} .295 \\ .255 \\ .45 \end{bmatrix} \\ $$
Approximately $25.5\%$ of the freshman population will be in a relationship one month after orientation.
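The update $\mathbf{x}_1 = P\mathbf{x}_0$ is an ordinary matrix-vector product, so it is easy to double-check with a few lines of code. A minimal sketch in plain Python (no libraries assumed), which also verifies that $P$ is stochastic:

```python
# Transition matrix P as a list of rows; column j holds the probabilities
# of leaving state j (single, relationship, complicated).
P = [[0.30, 0.30, 0.25],
     [0.20, 0.60, 0.35],
     [0.50, 0.10, 0.40]]

# Sanity check: every column of a stochastic matrix sums to 1.
for j in range(3):
    assert abs(sum(P[i][j] for i in range(3)) - 1.0) < 1e-12

def step(P, x):
    """One Markov step: return the matrix-vector product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

x0 = [0.80, 0.10, 0.10]           # initial state after orientation
x1 = step(P, x0)
print([round(v, 3) for v in x1])  # ≈ [0.295, 0.255, 0.45]
```

The entries of `x1` match the hand computation above: about $25.5\%$ of the population lands in the relationship state after one month.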
2.5.3 Markov Chains¶
A Markov Chain is a sequence of states $\left\{ X_0, X_1, X_2, \dots \right\} = \left\{ X_n \right\}$ that are related by the equation $X_{k+1} = PX_k$ for some stochastic matrix $P$. The $n^{\text{th}}$ state of the Markov Chain is given by
$$ X_n = P^nX_0 $$
Exercise 1¶
What percentage of freshmen will be in a relationship after 3 months? (Use GeoGebra or WolframAlpha.)
View Solution
$$ \begin{align*} \mathbf{x}_3 &= P\left(P\left(P\mathbf{x}_0\right)\right) = P^3\mathbf{x}_0 \\ \\ &= \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}^3\begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} \\ &= \begin{bmatrix} 0.2815 & 0.2875 & 0.284 \\ 0.3975 & 0.4195 & 0.408 \\ 0.3210 & 0.2930 & 0.308 \end{bmatrix}\begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} \\ \\ &\approx \begin{bmatrix} 0.28 \\ 0.40 \\ 0.32 \end{bmatrix} \end{align*} $$
Approximately $40\%$ of freshmen will be in a relationship.
Exercise 2¶
What percentage of students will be in a relationship at the beginning of their sophomore year?
View Solution
$$ \begin{align*} \mathbf{x}_{12} &= P^{12}\mathbf{x}_0 = \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}^{12} \begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} \\ \\ &= \begin{bmatrix} 0.2847222199195 & 0.2847222240135 & 0.284722221966 \\ 0.4097222153115 & 0.4097222275975 & 0.409722221454 \\ 0.305555564769 & 0.305555548389 & 0.30555555658 \end{bmatrix}\begin{bmatrix} .80 \\ .10 \\ .10 \end{bmatrix} \\ \\ &\approx \begin{bmatrix} 0.28 \\ 0.41 \\ 0.31 \end{bmatrix} \end{align*} $$
Approximately $41\%$ of students will be in a relationship at the beginning of their sophomore year.
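If GeoGebra or WolframAlpha is not handy, $P^{12}\mathbf{x}_0$ can be computed by applying $P$ twelve times in a loop; a minimal plain-Python sketch:

```python
P = [[0.30, 0.30, 0.25],
     [0.20, 0.60, 0.35],
     [0.50, 0.10, 0.40]]
x = [0.80, 0.10, 0.10]

# Apply the transition matrix twelve times: x_12 = P^12 x_0.
for _ in range(12):
    x = [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]

print([round(v, 4) for v in x])  # ≈ [0.2847, 0.4097, 0.3056]
```

Repeated matrix-vector products are cheaper than forming $P^{12}$ explicitly, and for a $3\times 3$ matrix either approach is instantaneous.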
2.5.4 Steady State of a Markov Chain¶
Notice that in our first example, the values of $\mathbf{x}_3$ and $\mathbf{x}_{12}$ are very close. In fact, if we compute the next state we find
$$ \mathbf{x}_{13} = P\mathbf{x}_{12} = \begin{bmatrix} 0.2847\dots \\ 0.4097\dots \\ 0.3056 \dots \end{bmatrix} $$
The values of $\mathbf{x}_n$ are converging to
$$ \overline{\mathbf{x}} \approx \begin{bmatrix} 0.284722 \\ 0.409722 \\ 0.305556 \end{bmatrix} $$
This state $\overline{\mathbf{x}}$ is called the steady state of the Markov Chain. It is called a steady state because the next state
$$ P\overline{\mathbf{x}} = \overline{\mathbf{x}} $$
is the same state.
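The steady state can be estimated numerically by applying $P$ until successive states agree; a sketch in plain Python (the tolerance `1e-12` and the iteration cap are arbitrary choices):

```python
P = [[0.30, 0.30, 0.25],
     [0.20, 0.60, 0.35],
     [0.50, 0.10, 0.40]]
x = [0.80, 0.10, 0.10]

# Iterate x <- P x until successive states agree to within a small tolerance.
for _ in range(10_000):          # cap the iterations, just in case
    nxt = [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]
    if max(abs(nxt[i] - x[i]) for i in range(3)) < 1e-12:
        break
    x = nxt

print([round(v, 6) for v in x])  # ≈ [0.284722, 0.409722, 0.305556]
```

This is the power-iteration idea: for a regular stochastic matrix the iterates converge to the steady state regardless of the (stochastic) starting vector.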
2.5.5 Regular Stochastic Matrices¶
A stochastic matrix $P$ is called regular if and only if all of the entries of some integer power of $P$ are positive. In our example, the stochastic matrix is regular because all of the entries of $P = P^1$ are positive.
Theorem 2.5.1¶
The Markov Chain of a regular stochastic matrix
$$ \left\{ X_0, PX_0, P^2X_0, P^3X_0, \dots, P^nX_0, \dots \right\} $$
approaches a steady state
$$ \overline{X} = \displaystyle\lim_{n\rightarrow\infty} P^nX_0 $$
2.5.6 Determining the Steady State¶
We can use some linear algebra and Gauss-Jordan Elimination to find a steady state of a stochastic matrix. If the stochastic matrix is regular, then we can determine the unique steady state as well. We want to solve for the vector $\overline{\mathbf{x}}$ such that $P\overline{\mathbf{x}}=\overline{\mathbf{x}}$. Thus
$$ \begin{align*} P\overline{\mathbf{x}} - \overline{\mathbf{x}} &= \mathbf{0} \\ \\ P\overline{\mathbf{x}} - I\overline{\mathbf{x}} &= \mathbf{0} \\ \\ (P - I)\overline{\mathbf{x}} &= \mathbf{0} \\ \end{align*} $$
Hence we want to solve the homogeneous equation $(P-I)\overline{\mathbf{x}} = \mathbf{0}$. We can add another equation to this square system of equations since we know that the sum of all of the probabilities in a state matrix or state vector must add up to 1.
$$ x_1 + x_2 + x_3 + \dots + x_n = 1 $$
Let us return to our first example
$$ \begin{align*} \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}\overline{\mathbf{x}} &= \overline{\mathbf{x}} \\ x_1 + x_2 + x_3 &= 1 \\ \\ \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}\overline{\mathbf{x}} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\overline{\mathbf{x}} &= \mathbf{0} \\ x_1 + x_2 + x_3 &= 1 \\ \\ \begin{bmatrix} .30-1 & .30 & .25 \\ .20 & .60-1 & .35 \\ .50 & .10 & .40-1 \end{bmatrix}\overline{\mathbf{x}} &= \mathbf{0} \\ x_1 + x_2 + x_3 &= 1 \\ \\ -.70x_1 + .30x_2 + .25x_3 &= 0 \\ .20x_1 - .40x_2 + .35x_3 &= 0 \\ .50x_1 + .10x_2 - .60x_3 &= 0 \\ x_1 +\ \ \ \ \ x_2 +\ \ \ \ \ x_3 &= 1 \end{align*} $$
We create the augmented matrix of our new non-homogeneous linear system and reduce it using Gauss-Jordan elimination.
$$ \begin{align*} \begin{bmatrix} -.70 & .30 & .25 & | & 0 \\ .20 & -.40 & .35 & | & 0 \\ .50 & .10 & -.60 & | & 0 \\ 1 & 1 & 1 & | & 1 \end{bmatrix}\begin{array}{l} 100R_1 \\ 100R_2 \\ 100R_3 \\ \\ \end{array} &\rightarrow\begin{bmatrix} -70 & 30 & 25 & | & 0 \\ 20 & -40 & 35 & | & 0 \\ 50 & 10 & -60 & | & 0 \\ 1 & 1 & 1 & | & 1 \end{bmatrix}\begin{array}{l} R_4 \\ \\ \\ R_1 \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 1 & | & 1 \\ 20 & -40 & 35 & | & 0 \\ 50 & 10 & -60 & | & 0 \\ -70 & 30 & 25 & | & 0 \end{bmatrix}\begin{array}{l} \\ R_2-20R_1 \\ R_3-50R_1 \\ R_4+70R_1 \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 1 & | & 1 \\ 0 & -60 & 15 & | & -20 \\ 0 & -40 & -110 & | & -50 \\ 0 & 100 & 95 & | & 70 \end{bmatrix}\begin{array}{l} \\ R_2-R_3 \\ \\ R_4+R_2 \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 1 & | & 1 \\ 0 & -20 & 125 & | & 30 \\ 0 & -40 & -110 & | & -50 \\ 0 & 40 & 110 & | & 50 \end{bmatrix}\begin{array}{l} \\ \\ R_3-2R_2 \\ R_4+R_3 \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 1 & | & 1 \\ 0 & -20 & 125 & | & 30 \\ 0 & 0 & -360 & | & -110 \\ 0 & 0 & 0 & | & 0 \end{bmatrix}\begin{array}{l} \\ \\ \frac{R_3}{-360} \\ \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 1 & | & 1 \\ 0 & -20 & 125 & | & 30 \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix}\begin{array}{l} R_1-R_3 \\ R_2-125R_3 \\ \\ \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 0 & | & \frac{25}{36} \\ 0 & -20 & 0 & | & \frac{1080}{36}-\frac{1375}{36} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ \\ &=\begin{bmatrix} 1 & 1 & 0 & | & \frac{25}{36} \\ 0 & -20 & 0 & | & - \frac{295}{36} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix}\begin{array}{l} \\ \frac{R_2}{-20} \\ \\ \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 1 & 0 & | & \frac{25}{36} \\ 0 & 1 & 0 & | & \frac{295}{720} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix}\begin{array}{l} 
R_1-R_2 \\ \\ \\ \\ \end{array} \\ \\ &\rightarrow\begin{bmatrix} 1 & 0 & 0 & | & \frac{500}{720}-\frac{295}{720} \\ 0 & 1 & 0 & | & \frac{295}{720} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ &=\begin{bmatrix} 1 & 0 & 0 & | & \frac{205}{720} \\ 0 & 1 & 0 & | & \frac{295}{720} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \\ &=\begin{bmatrix} 1 & 0 & 0 & | & \frac{41}{144} \\ 0 & 1 & 0 & | & \frac{59}{144} \\ 0 & 0 & 1 & | & \frac{11}{36} \\ 0 & 0 & 0 & | & 0 \end{bmatrix} \end{align*} $$
Let us check to make sure that $\overline{\mathbf{x}} = \begin{bmatrix} \frac{41}{144} \\ \frac{59}{144} \\ \frac{11}{36} \end{bmatrix}$ is the steady state vector for the regular stochastic matrix $P$.
$$ \begin{align*} \frac{41}{144} + \frac{59}{144} + \frac{11}{36} &= \frac{41 + 59 + 44}{144} = \frac{144}{144} = 1 \color{green}{\huge{\checkmark}} \\ \\ \begin{bmatrix} .30 & .30 & .25 \\ .20 & .60 & .35 \\ .50 & .10 & .40 \end{bmatrix}\begin{bmatrix} \frac{41}{144} \\ \frac{59}{144} \\ \frac{11}{36} \end{bmatrix} &= \begin{bmatrix} \frac{1230 + 1770 + 1100}{14400} \\ \frac{820 + 3540 + 1540}{14400} \\ \frac{2050 + 590 + 1760}{14400} \end{bmatrix} = \begin{bmatrix} \frac{4100}{14400} \\ \frac{5900}{14400} \\ \frac{4400}{14400} \end{bmatrix} \color{green}{\huge{\checkmark}} \end{align*} $$
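The hand check above can be reproduced exactly with rational arithmetic; a sketch using Python's standard `fractions` module, so no rounding is involved:

```python
from fractions import Fraction as F

# The transition matrix and the claimed steady state, as exact fractions.
P = [[F(30, 100), F(30, 100), F(25, 100)],
     [F(20, 100), F(60, 100), F(35, 100)],
     [F(50, 100), F(10, 100), F(40, 100)]]
xbar = [F(41, 144), F(59, 144), F(11, 36)]

# The entries sum to exactly 1 ...
assert sum(xbar) == 1
# ... and P applied to xbar returns xbar, entry for entry.
Px = [sum(P[i][j] * xbar[j] for j in range(3)) for i in range(3)]
assert Px == xbar
print("steady state verified:", Px)
```

Because `Fraction` arithmetic is exact, the two assertions confirm the equality $P\overline{\mathbf{x}} = \overline{\mathbf{x}}$ with no floating-point tolerance at all.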
2.5.7 Absorbing Markov Chains¶
An absorbing state is a state that a member of the population remains in once it achieves that state. For example, once you have chicken pox, you are immune and don't get it again. Wait, that wasn't such a good example. In any case, the transition probability vector has a 1 in the entry for the absorbing state and zeros for all other entries. This occurs because there is no transition to any other state from the absorbing state.
Example 2¶
From Example 6 in our textbook we have the following state graph.
Notice that state $S_2$ has two directed edges. The in-arrow is a directed edge from $S_3$ to $S_2$ with weight $50\%$. The other directed edge is a self-loop back to $S_2$ with a probability of $100\% = 1.00$. Although many directed graphs have vertices like $S_2$ with self-loops, this state is distinguished by the fact that the only directed edge leaving the state is the self-loop.
Vertex or state $S_1$ has two directed edges. One is a self-loop with weight $40\%$, indicating that there is a $40\%$ chance that a member of the population will remain in state $S_1$, and one out-arrow to state $S_3$ with weight $60\%$, indicating that there is a $60\%$ chance that a member of the population will transition from state $S_1$ to state $S_3$.
State $S_3$ has three directed edges: one in-arrow with weight $60\%$, one self-loop with weight $50\%$, and one out-arrow indicating a transition from state $S_3$ to state $S_2$ with weight $50\%$.
When we create the Markov matrix $P$, we use each of the transition probabilities or weights on the directed edges to create the columns of our matrix:
$$ P = \begin{bmatrix} .40 & 0 & 0 \\ 0 & 1.00 & .50 \\ .60 & 0 & .50 \end{bmatrix} $$
A path of length $k$ in a graph from vertex $u$ to vertex $w$ is a sequence of $k+1$ vertices $p=\left\{\,u=v_0,\ v_1,\ \dots,\ v_k=w\,\right\}$ and $k$ directed edges so that there is an edge from each vertex to the next vertex in the sequence. That is, there is a directed edge from $u=v_0$ to $v_1$, from $v_1$ to $v_2$, $\dots$, and from $v_{k-1}$ to $v_k=w$.
For example, there is a path of length 1 from $S_1$ to $S_3$, $p = \left\{ S_1, S_3 \right\}$ utilizing the directed edge with weight $60\%$.
There is a path of length 2 from $S_1$ to $S_2$, $p = \left\{ S_1, S_3, S_2 \right\}$, utilizing the directed edge from $S_1$ to $S_3$ with weight $60\%$ followed by the directed edge from $S_3$ to $S_2$ with weight $50\%$. There is no shorter path from $S_1$ to $S_2$ because there is no directed edge in the graph from $S_1$ to $S_2$.
If there is a path $p$ from vertex $u$ to vertex $w$, we say that vertex $w$ is reachable from vertex $u$ via path $p$.
Theorem¶
If a Markov matrix has at least one absorbing state, then a member of the population in any state that can reach an absorbing state will, with probability 1, reach an absorbing state in a finite number of transitions.
The number of transitions is not the length of the path. The weights of the directed edges indicate that from state $S_1$ there is a $40\%$ chance of remaining in state $S_1$ during any transition, so the number of transitions needed to actually enter state $S_3$ is modeled by a geometric distribution. Since the probability of the transition is given by $p_{31} = .60$, the expected number of transitions necessary to achieve state $S_3$ from state $S_1$ is $E = \frac{1}{p_{31}} = \frac{1}{.6} = \frac{10}{6} = \frac{5}{3}$, or usually one or two transitions. However, this is only the expected number of transitions. You could be unlucky and spend the next 100 transitions in state $S_1$; it is just not very likely. This is where statistics meets linear algebra in thermodynamics, mechanics, physics, chemistry, economics, finance, signal processing, information theory, speech processing, and data science.
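The absorption just described can be checked numerically. Reading the transition probabilities off the state graph described above (with column $j$ giving the transitions out of state $S_j$, an assumption since the textbook figure is not reproduced here), repeated transitions drive the whole population into the absorbing state $S_2$. A sketch:

```python
# Transition matrix read off the state graph (columns S1, S2, S3).
P = [[0.40, 0.00, 0.00],   # S1 keeps 40% of its own population
     [0.00, 1.00, 0.50],   # S2 is absorbing: its column is [0, 1, 0]
     [0.60, 0.00, 0.50]]   # S3 receives 60% of S1 and keeps 50% of itself
x = [1.0, 0.0, 0.0]        # start the entire population in S1

# Apply many transitions; the mass in S1 and S3 decays geometrically.
for _ in range(200):
    x = [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]

print([round(v, 6) for v in x])  # ≈ [0.0, 1.0, 0.0]: everyone absorbed into S2
```

The $S_1$ population shrinks by a factor of $0.40$ each step and the $S_3$ population by roughly $0.50$, so after a couple hundred transitions essentially nothing remains outside $S_2$, consistent with the theorem.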
Your use of this self-initiated mediated course material is subject to our Creative Commons License 4.0