Math 511: Linear Algebra
2.6 Applications
2.6.1 Cryptography¶
$$ \require{color} \definecolor{brightblue}{rgb}{.267, .298, .812} \definecolor{darkblue}{rgb}{0.0, 0.0, 1.0} \definecolor{palepink}{rgb}{1, .73, .8} \definecolor{softmagenta}{rgb}{.99,.34,.86} \definecolor{blueviolet}{rgb}{.537,.192,.937} \definecolor{jonquil}{rgb}{.949,.792,.098} \definecolor{shockingpink}{rgb}{1, 0, .741} \definecolor{royalblue}{rgb}{0, .341, .914} \definecolor{alien}{rgb}{.529,.914,.067} \definecolor{crimson}{rgb}{1, .094, .271} \def\ihat{\mathbf{\hat{\unicode{x0131}}}} \def\jhat{\mathbf{\hat{\unicode{x0237}}}} \def\khat{\mathrm{\hat{k}}} \def\tombstone{\unicode{x220E}} \def\contradiction{\unicode{x2A33}} $$
Information security is vitally important to individuals, companies and governments. Malicious use of private information has become the third largest economy after the U.S. and China. The U.S. Cybersecurity & Infrastructure Security Agency
estimates in the 26-Oct-2020 report "Cost of a Cyber Incident: Systematic Review and Cross-Validation"
One of the primary means of protecting electronic information is encryption. Encryption is the transformation of plain text or readable information into a non-readable form or cypher text. A cryptogram or cypher text is a message derived from plain text information that has been encrypted (transformed) by using a mathematical algorithm. The plain text in this case is a computer readable version of our information, however the relationship between human characters and the computer characters is a part of a public specification. This is not considered encrypted so much as encoded. Encoded information is simply represented in a new form. We can think of this as translating our characters into a machine readable format.
The American Standard Code for Information Interchange or ASCII is a common machine readable form of English text used in most simple text documents. However ASCII is a one byte (8 bit) coding so the number of characters it may represent is very limited.
The Unicode Standard is a representation capable of expressing most of the current world's writing systems, emoji, and non-visual and formatting codes. Unicode is a 2 byte (16 bit) code so the number it is capable of expressing 65,535 individual characters.
In ASCII the uppercase and lowercase characters are encoded by
ASCII Code | Symbol | ASCII Code | Symbol | ASCII Code | Symbol | ASCII Code | Symbol |
---|---|---|---|---|---|---|---|
32 | space | ||||||
65 | A | 97 | a | 78 | N | 110 | n |
66 | B | 98 | b | 79 | O | 111 | o |
67 | C | 99 | c | 80 | P | 112 | p |
68 | D | 100 | d | 81 | Q | 113 | q |
69 | E | 101 | e | 82 | R | 114 | r |
70 | F | 102 | f | 83 | S | 115 | s |
71 | G | 103 | g | 84 | T | 116 | t |
72 | H | 104 | h | 85 | U | 117 | u |
73 | I | 105 | i | 86 | V | 118 | v |
74 | J | 106 | j | 87 | W | 119 | w |
75 | K | 107 | k | 88 | X | 120 | x |
76 | L | 108 | l | 89 | Y | 121 | y |
77 | M | 109 | m | 90 | Z | 122 | z |
2.6.2 Encrypting a Message¶
For example, to encode the plain text "Meet me Monday" in ASCII would be
M | e | e | t | m | e | M | o | n | d | a | y | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
77 | 101 | 101 | 116 | 32 | 109 | 101 | 32 | 77 | 111 | 110 | 100 | 97 | 121 |
We need groups of 8 characters so we might use
M | e | e | t | m | e | ||
---|---|---|---|---|---|---|---|
77 | 101 | 101 | 116 | 32 | 109 | 101 | 32 |
M | o | n | d | a | y | ||
---|---|---|---|---|---|---|---|
77 | 111 | 110 | 100 | 97 | 121 | 32 | 32 |
We can write this to a "text" file as a sequence of two $1\times 8$ matrices
$$
\begin{bmatrix} 077 & 101 & 101 & 116 & 032 & 109 & 101 & 032 \end{bmatrix}
$$
$$
\begin{bmatrix} 077 & 111 & 110 & 100 & 097 & 121 & 032 & 032 \end{bmatrix}
$$
The means we need an $8\times 8$ matrix to encode our plain text strings. You can use this GeoGebra applet to create your $8\times 8$ matrix. Each person who uses the applet will generate a different Encoding Matrix $A$. This matrix is called a one-time pad. A one time pad creates secure cypher text, however both you and your recipient must have the one-time pad to make it work. This can be problematic.
Then enter your two plain text message row matrices as follows:
Now you can encrypt your row matrices into cypher text using vector-matrix multiplication.
$$ \begin{align*} y_1 &= m_1A \\ y_2 &= m_2A \end{align*} $$
Now you can send $1\times 8$ matrices $y_1$ and $y_2$ to your recipient securely. To decrypt the matrices, your recipient multiplies both sides of the equation above by $A^{-1}$ to obtain the readable plain text,
$$
\begin{align*}
r1 &= y_1A^{-1} = m_1AA^{-1} = m_1 \\
r2 &= y_2A^{-1} = m_2AA^{-1} = m_2
\end{align*}
$$
2.6.3 Linear Regression¶
If one has a set of data points such as
$x_i$ | $y_i$ | |
---|---|---|
1 | 1 | |
2 | 2 | |
3 | 4 | |
4 | 4 | |
5 | 6 |
We can plot these points on the plane and attempt to find a line that is *closest* to these points in the __least squared sense__. In other words, so that the total sum of the distance from each point to the line is a small as possible. The points are __non co-linear__. Since the Euclidean distance formula includes a square root, we square the distance and minimize the sum of squares. Methods for computing the best possible line are discussed in differential calculus and statistics courses.
The methods used by calculus and statistics and cumbersome and do not shed any light on how the line is produced. We will study the geometry of computing the least squared line or regression line in section 5.4. However for now we should look at how to set up the problem and solve it using methods from this chapter. If we plot the points
The equation of our line in slope-intercept form is given by
$$ y = mx + b $$
This yields five equations and two unknowns, an over determined linear system.
$$ \begin{align*} 1m + b &= 1 \\ 2m + b &= 2 \\ 3m + b &= 4 \\ 4m + b &= 4 \\ 5m + b &= 6 \end{align*} $$
Indeed since we already know that the points are not co-linear we conclude that this over-determined linear system has no solution. We must determine a closest solution for the slope and intercept for the regression line. In matrix form we have
$$ \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \\ 4 & 1 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \\ 4 \\ 6 \end{bmatrix} $$
In section 5.4 we will discuss the geometry that results in our method of solution. We have the equation
$$ A\begin{bmatrix} m \\ b \end{bmatrix} = \mathbf{b} $$
We multiply both sides of the equation by $A^T$ to obtain the normal equation.
$$ \begin{align*} A\mathbf{x} &= \mathbf{b} \\ A^TA\mathbf{x} &= A^T\mathbf{b} \\ \\ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \\ 4 & 1 \\ 5 & 1 \end{bmatrix}\begin{bmatrix} m \\ b \end{bmatrix} &= \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 4 \\ 4 \\ 6 \end{bmatrix} \\ \\ \begin{bmatrix} 55 & 15 \\ 15 & 5 \end{bmatrix}\begin{bmatrix} m \\ b \end{bmatrix} &= \begin{bmatrix} 63 \\ 17 \end{bmatrix} \end{align*} $$
Now we can use Gauss-Jordan Elimination to reduce the augmented matrix.
$$ \begin{align*} \begin{bmatrix} 55 & 15 & | & 63 \\ 15 & 5 & | & 17 \end{bmatrix}\begin{array}{c} R_1-3R_2 \\ \ \end{array} &\rightarrow \begin{bmatrix} 10 & 0 & | & 12 \\ 15 & 5 & | & 17 \end{bmatrix}\begin{array}{c} \ \\ R_2 - R_1 \end{array} \rightarrow \begin{bmatrix} 10 & 0 & | & 12 \\ 5 & 5 & | & 5 \end{bmatrix}\begin{array}{c} \frac{R_1}{10} \\ \frac{R_2}{5} \end{array} \\ \\ &\rightarrow \begin{bmatrix} 1 & 0 & | & 1.2 \\ 1 & 1 & | & 1 \end{bmatrix}\begin{array}{c} \ \\ R_2 - R_1 \end{array}\rightarrow \begin{bmatrix} 1 & 0 & | & 1.2 \\ 0 & 1 & | & -0.2 \end{bmatrix} \end{align*} $$
Our regression line has the equation
$$ y = 1.2x - 0.2 $$
Your use of this self-initiated mediated course material is subject to our Creative Commons License 4.0