Math 451: Computational Mathematics

3.1 Numerical Data Types

3.1.1 Storing Numbers in a Digital Computer ¶

Modern digital computers store integers in base 2, or as a binary number . Digital computers have only two numerals to represent integers

$$ 0\text{ and } 1 $$
Counting the positive integers in binary looks like

$$ \begin{align*} 0 &= 0000\ 0000b\qquad &8 &= 0000\ 1000b \\ 1 &= 0000\ 0001b\qquad &9 &= 0000\ 1001b \\ 2 &= 0000\ 0010b\qquad &10 &= 0000\ 1010b \\ 3 &= 0000\ 0011b\qquad &11 &= 0000\ 1011b \\ 4 &= 0000\ 0100b\qquad &12 &= 0000\ 1100b \\ 5 &= 0000\ 0101b\qquad &13 &= 0000\ 1101b \\ 6 &= 0000\ 0110b\qquad &14 &= 0000\ 1110b \\ 7 &= 0000\ 0111b\qquad &15 &= 0000\ 1111b \\ \end{align*} $$

Binary is a place based numbering system just like familiar base 10. Given a base $\beta$, and digits $0,\ 1,\ ...\ \beta-1$ We can create a numbering system with that base

$$ \begin{align*} 0 &= 0 \\ 1 &= 1 \\ \ddots &= \ddots \\ \beta-1 &= \beta -1 \\ \beta &= 10 = 1\cdot\beta + 0 \\ \beta+1 &= 11 = 1\cdot\beta + 1 \\ \ddots &= \ddots \\ \beta^2 &= 100 = 1\cdot\beta^2 + 0\cdot\beta + 0 \\ \beta^2+1 &= 101 = 1\cdot\beta^2 + 0\cdot\beta + 1 \\ \ddots &= \ddots \end{align*} $$
For example in base 5, our number system looks like

$$ \begin{align*} 0 &= 0_5\qquad &5 &= 10_5\qquad &10 &= 20_5 = 2*5 + 0 \\ 1 &= 1_5\qquad &6 &= 11_5\qquad &11 &= 21_5 = 2*5 + 1 \\ 2 &= 2_5\qquad &7 &= 12_5\qquad &12 &= 22_5 = 2*5 + 2\\ 3 &= 3_5\qquad &8 &= 13_5\qquad &13 &= 23_5 = 2*5 + 3\\ 4 &= 4_5\qquad &9 &= 14_5\qquad &14 &= 24_5 = 2*5 + 4 \\ \end{align*} $$
Normally we do not decorate integers in base 10, only integers using another base.

Example 1 ¶

Represent the base 10 number 5280 in base 5.

To determine the number in base 5 we need to know how many

ones $=5^0$,
fives $=5^1$,
twenty fives $=5^2$,
one hundred, twenty fives $=5^3$,
$625$'s $=5^4$, and
$3125$'s $=5^5$ divide into 5280.

$$ \begin{align*} \dfrac{5280}{3125} &= \color{red}{1} \text{ remainder } \color{magenta}{2155} \\ \dfrac{\color{magenta}{2155}}{625} &= {\color{red}3} \text{ remainder } \color{magenta}{280} \\ \dfrac{\color{magenta}{280}}{125} &= \color{red}{2} \text{ remainder } \color{magenta}{30} \\ \dfrac{\color{magenta}{30}}{25} &= \color{red}{1} \text{ remainder } \color{magenta}{5} \\ \dfrac{\color{magenta}{5}}{5} &= \color{red}{1} \text{ remainder } \color{magenta}{0} \\ \dfrac{\color{magenta}{0}}{1} &= \color{red}{0} \\ \\ 5280 &= 132,110_5 \end{align*} $$

3.1.2 The Binary Numbering System ¶

Binary numbers use radix 2, and utilize only two digits, $0$ and $1$. It is also customary to follow the digits of a binary number with the letter 'b' rather than use the subscript $2$. For example

$$ 15 = 8 + 4 + 3 + 1 = 1\cdot2^3 + 1\cdot2^2 + 1\cdot2^1 + 1\cdot2^0 = 1111b $$
Binary numbering are easy numbers to create from the decimal equivalents, and easy to convert to base 10. However strings of binary numbers are usually tediously long. Computers don't mind.

Example 2 ¶

Represent the following numbers in binary

$$ \begin{align*} 13 &= \\ 27 &= \\ 88 &= \\ 1760 &= \\ \end{align*} $$
We need to divide by powers of 2 and note the quotients and remainders

$$ \begin{align*} 13 &= 8 + 5 = 8 + 4 + 1 = 8 + 4 + 0*2 + 1 = 1101b \\ \\ 27 &= 16 + 11 = 16 + 8 + 3 = 16 + 8 + 2 + 1 = 16 + 8 + 0*4 + 2 + 1 = 11011b \\ \\ 88 &= 64 + 24 = 64 + 16 + 8 = 64 + 0*32 + 16 + 8 + 0*4 + 0*2 + 0*1 = 1011000b \end{align*} $$

Example 3 ¶

Convert the following binary numbers into base 10 numbers

$$ \begin{align*} 10111b &= \\ 1010101b &= \\ 1101011b &= \end{align*} $$

$$ \begin{align*} 10111b &= 2^4 + 2^2 + 2 + 1 = 16 + 8 + 2 + 1 = 24 + 3 = 27 \\ \\ 1010101b &= 2^7 + 2^5 + 2^3 + 1 = 128 + 32 + 8 + 1 = 160 + 9 = 169 \\ \\ 1101011b &= 2^7 + 2^6 + 2^4 + 2 + 1 = 128 + 64 + 16 + 2 + 1 = 192 + 19 = 211 \end{align*} $$

3.1.3 Integer Data Types ¶

Integers are stored in digital computers in a variety of formats. This is due to the evolution of the integer data type over time, the needs of programmers over time, and growing capabilities of digital computers. The earliest digital electronic computers utilized groups of eight (8) binary digits called bits . Werner Buchholz coined the term byte for eight (8) contiguous binary digits in 1956. The byte is still the basic unit of digital information in most modern digital computers. The text you are reading on this page are represented in a byte format called ASCII . Your diary files are stored in ASCII format. However a single byte has sever limitations in modern computing. We can only store positive integers between

$$ \text{zero } = (0000\ 0000b)\text{ and }255 = 1111\ 1111b $$
Traditionally the binary digits in a byte are grouped into two 4 bit groups called a nibble . A nibble is important as we will see in the next section.

In order to efficiently store larger integers and negative integers computers were designed to utilize more bits.

bits	name	$\text{range}\ \ \ \qquad\qquad\qquad\qquad$
4	nibble	$0-15 = 2^4-1$
8	byte	$0-255 = 2^8-1$
16	word	$0-65,535 = 2^{16}-1$
32	double word (DWORD)	$0-4,294,967,295 = 2^{32}-1$
64	quad word (QWORD)	$0-18,446,744,073,709,551,615 = 2^{64}-1$

These are called unsigned integers because they utilize all bits for the magnitude of the integer. Signed integers use half of the bits to represent negative numbers. They are stored in an unusual format called two's complement representation.

3.1.4 One's Complement ¶

The most commonly used integer data types are signed integers. The default integer data type in MATLAB is a 32 bit signed integer. The default integer data type in MATLAB is 32 bits for backward compatibility with 32-bit personal computers. MATLAB was first released in 1984 when 16-bit personal computers were first coming to the market. The IBM-PC was released in 1981 and it utilized a 16-bit CPU. Although 32-bit processors were released by Intel in 1985, it would be 1987 before mainstream use of 32-bit operating systems became common. Many 64-bit architectures were conceived in the 1990s however they were not common in personal computers until 2003 . In spite of the shift to 64-bit personal computers in the past decade, many 32-bit systems are still available.

In order to store a signed integer we must determine a way to represent both positive and negative integers in the same 32 bits. We have a finite number of bits to use and these strings of binary digits encompass all of the numbers we can represent. These are our numerals .

$$ \begin{align*} 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000 &\ \\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001 &\ \\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0010 &\ \\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0011 &\ \\ \ddots \\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ &\ \end{align*} $$
Now we must decide what (if any) numbers these numerals represent. The mapping or bijection from this set of binary numerals to the numbers or objects they represent is called a binary representation . Creating a data structure boils down to deciding on

a binary representation of numbers or objects

a set operations we can perform on the numbers or objects

implementing the operations utilizing the representation

Traditionally we break the set of binary numerals in half and use half of them to represent negative integers and half of them to represent non-negative integers. In binary it is easy to spot the half-way mark in this collection of binary strings.

$$ \begin{align*} 1000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ &\ \end{align*} $$
At this point we need to introduce an new operation, two's complement and one's complement. The one's complement of a binary string is the binary string that reverses the digit or logic of each bit in the string. Ones and zeros can represent true and false as well as one and zero. The logical complement of true is false, and the logical complement of false is true. Let us consider the one' complement of a nibble (4 bits).

$$ \begin{align*} 0000 &\rightarrow 1111 \qquad\qquad &1000 &\rightarrow 0111 \\ 0001 &\rightarrow 1110 \qquad\qquad &1001 &\rightarrow 0110 \\ 0010 &\rightarrow 1101 \qquad\qquad &1010 &\rightarrow 0101 \\ 0011 &\rightarrow 1100 \qquad\qquad &1011 &\rightarrow 0100 \\ 0100 &\rightarrow 1011 \qquad\qquad &1100 &\rightarrow 0011 \\ 0101 &\rightarrow 1010 \qquad\qquad &1101 &\rightarrow 0010 \\ 0110 &\rightarrow 1001 \qquad\qquad &1110 &\rightarrow 0001 \\ 0111 &\rightarrow 1000 \qquad\qquad &1111 &\rightarrow 0000 \\ \end{align*} $$
Notice that the complement of the complement is the original binary string. The symmetry lead to computer architectures that used one's complement binary strings to represent negative numbers. However we will see that there are advantages to using two's complement instead.

3.1.5 Two's Complement ¶

The two's complement of a binary string is the one's complement of the string plus 1. So the two's complement of $0000$ is computed

$$ \begin{array}{ccc} 0000 & \rightarrow & 1111 \\ & + & 0001 \\ & & \rule[2pt]{2em}{.5pt} \\ & & 0000 \end{array} $$
This results because when we add the one's digits we obtain $1 + 1 = 10$, so we write down our zero and carry the $1$. Now in the tens columns we have $1 + 1 + 0 = 10$, so we write down our zero and carry the $1$. Eventually we might think we have

$$ 1\ 0000 $$

however our nibble only has 4 bits to work with to the last carry bit is lost to the binary string resulting in $0000$. Let us compute the two's complement of $0001$. This results in

$$ \begin{array}{ccc} 0001 & \rightarrow & 1110 \\ & + & 0001 \\ & & \rule[2pt]{2em}{.5pt} \\ & & 1111 \end{array} $$

The two's complements of our nibbles are

$$ \begin{align*} 0000 &\rightarrow 0000 \qquad\qquad &1000 &\rightarrow 1000 \\ 0001 &\rightarrow 1111 \qquad\qquad &1001 &\rightarrow 0111 \\ 0010 &\rightarrow 1110 \qquad\qquad &1010 &\rightarrow 0110 \\ 0011 &\rightarrow 1101 \qquad\qquad &1011 &\rightarrow 0101 \\ 0100 &\rightarrow 1100 \qquad\qquad &1100 &\rightarrow 0100 \\ 0101 &\rightarrow 1011 \qquad\qquad &1101 &\rightarrow 0011 \\ 0110 &\rightarrow 1010 \qquad\qquad &1110 &\rightarrow 0010 \\ 0111 &\rightarrow 1001 \qquad\qquad &1111 &\rightarrow 0001 \\ \end{align*} $$

Notice the same symmetry, the two's complement of the two's complement is the original binary string. Moreover the two's complement of $0000$ is $0000$. Moreover if we add $0001$ to $1111$ we obtain

$$ \begin{array}{ccc} & & 1111 \\ & + & 0001 \\ & & \rule[2pt]{2em}{.5pt} \\ & & 0000 \end{array} $$
Just as in the last example we add $1 + 1 = 10$ four times with three carry's. However the last carry bit is lost as we only have one nibble. Since $1111\ +\ 0001\ =\ 0000$, we represent the number $-1$ with the numeral $1111$. In 32 bits we have the two's complement of 1 given by

$$ \begin{array}{ccc} 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001 & \rightarrow & 1111\ 1111\ 1111\ 1111\ 1111 1111\ 1111\ 1110 \\ & + & 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001 \\ & & \rule[2pt]{16em}{.5pt} \\ & & 1111\ 1111\ 1111\ 1111\ 1111 1111\ 1111\ 1111 \end{array} $$

Thus we have

$$ \begin{align*} 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001 &= 1 \\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000 &= 0 \\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111 &= -1 \end{align*} $$

We call this representation of negative integers using the two's complement of the positive integer sign extension . Signed integers will have the same range of possible numbers that they can represent, however half of these numbers are negative and half are non-negative.

bits	name	$\text{range}\ \ \ \qquad\qquad\qquad\qquad$
8	byte	$-128 \text{ to } 127 = -2^7 \text{ to } 2^7-1$
16	word	$-32,768 \text{ to } 32,767 = 2^{15} \text{ to } 2^{15}-1$
32	double word (DWORD)	$−2,147,483,648 \text{ to } 2,147,483,647, = -2^{31} \text{ to } 2^{31}-1$
64	quad word (QWORD)	$−9,223,372,036,854,775,808 \text{ to } 9,223,372,036,854,775,807 = 2^{63} \text{ to } 2^{63}-1$

It is important to note that only a finite set of numbers can be represented. There are limitations to what you can compute in MATLAB. There exist techniques for overcoming the limitations of these integer data types. Some of them are available in the MATLAB environment. These techniques have their own limitations. We want to build upon our knowledge until we understand these techniques and evaluate their properties.

3.1.6 Adding and Subtracting Signed Integers ¶

The purpose of two's complement sign extension is to allow us to add and subtract integers within the framework of the integer data type we choose.

Example 4 ¶

Compute the result of each expression in 8-bit two's complement binary instead of base 10.

9 + 84
-81 + 72
92 - 59
-69 + 31
-3 + 87

We must first convert each term to binary so that we can add them. Our first step is to convert the absolute value of each term to 8-bit binary.

$$ \begin{align*} 9 &= 0000\ 1001_2 \qquad\qquad &84 &= 0101\ 0100_2 \\ 81 &= 0101\ 0001_2 \qquad\qquad &72 &= 0100\ 1000_2 \\ 92 &= 0101\ 1100_2 \qquad\qquad &59 &= 0011\ 1011_2 \\ 69 &= 0100\ 0101_2 \qquad\qquad &31 &= 0001\ 1111_2 \\ 3 &= 0000\ 0011_2 \qquad\qquad &87 &= 0101\ 0111_2 \end{align*} $$
Some of these terms are negative and must be converted using two's complement.

$$ \begin{align*} -81 &= ~(0101\ 0001_2) + (0000\ 0001_2) = (1010\ 1110_2) + (0000\ 0001_2) = 1010\ 1111_2 \\ -69 &= ~(0100\ 0101_2) + (0000\ 0001_2) = (1011\ 1010_2) + (0000\ 0001_2) = 1011\ 1011_2 \\ \ -3 &= ~(0000\ 0011_2) + (0000\ 0001_2) = (1111\ 1100_2) + (0000\ 0001_2) = 1111\ 1101_2 \\ -59 &= ~(0011\ 1011_2) + (0000\ 0001_2) = (1100\ 0100_2) + (0000\ 0001_2) = 1100\ 0101_2 \end{align*} $$
Now we can add our terms.

$$ \begin{array}{lcccl} 1.& 9 + 84 & = & & 0000\ 1001_2 \\ & & & + & 0101\ 0100_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 0101\ 1101_2 \\ & & = & & 64 + 16 + 8 + 4 + 1 = 80 + 13 = 93\ \Large{\color{green}\checkmark} \\ \\ 2.& -81 + 72 & = & & 1010\ 1111_2 \\ & & & + & 0100\ 1000_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 1111\ 0111_2 \\ & & & & \text{compute two's complement} \\ & & & & 0000\ 1000_2 \\ & & & + & 0000\ 0001_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 0000\ 1001_2 \\ & & = & & -(8 + 1) = -9\ \Large{\color{green}\checkmark} \\ \\ 3.& 92 - 59 & = & & 0101\ 1100_2 \\ & & & + & 1100\ 0101_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 0010\ 0001_2 \\ & & = & & 32 + 1 = 33\ \Large{\color{green}\checkmark} \\ \\ 4.& -69 + 31 & = & & 1011\ 1011_2 \\ & & & + & 0001\ 1111_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 1101\ 1010_2 \\ & & = & & \text{compute two's complement} \\ & & & & 0010\ 0101_2 \\ & & & + & 0000\ 0001_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 0010\ 0110_2 \\ & & & & -(32 + 4 + 2) = -38\ \Large{\color{green}\checkmark} \\ \\ 5.& -3 + 87 & = & & 1111\ 1101_2 \\ & & & + & 0101\ 0111_2 \\ & & & & \rule[2pt]{4em}{.5pt} \\ & & & & 0101\ 0100_2 \\ & & = & & 64 + 16 + 4 = 64 + 20 = 84\ \Large{\color{green}\checkmark} \end{array} $$

3.1.7 Hexadecimal ¶

Although computers represent all information using binary, humans find long binary strings very cumbersome. In most cases we group the binary string in nibbles. Recall that a nibble can have values from $0 = 0000b$ to $15 = 1111b$. We discussed base 5 and base 2. Now let us discuss base 16, or hexadecimal. In base 16 we will need 16 digits

$$ 0,\ 1,\ 2,\ 3,\ 4,\ 5,\ 6,\ 7,\ 8,\ 9,\ \text{a},\ \text{b},\ \text{c},\ \text{d},\ \text{e},\ \text{f} $$
Notice we use the first six letters of the alphabet to represent the base 10 numbers

$$ \text{a}_{16} = 10,\ \text{b}_{16} = 11,\ \text{c}_{16} = 12,\ \text{d}_{16} = 13,\ \text{e}_{16} = 14,\ \text{f}_{16} = 15 $$

Example 5 ¶

Represent the following numbers in hexadecimal

$$ \begin{align*} 13 &= \\ 27 &= \\ 88 &= \\ 1760 &= \\ \end{align*} $$
The process mirrors the process we used for base 5. Hexadecimal numbers are indicated by using the base subscript like all other numbers to a nonstandard base, or the adding the prefix $0x$ to the number instead of the subscript.

$$ \begin{align*} 13 &= 0x\text{d} = \text{d}_{16} \\ 27 &= 16 + 11 = 0x1\text{b} = 1\text{b}_{16} \\ 88 &= 5\cdot16 + 8 = 0x58 = 58_{16} \\ 1760 &= 1536 + 224 = 5\cdot256 + 16\cdot14 = 0x5\text{e} = 5\text{E}_{16} \\ \end{align*} $$
Capital letters or lower case letter a-e may be used in hexadecimal notation.

Example 6 ¶

Convert the following binary numbers to hexadecimal

$$ \begin{align*} 10111b &= \\ 1010101b &= \\ 1101011b &= \end{align*} $$
We must group each binary string into groups of four binary digits or a nibble.

$$ \begin{align*} 10111b &= 0001\ 0111b = 0x17 = 17_{16} \\ 1010101b &= 0101\ 0101b = 0x55 = 55_{16} \\ 1101011b &= 0110\ 1011b = 0x6\text{b} = 6\text{b}_{16} \end{align*} $$

Your use of this self-initiated mediated course material is subject to our Creative Commons License .

Creative Commons Attribution-NonCommercial-ShareAlike 4.0

Creative Commons Logo - Black
Attribution
You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

Creative Commons Logo - Black
Noncommercial
You may not use the material for commercial purposes.

Share Alike
You are free to share, copy and redistribute the material in any medium or format. If you adapt, remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.