Array Programming
consists of applying a mathematical operation to each of the elements of an entire array of data. Matrix addition is an example of array programming since the result of adding two matrices of the same shape is a matrix with the same shape, and each element of the resulting matrix is the sum of the
corresponding
elements of the terms. For example
$$
\begin{bmatrix}
49 & 48 & \ 8 & 40 \\
\ 8 & 25 & 22 & 48 \\
39 & 41 & 46 & 33 \end{bmatrix} +
\begin{bmatrix}
\ 2 & 34 & 20 & 36 \\
43 & 38 & 33 & 2 \\
47 & 38 & 9 & 14 \end{bmatrix} =
\begin{bmatrix}
51 & 82 & 28 & 76 \\
51 & 63 & 55 & 50 \\
86 & 79 & 55 & 47 \end{bmatrix}
$$
For each element of the first matrix we add the corresponding element of the second matrix to obtain the corresponding sum in the resulting matrix. If we call the first matrix $A$, the second $B$, and the third $C$ we have
$$
A + B = C
$$
We can denote this idea simply. For two matrices with the same shape.
Definition 3.4.1 ¶
Matrix Addition
Given $A, B\in\mathbb{R}^{m\times n}$, the sum of two matrices in $\mathbb{R}^{m\times n}$ is defined by
$$ A + B = \begin{bmatrix} a_{ij} \end{bmatrix} + \begin{bmatrix} b_{ij} \end{bmatrix} = \begin{bmatrix} a_{ij}+b_{ij} \end{bmatrix} = \begin{bmatrix} c_{ij} \end{bmatrix} = C $$
for every $1\le i\le m$, and $1\le j\le n$.
This works for vectors in a computer because a vector is just an $n\times 1$ matrix. This also works for arrays. An $n$ dimensional array requires $n$ subscripts to identify every element of the array. However the array is stored column-wise
A = randi([0 9], 5, 4, 3)
B = randi([-9 9], 5, 4, 3)
C = A + B
In this case we must rely on the fact that any array can be accessed by a single index, if we realize the order in which the array is stored
T = A(:);
for k=0:9
fprintf('%d,\t ', T(k+1:10:end))
fprintf('\n')
end
In this case, the sum of two arrays can be written in MATLAB pseudo-code
$$
A + B = [ a_k ] + [ b_k ] = [ a_k + b_k ] = [ c_k ] = C
$$
for every element of matrices $A$ and $B$ for $k$ from $1$ to
numel(A)
, where
numel
is a built-in MATLAB function that returns the total number of elements of the input array.
In array process languages, most binary and unary mathematical operations may be performed in just this way. That is so that the mathematical operation(s) are performed on each individual element of the array(s). These languages include APL, J, Fortran 90, MATLAB, Analytica, TK Solver (as lists), Octave, R, Cilk Plus, Julia, Perl Data Language (PDL) to name a few.
In MATLAB we
decorate
a mathematical operator with the
.
, dot, prefix to indicate a array operation. This is done because modern computer processors are capable of executing mathematical operation on arrays of data at the
hardware level
. This greatly improves execution time! The improvement in execution time results from the hardware iterating through the addresses of the array operands and returning control back to your program only after the entire
array
of computations is completed.
In MATLAB these array operations are called
element-wise operations
. For example if $A$ and $B$ are both $n\times n$ matrices, then
$$
A*B
$$
indicates matrix-matrix multiplication. That means for every column of matrix $B$ we have
$$
A*B = A*\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \dots & \mathbf{b}_n \end{bmatrix} =
\begin{bmatrix} A*\mathbf{b}_1 & A*\mathbf{b}_2 & \dots & A*\mathbf{b}_n \end{bmatrix}
$$
However
element-wise multiplication
is defined
$$
A.*B = \begin{bmatrix} a_j \end{bmatrix} .* \begin{bmatrix} b_j \end{bmatrix} = \begin{bmatrix} a_jb_j \end{bmatrix}
$$
for $1\le j\le\ $
numel
$(A)$.
A.*B
We call replacing loops with array processing
vectorization
. This is due to the fact that a
for
loop
iterates
or executes a code block for each value in a vector or list of possible values for the loop variable. When we write
for i=1:10
% scalar processing the ith element
end
the action performed by the MATLAB interpreter is to create a vector of values
>> list = 1:10
ans =
1
2
3
4
5
6
7
8
9
10
>>
Then assign a
loop variable
i
to the first value in the list
1
, and execute the code block with
i = 1
. When the end of the code block is reached, assign the loop variable the next value in the list and execute the code bock again. When the end of the list is reached, exit the for loop structure.
Assigning the loop variable to a value in the list, and executing the code block with the loop variable assigned to a value in the list is called an iteration of the loop.
Let us consider our homework assignment for computing the value of the $n$th harmonic number.
n = 57;
harmonicnumber = 0;
for k=1:n
harmonicnumber = harmonicnumber + (1/k);
end
fprintf('\nThe %d th harmonic number is %g\n', n, harmonicnumber)
If we consider this algorithm carefully, we see that the sum we are trying to create has a simple pattern to it
$$
\begin{align*}
\text{harmonic}(n) &= \displaystyle\sum_{k=1}^n \dfrac{1}{k} \\
&= 1 + \dfrac{1}{2} + \dfrac{1}{3} + \dfrac{1}{4} + \dots + \dfrac{1}{n}
\end{align*}
$$
We can easily create the vector from 1 to n!
> disp(1:n)
1 2 3 4 5 6 7 8 9 ... n
Can we use element-wise division to create a
vector
$$
\dfrac{1}{1}\quad \dfrac{1}{2}\quad \dfrac{1}{3}\quad \dfrac{1}{4}\quad \dfrac{1}{5}\quad \dots\quad \dfrac{1}{n}
$$
format rat
n = 6
1:n
1./(1:n)
sum(1./(1:n))
n = 57
harmonicnumber = sum(1./(1:n));
fprintf('\nThe %d th harmonic number is %g\n', n, harmonicnumber)
Replacing a for loop in a mathematical algorithm with array processing is called vectorizing the algorithm.
Consider the following algorithm for computing the value of $\pi$ using the terms of the Taylor Series
$$
\pi = \displaystyle\sum_{n=0}^{\infty} (-1)^n\dfrac{4}{2n+1}
$$
The algorithm utilizes a for loop.
n = 57;
y = 1; % the first term of the series is (-1)^0*4/1 = 1
% for a zero based sum, n terms is 0 through n-1
for k=1:n-1
y = y + ((-1)^k)./(2*k+1);
end
y = y*4;
end
format short
tic
abs(floatpi(2^20)-pi)
elapsed = toc
In order to get an accuracy of $10^{-9}$, we required $2^{30}$ iterations through our loop and 32.19 seconds to compute our approximation. Can we vectorize this loop?
We need to create the a vector of the first
n
odd numbers $1$, $3$, $5$, $\dots$, $2n-1$. Now we need the first
n
fractions $\dfrac{4}{2n+1}$,
$$
1,\ \frac{4}{3},\ \dfrac{4}{5},\ \dfrac{4}{7},\ \dots,\ \dfrac{4}{2n+1}
$$
We also need these terms to alternate in sign
$$
1\ -\ \frac{4}{3}\ +\ \dfrac{4}{5}\ -\ \dfrac{4}{7}\ +\ \dots,\ +\ (-1)^n\dfrac{4}{2n+1}
$$
format rat
n = 6
1:2:2*n-1
4./(1:2:2*n-1)
(-1).^(0:n-1).*4./(1:2:2*n-1)
format short
sum((-1).^(0:n-1).*4./(1:2:2*n-1))
fprintf("\nVectorized version:\n")
tic
abs(vectorpi(2^20) - pi)
elapsed = toc
$.2784 / .0322 = 8.6460$ so the vectorized version is over 8 times faster than the for loop!
One must be careful however. If you call
vectorpi
and ask for $2^{30}$ terms, the algorithm will attempt to create two vector of length $2^{30}$ and multiply them element-wise them and sum them up. This will require $8\cdot2^{30} = 2^{24} = 16777216$
gigabytes
of ram! By comparison, a vector of length $2^{20}$ only require 16 gigabytes of ram so my computer can handle it.
Clearly when choosing to vectorize your algorithms you must carefully consider runtime versus memory. This question of resources will be a significant consideration when designing your algorithms.
Creative Commons Attribution-NonCommercial-ShareAlike 4.0
Attribution
You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
Noncommercial
You may not use the material for commercial purposes.
Share Alike
You are free to share, copy and redistribute the material in any medium or format. If you adapt, remix, transform, or build upon the material, you must distribute your contributions under the
same license
as the original.