Math 451: Computational Mathematics

3.4 Finding Zeros of a Function

3.4.1 Array Programming ¶

Array Programming consists of applying a mathematical operation to each of the elements of an entire array of data. Matrix addition is an example of array programming since the result of adding two matrices of the same shape is a matrix with the same shape, and each element of the resulting matrix is the sum of the corresponding elements of the terms. For example

$$ \begin{bmatrix} 49 & 48 & \ 8 & 40 \\ \ 8 & 25 & 22 & 48 \\ 39 & 41 & 46 & 33 \end{bmatrix} + \begin{bmatrix} \ 2 & 34 & 20 & 36 \\ 43 & 38 & 33 & 2 \\ 47 & 38 & 9 & 14 \end{bmatrix} = \begin{bmatrix} 51 & 82 & 28 & 76 \\ 51 & 63 & 55 & 50 \\ 86 & 79 & 55 & 47 \end{bmatrix} $$
For each element of the first matrix we add the corresponding element of the second matrix to obtain the corresponding sum in the resulting matrix. If we call the first matrix $A$, the second $B$, and the third $C$ we have

$$ A + B = C $$
We can denote this idea simply. For two matrices with the same shape.

Definition 3.4.1 ¶

Matrix Addition

Given $A, B\in\mathbb{R}^{m\times n}$, the sum of two matrices in $\mathbb{R}^{m\times n}$ is defined by

$$ A + B = \begin{bmatrix} a_{ij} \end{bmatrix} + \begin{bmatrix} b_{ij} \end{bmatrix} = \begin{bmatrix} a_{ij}+b_{ij} \end{bmatrix} = \begin{bmatrix} c_{ij} \end{bmatrix} = C $$
for every $1\le i\le m$, and $1\le j\le n$.

This works for vectors in a computer because a vector is just an $n\times 1$ matrix. This also works for arrays. An $n$ dimensional array requires $n$ subscripts to identify every element of the array. However the array is stored column-wise

A = randi([0 9], 5, 4, 3)
B = randi([-9 9], 5, 4, 3)
C = A + B

A(:,:,1) =

       0              7              9              6       
       2              4              5              6       
       8              5              5              3       
       0              2              2              3       
       9              4              4              9       


A(:,:,2) =

       0              2              1              9       
       8              3              6              8       
       9              6              4              3       
       7              1              7              6       
       0              7              7              1       


A(:,:,3) =

       0              6              1              1       
       7              6              2              9       
       5              8              8              7       
       4              8              0              5       
       9              5              4              4       


B(:,:,1) =

      -8             -8              3             -1       
       3              6              0             -1       
      -9              6              9              6       
      -8              4              3             -8       
       0             -7              6             -7       


B(:,:,2) =

      -6             -2             -4             -7       
      -2              1             -1             -2       
       6             -2             -9             -6       
       6              3              9              0       
      -8              2             -6             -3       


B(:,:,3) =

       9             -1             -4              3       
       8              1              4             -6       
      -8              8              3             -7       
       5             -2              1              9       
      -4              9              4             -6       


C(:,:,1) =

      -8             -1             12              5       
       5             10              5              5       
      -1             11             14              9       
      -8              6              5             -5       
       9             -3             10              2       


C(:,:,2) =

      -6              0             -3              2       
       6              4              5              6       
      15              4             -5             -3       
      13              4             16              6       
      -8              9              1             -2       


C(:,:,3) =

       9              5             -3              4       
      15              7              6              3       
      -3             16             11              0       
       9              6              1             14       
       5             14              8             -2

In this case we must rely on the fact that any array can be accessed by a single index, if we realize the order in which the array is stored

T = A(:);
for k=0:9
    fprintf('%d,\t ', T(k+1:10:end))
    fprintf('\n')
end

0,	 9,	 0,	 1,	 0,	 1,	 
2,	 5,	 8,	 6,	 7,	 2,	 
8,	 5,	 9,	 4,	 5,	 8,	 
0,	 2,	 7,	 7,	 4,	 0,	 
9,	 4,	 0,	 7,	 9,	 4,	 
7,	 6,	 2,	 9,	 6,	 1,	 
4,	 6,	 3,	 8,	 6,	 9,	 
5,	 3,	 6,	 3,	 8,	 7,	 
2,	 3,	 1,	 6,	 8,	 5,	 
4,	 9,	 7,	 1,	 5,	 4,

In this case, the sum of two arrays can be written in MATLAB pseudo-code

$$ A + B = [ a_k ] + [ b_k ] = [ a_k + b_k ] = [ c_k ] = C $$
for every element of matrices $A$ and $B$ for $k$ from $1$ to numel(A) , where numel is a built-in MATLAB function that returns the total number of elements of the input array.

In array process languages, most binary and unary mathematical operations may be performed in just this way. That is so that the mathematical operation(s) are performed on each individual element of the array(s). These languages include APL, J, Fortran 90, MATLAB, Analytica, TK Solver (as lists), Octave, R, Cilk Plus, Julia, Perl Data Language (PDL) to name a few.

In MATLAB we decorate a mathematical operator with the . , dot, prefix to indicate a array operation. This is done because modern computer processors are capable of executing mathematical operation on arrays of data at the hardware level . This greatly improves execution time! The improvement in execution time results from the hardware iterating through the addresses of the array operands and returning control back to your program only after the entire array of computations is completed.

In MATLAB these array operations are called element-wise operations . For example if $A$ and $B$ are both $n\times n$ matrices, then

$$ A*B $$
indicates matrix-matrix multiplication. That means for every column of matrix $B$ we have

$$ A*B = A*\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \dots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A*\mathbf{b}_1 & A*\mathbf{b}_2 & \dots & A*\mathbf{b}_n \end{bmatrix} $$
However element-wise multiplication is defined

$$ A.*B = \begin{bmatrix} a_j \end{bmatrix} .* \begin{bmatrix} b_j \end{bmatrix} = \begin{bmatrix} a_jb_j \end{bmatrix} $$
for $1\le j\le\ $ numel $(A)$.

A.*B

ans(:,:,1) =

       0            -56             27             -6       
       6             24              0             -6       
     -72             30             45             18       
       0              8              6            -24       
       0            -28             24            -63       


ans(:,:,2) =

       0             -4             -4            -63       
     -16              3             -6            -16       
      54            -12            -36            -18       
      42              3             63              0       
       0             14            -42             -3       


ans(:,:,3) =

       0             -6             -4              3       
      56              6              8            -54       
     -40             64             24            -49       
      20            -16              0             45       
     -36             45             16            -24

3.4.2 Vectorization ¶

We call replacing loops with array processing vectorization . This is due to the fact that a for loop iterates or executes a code block for each value in a vector or list of possible values for the loop variable. When we write

for i=1:10
    % scalar processing the ith element
end

the action performed by the MATLAB interpreter is to create a vector of values

>> list = 1:10

ans =

       1       
       2       
       3       
       4       
       5       
       6       
       7       
       8       
       9       
      10       

>>

Then assign a loop variable i to the first value in the list 1 , and execute the code block with i = 1 . When the end of the code block is reached, assign the loop variable the next value in the list and execute the code bock again. When the end of the list is reached, exit the for loop structure.

Assigning the loop variable to a value in the list, and executing the code block with the loop variable assigned to a value in the list is called an iteration of the loop.

Let us consider our homework assignment for computing the value of the $n$th harmonic number.

n = 57;
harmonicnumber = 0;
for k=1:n
    harmonicnumber = harmonicnumber + (1/k);
end
fprintf('\nThe %d th harmonic number is %g\n', n, harmonicnumber)

The 57 th harmonic number is 4.62901

If we consider this algorithm carefully, we see that the sum we are trying to create has a simple pattern to it

$$ \begin{align*} \text{harmonic}(n) &= \displaystyle\sum_{k=1}^n \dfrac{1}{k} \\ &= 1 + \dfrac{1}{2} + \dfrac{1}{3} + \dfrac{1}{4} + \dots + \dfrac{1}{n} \end{align*} $$
We can easily create the vector from 1 to n!

> disp(1:n)

   1   2   3   4   5   6   7   8   9  ...   n

Can we use element-wise division to create a vector

$$ \dfrac{1}{1}\quad \dfrac{1}{2}\quad \dfrac{1}{3}\quad \dfrac{1}{4}\quad \dfrac{1}{5}\quad \dots\quad \dfrac{1}{n} $$

format rat
n = 6
1:n
1./(1:n)
sum(1./(1:n))

n = 57
harmonicnumber = sum(1./(1:n));
fprintf('\nThe %d th harmonic number is %g\n', n, harmonicnumber)

n =

       6       


ans =

       1              2              3              4              5              6       


ans =

       1              1/2            1/3            1/4            1/5            1/6     


ans =

      49/20    


n =

      57       


The 57 th harmonic number is 4.62901

Replacing a for loop in a mathematical algorithm with array processing is called vectorizing the algorithm.

3.4.3 Vectorizing floatpi.m ¶

Consider the following algorithm for computing the value of $\pi$ using the terms of the Taylor Series

$$ \pi = \displaystyle\sum_{n=0}^{\infty} (-1)^n\dfrac{4}{2n+1} $$
The algorithm utilizes a for loop.

n = 57;
    y = 1;           % the first term of the series is (-1)^0*4/1 = 1

        % for a zero based sum, n terms is 0 through n-1
        for k=1:n-1
            y = y + ((-1)^k)./(2*k+1);
        end

        y = y*4;
    end

format short
tic
abs(floatpi(2^20)-pi)
elapsed = toc

ans =

   9.5367e-07


elapsed =

    0.2784

In order to get an accuracy of $10^{-9}$, we required $2^{30}$ iterations through our loop and 32.19 seconds to compute our approximation. Can we vectorize this loop?

We need to create the a vector of the first n odd numbers $1$, $3$, $5$, $\dots$, $2n-1$. Now we need the first n fractions $\dfrac{4}{2n+1}$,

$$ 1,\ \frac{4}{3},\ \dfrac{4}{5},\ \dfrac{4}{7},\ \dots,\ \dfrac{4}{2n+1} $$
We also need these terms to alternate in sign

$$ 1\ -\ \frac{4}{3}\ +\ \dfrac{4}{5}\ -\ \dfrac{4}{7}\ +\ \dots,\ +\ (-1)^n\dfrac{4}{2n+1} $$

format rat
n = 6
1:2:2*n-1
4./(1:2:2*n-1)
(-1).^(0:n-1).*4./(1:2:2*n-1)
format short
sum((-1).^(0:n-1).*4./(1:2:2*n-1))

fprintf("\nVectorized version:\n")
tic
abs(vectorpi(2^20) - pi)
elapsed = toc

n =

       6       


ans =

       1              3              5              7              9             11       


ans =

       4              4/3            4/5            4/7            4/9            4/11    


ans =

       4             -4/3            4/5           -4/7            4/9           -4/11    


ans =

    2.9760


Vectorized version:

ans =

   9.5367e-07


elapsed =

    0.0322

$.2784 / .0322 = 8.6460$ so the vectorized version is over 8 times faster than the for loop!

One must be careful however. If you call vectorpi and ask for $2^{30}$ terms, the algorithm will attempt to create two vector of length $2^{30}$ and multiply them element-wise them and sum them up. This will require $8\cdot2^{30} = 2^{24} = 16777216$ gigabytes of ram! By comparison, a vector of length $2^{20}$ only require 16 gigabytes of ram so my computer can handle it.

Clearly when choosing to vectorize your algorithms you must carefully consider runtime versus memory. This question of resources will be a significant consideration when designing your algorithms.

Your use of this self-initiated mediated course material is subject to our Creative Commons License .

Creative Commons Attribution-NonCommercial-ShareAlike 4.0

Creative Commons Logo - Black
Attribution
You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

Creative Commons Logo - Black
Noncommercial
You may not use the material for commercial purposes.

Share Alike
You are free to share, copy and redistribute the material in any medium or format. If you adapt, remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.