Sometimes It Pays to Give Your Professor a Little Attention

I was forced to broaden my horizons in my PhD pursuit.  That’s because, although I’ve done coding since I was eighteen, I had to acquire a deeper understanding of two things: linear algebra and numerical methods.  It’s no exaggeration to say that both of these are at the core of the advances wrought by computerisation, whether we’re talking about statistical analysis or (in my case) simulation.

After my initial boffo performance, I turned to my Iranian friends for more help.  So they let me use some of the books they had found useful for study back in the “old country”.  One of those was a sizeable book entitled Applied Numerical Methods by Brice Carnahan, H.A. Luther and James O. Wilkes.  As was the case with their wedding video, my heart skipped a beat, because the middle author, Hubert A. Luther, was my Differential Equations teacher at Texas A&M, forty years ago this spring.

Applied Numerical Methods was, AFAIK, the first really comprehensive textbook to combine linear algebra, numerical methods, and coding (in their case, FORTRAN IV) in one text.  Although some of the methodologies have been improved since it was published in 1969, and languages have certainly changed, it’s still a very useful book, if a little dense in spots.  Many of the books on the subject that have come afterwards have learned from its mistakes, but they still refer back to the original.

Dr. Luther taught me the last required math class in my pursuit of an engineering degree at Texas A&M.  It wasn’t an easy class, even after three semesters of calculus (which I did reasonably well at).  Although he was originally from Pennsylvania, he acclimated himself to the Lone Star State with a western shirt, belt and string tie; he was the only professor I can remember who did so.  The start of his course was especially rough: the textbook was terrible, he was a picky grader, and the scores I got back were low.  I thought I was facing the abyss…until another one of those “aha” moments came along.

We (the engineering students) were standing outside our Modern Physics class, which came before Differential Equations.  I found out I wasn’t the only one having this problem.  But one of my colleagues, a Nuclear Engineering student who went on to become my class’ wealthiest member, had a simple suggestion.  Go visit his office, he said.  He’s lonely (he was nearing retirement) and likes the company.  Your grade will go up.

I wasn’t much for visiting my professors, but I was desperate enough to try anything.  I made a couple of office visits.  I’m not sure how helpful his advice was, but his grading became more lenient and I got through the course OK.

Today I’m on the other end of the visitation.  I spend a lot of time in the office with no student visits.  Part of the problem comes from scheduling, both theirs and mine.  But I’ve found out something else about student visits: the students that come to see you really care about what they’re supposed to be doing in your class.  Although there are still students who think it their duty to “tough it out” without asking questions, many others just want to get through in the quickest and least time-consuming way they can find.

I’m glad I took my classmate’s advice and made the office visits.  But there are two other lessons I have learned since that time.

The first is that I wish I had taken a numerical methods course taught by Dr. Luther; it would have prepared me for what I’ve been doing both before and during my PhD pursuit.

The second is that, when I started my MS degree twenty years later, I took a course covering basically the same material, taught by a Russian.  I found out that there was a great deal I hadn’t learned from Dr. Luther, and that American math education leaves a lot to be desired.  So sometimes making the way easier up front comes back to get you in the end.


If You Really Want to Get Into Trouble, Read the Mediaevals

Although it’s been out there for a long time, these days the clash between religion and science has been especially heated.  The simplest way to solve the problem would be to cancel elections and let the self-proclaimed know-it-alls run the show.  In this way they could ignore the religious “masses” and ensure the continuity of their funding.  Funded scientists are happy scientists…

But what would happen if there were a synergy between the two?  Basically the same thing that is happening between religion and science now: an academic slugfest.  (Reminds one of Rabbi Jonathan Sacks’ joke about “the tradition”…)  That’s pretty much the bottom line of the life of Georg Cantor (1845-1918) and his formulation of set theory.

He was born in St. Petersburg, Russia, to parents who originally came from Denmark.  When he was eleven, they moved to Frankfurt, then a free city in the German Confederation (it was Hessians from this part of Germany whom George Washington crossed the Delaware to defeat at Trenton).  As Carl Boyer notes in his A History of Mathematics:

His (Cantor’s) parents were Christians of Jewish background–his father had been converted to Protestantism, his mother had been born a Catholic.  The son Georg took a strong interest in the finespun arguments of medieval theologians concerning continuity and the infinite, and this militated against his pursuing a mundane career in engineering as suggested by his father.  In his studies at Zurich, Göttingen and Berlin the young man consequently concentrated on philosophy, physics and mathematics–a program that seems to have fostered his unprecedented mathematical imagination.

His central “claim to fame” is the elucidation of set theory (or, as the Germans are wont to call it, Mengenlehre).  It’s no overstatement to say that set theory has come to dominate the teaching of mathematics and its conceptualisation, as I found out the hard way taking advanced linear algebra a couple of years ago, complete with the bizarre notation that has just about taken over math textbooks.  In the 1960’s it was the centrepiece of the “new math” that came into primary and secondary school curricula; that was controversial, but a great deal more useful than its critics would admit.

The controversy didn’t end with the sets themselves.  Cantor realised that set theory forced him to consider something that mathematicians had danced around for almost two centuries: infinite quantities, or more precisely transfinite quantities.  Sets can have an infinite number of elements, but just what that means was something Cantor plunged very deeply into.

It’s easy to get lost in Cantor’s reasoning, as the concepts he proposed are very profound.  I’ll try to keep things as uncomplicated as I can, taking the risk that I may oversimplify the business.

Let us consider the set of integers.  We know instinctively that there are infinitely many integers.  Now let us consider the number of even integers.  You’d think that there are half as many even integers as integers altogether, right?  But both quantities are in fact infinite, and dividing infinity by two doesn’t mean much.  In fact Cantor proved that, if we consider the set of all integers and the set of even integers, we can put the members of the two sets into one-to-one correspondence.  So the sizes of the two sets are equal, even though one set is a subset of the other.
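
To make the pairing concrete, match each integer n with the even integer 2n:

f(n)=2\,n

1\leftrightarrow2,\quad2\leftrightarrow4,\quad3\leftrightarrow6,\quad\ldots

Every integer is paired with exactly one even integer and vice versa, so neither set has members “left over”; this is precisely what it means for two infinite sets to be the same size.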

Things get more complicated when we pass from integers and rational numbers to transcendental numbers like e and pi.  Cantor proved that there are more transcendental numbers than integers and rational numbers, even though all of these sets are infinite.  Cantor had shown, in effect, that not all infinities were equal to each other!
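
In modern notation, the key fact is that the rationals are countable while the reals are not:

\left|\mathbb{Q}\right|=\aleph_{0}<2^{\aleph_{0}}=\left|\mathbb{R}\right|

Since the algebraic numbers are also countable, it is the transcendentals that make the reals uncountable.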

One device that Cantor, and just about anyone else who deals with transfinite numbers, uses is the limit.  But one major difference between Cantor and many of his contemporaries–and predecessors–is that Cantor showed that infinity was in fact an existing quantity, the problem with the transcendentals notwithstanding.

That lit several fuses.  Before Cantor’s time the German mathematician Gauss had stated the following:

I protest against the use of an infinite magnitude as something completed, which is never permissible in mathematics.  Infinity is merely a way of speaking, the true meaning being a limit which certain ratios approach indefinitely close.

The deadliest grenade thrown at Cantor, however, came from Leopold Kronecker (1823-1891), after whom the famous Kronecker Delta is named.  Kronecker, like Cantor of Jewish origins but a Christian, famously stated that “God made the integers, and all the rest is the work of man.”  Kronecker made a career out of academically trashing Cantor, blocking appointments and delaying publications.  Cantor, not the scholarly pugilist the situation called for (he should have read Jerome with the mediaevals), had his first nervous breakdown in 1884.  After that he published little, and he died in a psychiatric hospital in 1918, although by then his work was receiving the recognition it deserved.  David Hilbert pithily stated that “From the paradise created for us by Cantor, no one will drive us out.”

So how did the mediaevals influence this revolution in mathematics?  The problem of infinity wasn’t as far-fetched as you might think.  It had sat there since Newton and Leibniz set forth the calculus, which in turn hangs on infinitesimals.  Between two finite points there is an infinite number of infinitesimals.  The mediaevals have been jeered for wondering how many angels could dance on the head of a pin, but as long as the angels were infinitesimal, the answer for a finite pin is clear.  It was only a matter of someone putting infinitesimals and infinities together, and that person (with help from others) was Cantor.

Anyone who has explored the philosophy of the scholastics with a mathematical background will sooner or later consider the relationship between their ideas and the mathematics of infinity.  Coming off of a master’s degree, I found myself doing that in My Lord and My God.  Although I would not dare to rank myself anywhere near Cantor, I discovered that all infinities were not equal, and, although they could not have a finite ratio with finite quantities, they were not necessarily equal to each other.  That in turn helped me to see that subordination in God does not impair the deity of the subordinate persons, which solves many problems.  Unfortunately there are those who either can’t or–ahem–won’t see that relationship, and there is always the problem that prelates and seminary academics are often mathematically challenged.

Today we live in a world where science and religion are forcibly bifurcated.  But it was not always so.  Cantor–and Kronecker and others for that matter–allowed the two to intermingle, and before that Euler was more religiously conservative than Voltaire.  And the Nineteenth Century in Europe was a golden age in mathematics, where advances came one after the other.

But there’s a price.  If you want to get into serious trouble, read the mediaevals, and that’s true for mathematicians and theologians alike.

Note: in addition to Boyer’s book, I used Jane Muir’s Of Men and Numbers: The Story of the Great Mathematicians (Dover Books on Mathematics) in writing this piece.

Numerical Integration of a Function in a Two-Dimensional Space, and Its Solution with Conjugate Gradient

The purpose of this piece is to document the numerical integration of a function in a two-dimensional space using the conjugate gradient method.

Basic Problem Statement and Closed Form Solution

The boundary value problem is as follows:
{\frac{\partial^{2}}{\partial{x}^{2}}}u(x,y)+{\frac{\partial^{2}}{\partial{y}^{2}}}u(x,y)=-5\, y\sin(x)\sin(2\, y)+4\,\sin(x)\cos(2\, y)

where
u(x,0)=0
u(x,\pi)=0
u(0,y)=0
u(\pi,y)=0

A plot of the right hand side of this is shown below.

[Figure: plot of the right hand side of the equation]

To solve this in closed form, we need to consider our solution method. Given the form of Poisson’s Equation above, it is tempting to use a double Fourier series (or alternatively a complex exponential solution with constant coefficients) with only a few (1-2) terms in the solution. Complicating this is the presence of the first order factor y. One way to deal with this is to include the linear factor in an assumed solution, as follows:

u(x,y)=y{e^{\sqrt{-1}\alpha\, x}}{e^{\sqrt{-1}\omega\, y}}

Transforming this into sines and cosines with a more general solution, we can write

u(x,y)=A_{{1}}y\cos(\alpha\, x)\cos(\omega\, y)-A_{{2}}y\sin(\alpha\, x)\sin(\omega\, y)+\sqrt{-1}A_{{3}}y\sin(\alpha\, x)\cos(\omega\, y)+\sqrt{-1}A_{{4}}y\cos(\alpha\, x)\sin(\omega\, y)

Applying the operator to this yields

{\frac{\partial^{2}}{\partial{x}^{2}}}u(x,y)+{\frac{\partial^{2}}{\partial{y}^{2}}}u(x,y)=-A_{{1}}y\cos(\alpha\, x){\alpha}^{2}\cos(\omega\, y)+A_{{2}}y\sin(\alpha\, x){\alpha}^{2}\sin(\omega\, y)-\sqrt{-1}A_{{3}}y\sin(\alpha\, x){\alpha}^{2}\cos(\omega\, y)-\sqrt{-1}A_{{4}}y\cos(\alpha\, x){\alpha}^{2}\sin(\omega\, y)-2\, A_{{1}}\cos(\alpha\, x)\sin(\omega\, y)\omega-A_{{1}}y\cos(\alpha\, x)\cos(\omega\, y){\omega}^{2}-2\, A_{{2}}\sin(\alpha\, x)\cos(\omega\, y)\omega+A_{{2}}y\sin(\alpha\, x)\sin(\omega\, y){\omega}^{2}-2\,\sqrt{-1}A_{{3}}\sin(\alpha\, x)\sin(\omega\, y)\omega-\sqrt{-1}A_{{3}}y\sin(\alpha\, x)\cos(\omega\, y){\omega}^{2}+2\,\sqrt{-1}A_{{4}}\cos(\alpha\, x)\cos(\omega\, y)\omega-\sqrt{-1}A_{{4}}y\cos(\alpha\, x)\sin(\omega\, y){\omega}^{2}

The idea now is to eliminate terms of sine and cosine combinations that do not appear in the second partial derivatives.

The simplest place to start is to state the following, based on the structure of the partial derivatives:

\alpha=1
\omega=2

This harmonizes the sine and cosine functions. By inspection, we can also state that

A_{1}=A_{4}=0

since these terms do not appear in the problem statement. The values of A_{2} and A_{3} are not as obvious, but based on the previous statement our assumed solution (equated to the problem statement) reduces to

5\, A_{{2}}y\sin(x)\sin(2\, y)-5\,\sqrt{-1}A_{{3}}y\sin(x)\cos(2\, y)-4\, A_{{2}}\sin(x)\cos(2\, y)-4\,\sqrt{-1}A_{{3}}\sin(x)\sin(2\, y)=-5\, y\sin(x)\sin(2\, y)+4\,\sin(x)\cos(2\, y)

This can be reduced to a linear system as follows:

5\, A_{{2}}y-4\,\sqrt{-1}A_{{3}}=-5\, y

-5\,\sqrt{-1}A_{{3}}y-4\, A_{{2}}=4

In matrix form, the equation becomes

\left[\begin{array}{cc} 5\, y & -4\,\sqrt{-1}\\ \noalign{\medskip}-4 & -5\,\sqrt{-1}y \end{array}\right]\left[\begin{array}{c} A_{2}\\ A_{3} \end{array}\right]=\left[\begin{array}{c} -5\, y\\ \noalign{\medskip}4 \end{array}\right]

Inverting the matrix and multiplying it by the right hand side yields

\left[\begin{array}{c} A_{2}\\ A_{3} \end{array}\right]=\left[\begin{array}{cc} 5\,{\frac{y}{25\,{y}^{2}+16}} & -4\,\left(25\,{y}^{2}+16\right)^{-1}\\ \noalign{\medskip}4\,{\frac{\sqrt{-1}}{25\,{y}^{2}+16}} & 5\,{\frac{\sqrt{-1}y}{25\,{y}^{2}+16}} \end{array}\right]\left[\begin{array}{c} -5\, y\\ \noalign{\medskip}4 \end{array}\right]\\ =\left[\begin{array}{c} -25\,{\frac{{y}^{2}}{25\,{y}^{2}+16}}-16\,\left(25\,{y}^{2}+16\right)^{-1}\\ \noalign{\medskip}0 \end{array}\right]\\ =\left[\begin{array}{c} -1\\ 0 \end{array}\right]

This yields our solution,
u(x,y)=y\sin(x)\sin(2\, y)
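
As a check, this solution satisfies all four boundary conditions, since the factor y vanishes at y=0 , \sin(2\,y) vanishes at y=\pi , and \sin(x) vanishes at both x=0 and x=\pi :

u(x,0)=0\cdot\sin(x)\sin(0)=0,\qquad u(x,\pi)=\pi\,\sin(x)\sin(2\,\pi)=0

u(0,y)=y\sin(0)\sin(2\, y)=0,\qquad u(\pi,y)=y\sin(\pi)\sin(2\, y)=0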

A plot is shown below.

[Figure: plot of the closed form solution u(x,y)]

Solution by Conjugate Gradient

The residual norm history is shown below for n=9, 19 and 39, where n is the number of nodes on each axis of the grid. The plot shows the decrease in the 2-norm of the residual as the iterations progress. The iterations are stopped when they either exceed 2000 or the 2-norm falls below 10^{-4}.

[Figure: residual norm history for n = 9, 19 and 39]

All of the discretisations converged within 40 steps. In any event the conjugate gradient method will (in exact arithmetic) converge in no more steps than the number of rows (or columns) of the matrix, 1521 for n=39; in this case the performance of the method far exceeded that, doubtless due in part to the fact that we employed a preconditioner to accelerate the solution.
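
For reference, the system being solved comes from the standard five-point finite difference discretisation of the Laplacian at the interior nodes x_{i}=ih , y_{j}=jh , with h=\pi/(n+1) :

\frac{u_{i+1,j}+u_{i-1,j}+u_{i,j+1}+u_{i,j-1}-4u_{i,j}}{h^{2}}=f_{i,j}

Multiplying through by -h^{2} gives the symmetric, positive definite system Au=b assembled in the code below: the diagonal blocks of A are tridiagonal with 4 on the diagonal and -1 off it, the off-diagonal blocks are -I , and b_{i,j}=-h^{2}f_{i,j} .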

As mentioned in the code comments, the method used is the preconditioned conjugate gradient method as outlined in Gourdin and Boumahrat (2003). The method begins by computing the initial residual:

r^{\left(0\right)}=b-Ax^{\left(0\right)}

We then solve the following equation

Cp^{\left(0\right)}=r^{\left(0\right)}

At this point we need to consider the conditioning matrix C. It is defined as follows:

C=TT^{t}

where T is the lower triangular portion of A. As an aside, this serves as an incomplete Cholesky-type factorisation. Obviously T^{t} is the upper triangular portion, A being symmetric. Since we have an upper triangular solver, it makes sense to compute and invert C using T^{t}.

To accomplish this, we invert C to yield

C^{-1}=\left(TT^{t}\right)^{-1}=\left(T^{t}\right)^{-1}\left(T\right)^{-1}

We can compute the first term of the right hand side with the information at hand. For the second term, considering

\left(T^{t}\right)^{-1}=\left(T^{-1}\right)^{t}

we transpose both sides to yield

\left(T\right)^{-1}=\left(\left(T^{t}\right)^{-1}\right)^{t}

and substituting

C^{-1}=\left(T^{t}\right)^{-1}\left(\left(T^{t}\right)^{-1}\right)^{t}

We can use the factoring method to invert T^{t}, then take its transpose, then multiply \left(T^{t}\right)^{-1} by its transpose to yield C^{-1}, the inverse of our conditioning matrix, which can be used to solve for the vector p.

Returning to the algorithm, to save calculations later we define the vector

s=C^{-1}r

and solving

p^{\left(0\right)}=C^{-1}r^{\left(0\right)}

we equate
s^{\left(0\right)}=p^{\left(0\right)}

We now begin our iteration for k=0,1,2,\ldots,k_{max}. We first compute

\tilde{\alpha}=\frac{\left(r^{\left(k\right)}\right)^{t}s^{\left(k\right)}}{\left(p^{\left(k\right)}\right)^{t}Ap^{\left(k\right)}}

for which purpose we developed a special subroutine. We then update our step as follows:

x^{\left(k+1\right)}=x^{\left(k\right)}+\tilde{\alpha}p^{\left(k\right)}

r^{\left(k+1\right)}=b-Ax^{\left(k+1\right)}

Using the conditioning matrix we developed, we compute the following

s^{\left(k+1\right)}=C^{-1}r^{\left(k+1\right)}

and (using a similar routine we used for \tilde{\alpha})

\tilde{\beta}=\frac{\left(r^{\left(k+1\right)}\right)^{t}s^{\left(k+1\right)}}{\left(r^{\left(k\right)}\right)^{t}s^{\left(k\right)}}

and

p^{\left(k+1\right)}=s^{\left(k+1\right)}+\tilde{\beta}p^{\left(k\right)}

From here we increment k and repeat the cycle until we either reach k_{max} or meet our convergence criterion.  It should be noted that the efficiency of conjugate gradient was such that, particularly at the finer discretisations, the routine spent more time generating the preconditioning matrices than it did actually going through the cycles!

Solution Code

The solution code was written in FORTRAN 77 and is presented below.  Also presented is some of the code which was incorporated using the INCLUDE statement of Open WATCOM FORTRAN.  The size of the grid was varied using the PARAMETER statement at the beginning of the program; it is set to 39 in the code presented, and was varied to 9 and 19 as needed.

We should also note that the routine was run in single precision.

c Solution of Two-Dimensional Grid Using
c Conjugate Gradient Iteration
c Implementation of conjugate gradient method
c as per Gourdin and Boumahrat (2003)
c Includes preconditioning by construction of matrix C
c Matrix size changed via parameter statement
      include 'mpar.for'
      parameter(pi=3.141592654,istep=2000)
c Define arrays for main array, diagonal block, off-diagonal block,
c and solution and rhs vectors
      dimension a(nn,nn),d(n,n),c(n,n),u(nn),b(nn)
c Define conditioning and related matrices
      dimension cond(nn,nn),t(nn,nn),tt(nn,nn)
c Define residual vectors and norm for residual
      dimension r(nn),rold(nn),p(nn),s(nn),sold(nn),xnorm(istep)
c Coordinate Arrays
      dimension x(n,n),y(n,n)
c Error Vector for Final Iterate
      dimension uerr(nn)
c Statement function to define rhs
      f(x1,y1)=-5*y1*sin(x1)*sin(2*y1)+4*sin(x1)*cos(2*y1)
c Statement function to define closed form solution
      fexact(x1,y1)=y1*sin(x1)*sin(2*y1)
c Write Header for Output
      write(*,*)'Conjugate Gradient Iteration to solve',
     &' two-dimensional grid'
      write(*,*)'Grid Size = ',n,' x ',n
      write(*,*)'Matrix Size ',nn,' x ',nn
      call tstamp
c Initialise diagonal block array
      do 20 i=1,n,1
      do 20 j=1,n,1
      if(i.eq.j) then
      d(i,j)=4.
      elseif(abs(i-j).eq.1) then
      d(i,j)=-1.
      else
      d(i,j)=0.
      endif
   20 continue
c Initialise off-diagonal block array
      do 30 i=1,n,1
      do 30 j=1,n,1
      if(i.eq.j) then
      c(i,j)=-1.
      else
      c(i,j)=0.
      endif
   30 continue
c Compute grid spacing h
      xh=pi/float(n+1)
c Initialise rhs
      do 60 j=1,n,1
      do 60 i=1,n,1
      x(i,j)=float(i)*xh
      y(i,j)=float(j)*xh
      ii=(j-1)*n+i
      b(ii)=-f(x(i,j),y(i,j))*xh**2
      write(*,61)i,j,ii,x(i,j),y(i,j),b(ii)
   61 format(3i5,3f10.3)
   60 continue
c Insert block matrices into main matrix
      call blkadd(d,c,a)
c Initialise result vector
      do 70 ii=1,nn,1
   70 u(ii)=1.0
c Compute first residual vector, using residual vector as temporary
c storage
      call xmvmat(a,u,r)
      do 75 ii=1,nn,1
      r(ii)=b(ii)-r(ii)
   75 continue
c Construct upper triangular matrix tt by stripping lower triangular
c portion of matrix a
      do 76 ii=1,nn,1
      do 76 jj=1,nn,1
      if(jj-ii)78,77,77
   77 tt(ii,jj)=a(ii,jj)
      goto 76
   78 tt(ii,jj)=0.0
   76 continue
c Invert upper triangular matrix tt by factorisation method
      call uminv(tt)
c Construct inverted matrix t by transposing inverted matrix tt
      call trnspz(tt,t)
c Multiply t by tt to obtain preconditioning matrix cond (which is
c actually c**(-1))
      call xmvmul(tt,t,cond)
c Compute initial vector p by multiplying cond (inverse of c) by
c residual
      call xmvmat(cond,r,p)
c Set initial vector s to p
      do 79 ii=1,nn,1
   79 s(ii)=p(ii)
      do 80 kk=1,istep,1
c Set old values of vectors r and s
      do 190 ii=1,nn,1
      rold(ii)=r(ii)
  190 sold(ii)=s(ii)
c Compute alpha for each step
      alpha1=alpha(a,p,r,s)
c Update value of u
      do 181 ii=1,nn,1
  181 u(ii)=u(ii)+alpha1*p(ii)
c Update residual for each step r = b-Au
      call xmvmat(a,u,r)
      do 82 ii=1,nn,1
      r(ii)=b(ii)-r(ii)
   82 continue
c Update vector s
      call xmvmat(cond,r,s)
c Compute value of beta for each step
      beta1=beta(r,rold,s,sold)
c Update value of p vector
      do 195 ii=1,nn,1
  195 p(ii)=s(ii)+beta1*p(ii)
c Invoke function to compute Euclidean norm of residual
c Kicks the iteration out once tolerance is reached
      xnorm(kk)=vnorm(r,nn)
      if(xnorm(kk).lt.1.0e-04)goto 83
      write(*,*)kk,xnorm(kk)
   80 continue
   83 continue
      do 90 j=1,n,1
      do 90 i=1,n,1
      ii=(j-1)*n+i
      uerr(ii)=fexact(x(i,j),y(i,j))-u(ii)
   90 continue
      uerrf=vnorm(uerr,nn)
      write(*,*)'Number of iterations = ',kk
      write(*,*)'Euclidean norm for final error =',uerrf
      open(2,file='m5610p1d.csv')
      if(kk.gt.istep)kk=istep
      do 40 ii=1,kk,1
      write(2,50)ii,xnorm(ii)
   50 format(i5,1h,,e15.5)
   40 continue
      close(2)
      stop
      end
c Subroutine to insert n x n blocks into n**2 x n**2 array
c Assumes all diagonal blocks are the same
c Assumes all off-diagonal blocks (upper and lower) are the same
c Assigns zero values elsewhere in the array
      subroutine blkadd(d,c,a)
      include 'mpar.for'
      dimension a(nn,nn),d(n,n),c(n,n)
c Insert blocks into array
      do 20 ii=1,n,1
      do 20 jj=1,n,1
c Determine row and column index in master array for corner of block
      ic=(ii-1)*n+1
      jc=(jj-1)*n+1
      do 30 i=1,n,1
      do 30 j=1,n,1
      iii=ic+i-1
      jjj=jc+j-1
      if(ii.eq.jj)then
      a(iii,jjj)=d(i,j)
      elseif(abs(ii-jj).eq.1)then
      a(iii,jjj)=c(i,j)
      else
      a(iii,jjj)=0.
      endif
   30 continue
   20 continue
      return
      end
c Subroutine to multiply an nn x nn matrix by an nn vector
c a is the nn x nn square matrix
c b is the nn vector
c c is the result, an nn vector
c nn is the matrix and vector size
c Routine loosely based on Gennaro (1965)
      subroutine xmvmat(a,b,c)
      include 'mpar.for'
      dimension a(nn,nn),b(nn),c(nn)
      do 20 i=1,nn,1
      c(i)=0.0
      do 20 l=1,nn,1
   20 c(i)=c(i)+a(i,l)*b(l)
      return
      end
c Subroutine to multiply an nn x nn matrix by another nn x nn matrix
c a is the first nn x nn square matrix
c b is the second nn x nn matrix
c c is the result, an nn x nn matrix
c nn is the matrix size
c Routine based on Gennaro (1965)
      subroutine xmvmul(a,b,c)
      include 'mpar.for'
      dimension a(nn,nn),b(nn,nn),c(nn,nn)
      do 20 i=1,nn,1
      write(*,*)'Multiplying Row ',i
      do 20 j=1,nn,1
      c(i,j)=0.0
      do 20 l=1,nn,1
   20 c(i,j)=c(i,j)+a(i,l)*b(l,j)
      return
      end
c Subroutine to invert an nn x nn upper triangular matrix using the
c factor method
c u is the input matrix; the result is written back into u
c Based on Gennaro (1965)
      subroutine uminv(u)
      include 'mpar.for'
      dimension u(nn,nn),t(nn,nn)
c Zero all entries in matrix t except for the diagonals, which are the
c reciprocals of the diagonals of the original matrix
      do 20 i=1,nn,1
      do 20 j=1,nn,1
      if(i-j)11,10,11
   10 t(i,j)=1.0/u(i,j)
      goto 20
   11 t(i,j)=0.0
   20 continue
c Compute strictly upper triangular entries of inverted matrix
      do 40 i=1,nn,1
      write(*,*)'Inverting Row ',i
      do 40 j=1,nn,1
      if(j-i)40,40,31
   31 l=j-1
      do 41 k=i,l,1
      t(i,j)=t(i,j)-t(i,k)*u(k,j)/u(j,j)
   41 continue
   40 continue
c Write back result into original input matrix
      do 50 i=1,nn,1
      do 50 j=1,nn,1
   50 u(i,j)=t(i,j)
      return
      end
c Function to compute Euclidean norm of a vector
      function vnorm(V,no)
      dimension V(no)
      znorm=0.
      do 100 ia=1,no,1
  100 znorm=znorm+V(ia)**2
      vnorm=sqrt(znorm)
      return
      end
c Function to determine scalar multiplier alpha for conjugate gradient
c method
c a is the main matrix, p the direction vector, r the residual and s
c the preconditioned residual; nn is the matrix/vector size
c Uses subroutine xmvmat to multiply matrix a by vector p
      function alpha(a,p,r,s)
      include 'mpar.for'
      dimension a(nn,nn),r(nn),pt(nn),p(nn),s(nn)
c Compute dot product of r and s for numerator
      rnumer=0.0
      do 10 ii=1,nn,1
      rnumer=rnumer+r(ii)*s(ii)
   10 continue
c Compute matrix product of a * p
      call xmvmat(a,p,pt)
c Premultiply matrix product by transpose of p vector for denominator
      rdenom=0.0
      do 20 ii=1,nn,1
      rdenom=rdenom+p(ii)*pt(ii)
   20 continue
      alpha=rnumer/rdenom
      return
      end
c Function to determine scalar multiplier beta for conjugate gradient
c method
c r and s are the current residual and preconditioned residual vectors
c rold and sold are their values from the previous iteration
c nn is the vector size
      function beta(r,rold,s,sold)
      include 'mpar.for'
      dimension r(nn),rold(nn),s(nn),sold(nn)
c Compute dot product for numerator
      rnumer=0.0
      do 10 ii=1,nn,1
      rnumer=rnumer+r(ii)*s(ii)
   10 continue
c Compute dot product for denominator
      rdenom=0.0
      do 20 ii=1,nn,1
      rdenom=rdenom+rold(ii)*sold(ii)
   20 continue
      beta=rnumer/rdenom
      return
      end
c Subroutine to transpose a matrix
c Adapted from Carnahan, Luther and Wilkes (1969)
      subroutine trnspz(a,at)
      include 'mpar.for'
      dimension a(nn,nn),at(nn,nn)
      do 14 ii=1,nn,1
      do 14 jj=1,nn,1
   14 at(jj,ii)=a(ii,jj)
      return
      end
      include 'tstamp.for'

Included Routines

  • tstamp.for calls an Open WATCOM function and time stamps the output.
  • mpar.for sets the parameter statements and is shown below.
      parameter(n=39,nn=n*n)

References

  • Carnahan, B., Luther, H.A., and Wilkes, J.O. (1969) Applied Numerical Methods. New York: Wiley.
  • Gennaro, J.J. (1965) Computer Methods in Solid Mechanics. New York: Macmillan.
  • Gourdin, A., and Boumahrat, M. (2003) Applied Numerical Methods. New Delhi: Prentice-Hall India.

The Importance of Causality

In his book Introduction to the Differential Equations of Physics, German physicist Ludwig Hopf opens with the following statement:

Any differential equation expresses a relation between derivatives or between derivatives and given functions of the variables.  It thus establishes a relation between the increments of certain quantities and these quantities themselves.  This property of a differential equation makes it the natural expression of the principle of causality which is the foundation of exact natural science.  The ancient Greeks established laws of nature in which certain relations between numbers (harmony of spheres) or certain shapes of bodies played a privileged role.  The law was supposed to state something about a process as a whole, or about the complete shape of a body.  In more recent times (Galileo, Newton, etc.) a different concept has been adopted.  We do not try to establish a relation between all phases of a process immediately, but only between one phase and the next.  A law of this type may express, for example, how a certain state will develop in the immediate future, or it may describe the influence of the state of a certain particle on the particles in the immediate neighbourhood.  Thus we have a procedure for the description of a law of nature in terms of small (mathematically speaking, infinitesimal) differences of time and space.  The increments with which the law is concerned appear as derivatives, i.e., as the limits of the quotient of the increments of the variables which describe the process over the increment of space or time in which this development takes place.  A law of nature of this form is the expression of the relation between one state and the neighbouring (in time or space) states and therefore represents a special form of the principle of causality.

The whole issue of causality is an important one for both scientific and theological reasons, and I want to touch on one of each.

Every event that takes place in the universe is a result of an event before it.  Those events in turn are the results of those which have gone before.  All of these events form a chain which leads back to the first cause.  The need for the first cause is one of St. Thomas Aquinas’ proofs of God’s existence:

The second way is from the nature of the efficient cause. In the world of sense we find there is an order of efficient causes. There is no case known (neither is it, indeed, possible) in which a thing is found to be the efficient cause of itself; for so it would be prior to itself, which is impossible. Now in efficient causes it is not possible to go on to infinity, because in all efficient causes following in order, the first is the cause of the intermediate cause, and the intermediate is the cause of the ultimate cause, whether the intermediate cause be several, or only one. Now to take away the cause is to take away the effect. Therefore, if there be no first cause among efficient causes, there will be no ultimate, nor any intermediate cause. But if in efficient causes it is possible to go on to infinity, there will be no first efficient cause, neither will there be an ultimate effect, nor any intermediate efficient causes; all of which is plainly false. Therefore it is necessary to admit a first efficient cause, to which everyone gives the name of God.

Although, as Hopf points out, our understanding of how that causality actually works in the physical universe is different from that of the Greeks (and Thomas Aquinas worked in a Greek concept of natural philosophy), the truth of the importance of causality is undiminished.

Determining what comes after is a major reason for differential equations, which involve three elements: the equation itself, the initial conditions and the boundary conditions.  Once we have these, we can predict the behaviour of a system.  In some cases we can do so with a “simple” equation; others require discretisation and numerical modelling.  And that leads to our second point.
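
To make the three elements concrete, consider (as a simple, hypothetical example) one-dimensional heat conduction in a bar of length L :

\frac{\partial u}{\partial t}=k\frac{\partial^{2}u}{\partial x^{2}},\qquad u(x,0)=g(x),\qquad u(0,t)=u(L,t)=0

The differential equation, the initial temperature distribution g(x) , and the fixed temperatures at the ends together determine the temperature at every later time: each state develops into the next, just as Hopf describes.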

It’s interesting that Hopf speaks of “exact natural science.”  Today much of science and engineering is driven by probabilistic considerations, which in turn lead to statistical analysis.  Probability and statistics are very useful tools, but they are not a substitute for an understanding of the actual mechanisms by which things work.  The actual mechanisms (physical laws, etc.) are what cause the phenomena which we record as statistics, not the other way around.  The fact that there are variations in these should not blind us to the core reality.

The advent of computers with broad-based number crunching abilities has only inflated our overconfidence in such methods.  It is essential, however, that we understand the why of phenomena as well as the what.  We must be able both to quantify the results and to identify the correct causes of what is going on around us.  Two recent debacles illustrate this.

The first is the climate change fiasco we’ve been treated to of late.  Setting aside the dissimulation (as opposed to simulation) of some involved in the science, the core problem is that we do not as of yet have a model of global climate sufficiently comprehensive that we can dispense with reliance on the statistics and project what will happen with a reasonable degree of confidence.  Part of the problem is the central difficulty of chaos theory: minor variations in initial conditions lead to major variations in the results.  But without such a model we are bereft of a definitive “why” as much as a “what.”

The second is our financial collapse.  The models developed of the elaborate credit structure were fine as far as they went.  But ultimately they were divorced from sustainable reality because they did not take into consideration all of the factors, many of which were obvious to those with raw experience.

The issue of causality is one that is central to our understanding of the universe.

Proof of Harten’s Lemma re the Convergence of TVNI Finite-Difference Schemes

An academic paper with a rather unusual history is the Israeli mathematician Ami Harten’s “High Resolution Schemes for Hyperbolic Conservation Laws.”  First published in the Journal of Computational Physics in 1983, it was republished in 1997 in the same journal, and is often cited with the later date.

At the time of republishing, Peter Lax, who earlier had saved the computer from the hippie radicals, made the following statement about this paper in an introduction:

This paper was a landmark; it introduced a new design principle—total variation diminishing schemes—that led, in Harten’s hands, and subsequently in the hands of others, to an efficient, robust, highly accurate class of schemes for shock capturing free of oscillations. The citation index lists 429 references to it, not only in journals of numerical analysis and computational fluid dynamics, but also in journals devoted to mechanical engineering, astronautics, astrophysics, geophysics, nuclear science and technology, spacecraft and rockets, plasma physics, sound and vibration, aerothermodynamics, hydraulics, turbo and jet engines, and computer vision and imaging.

One point in the paper was a lemma concerning the convergence of TVNI (total variation nonincreasing) finite difference schemes.  Concerning the name of these schemes, Lax points out the following:

Harten originally called his schemes variation diminishing, abbreviated VD; when Osher pointed out the usual meaning of these initials, the name was switched to total variation nonincreasing (TVNI), but was eventually settled on the more euphonious TVD.

The following is an expansion of Harten’s proof of the lemma.

Schemes which are total variation nonincreasing (TVNI) can be characterized as follows:
\sum_{j=-\infty}^{\infty}\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right|\leq\sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|

We can thus define
TV\left(u^{n}\right)=\sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|

TV\left(u^{n+1}\right)=\sum_{j=-\infty}^{\infty}\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right|

and substituting
TV\left(u^{n+1}\right)\leq TV\left(u^{n}\right)
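
As a concrete illustration of the definition, a grid function that steps through the values 0, 1, \frac{1}{2}, \frac{1}{2}, 0 (and is constant elsewhere) has

TV\left(u\right)=\left|1-0\right|+\left|\tfrac{1}{2}-1\right|+\left|\tfrac{1}{2}-\tfrac{1}{2}\right|+\left|0-\tfrac{1}{2}\right|=2

and a TVNI scheme may not increase this number from one time step to the next.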

Consider the general expression
u_{{j,n+1}}=u_{{j,n}}-C_{{j-1,n}}\left(u_{{j,n}}-u_{{j-1,n}}\right)+D_{{j,n}}\left(u_{{j+1,n}}-u_{{j,n}}\right)

where for all j

C_{{j,n}}\geq 0

D_{{j,n}}\geq 0

C_{{j,n}}+D_{{j,n}}\leq 1

We should observe that our ultimate goal is to sum these values from negative infinity to positive infinity; thus, we can shift the index at will. The inequalities will still hold but the specific location in space may change. It is also worth noting that the coefficients may themselves change at different points in space.
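
For example, shifting j by one in the sum involving C leaves its value unchanged:

\sum_{j=-\infty}^{\infty}C_{{j-1,n}}\left|u_{{j,n}}-u_{{j-1,n}}\right|=\sum_{j=-\infty}^{\infty}C_{{j,n}}\left|u_{{j+1,n}}-u_{{j,n}}\right|

This is the device used below to collect the three sums over the common difference u_{{j+1,n}}-u_{{j,n}}.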

Let us consider the next spatial step, to wit
u_{{j+1,n+1}}=u_{{j+1,n}}-C_{{j,n}}\left(u_{{j+1,n}}-u_{{j,n}}\right)+D_{{j+1,n}}\left(u_{{j+2,n}}-u_{{j+1,n}}\right)

Subtracting the previous spatial step from this yields
u_{{j+1,n+1}}-u_{{j,n+1}}=u_{{j+1,n}}-C_{{j,n}}\left(u_{{j+1,n}}-u_{{j,n}}\right)+D_{{j+1,n}}\left(u_{{j+2,n}}-u_{{j+1,n}}\right)-u_{{j,n}}+C_{{j-1,n}}\left(u_{{j,n}}-u_{{j-1,n}}\right)-D_{{j,n}}\left(u_{{j+1,n}}-u_{{j,n}}\right)

Some rearranging yields
u_{{j+1,n+1}}-u_{{j,n+1}}=\left(u_{{j+1,n}}-u_{{j,n}}\right)\left(1-D_{{j,n}}-C_{{j,n}}\right)+C_{{j-1,n}}\left(u_{{j,n}}-u_{{j-1,n}}\right)+D_{{j+1,n}}\left(u_{{j+2,n}}-u_{{j+1,n}}\right)

Taking the absolute value of both sides, we have
\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right|=\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\left(1-D_{{j,n}}-C_{{j,n}}\right)+C_{{j-1,n}}\left(u_{{j,n}}-u_{{j-1,n}}\right)+D_{{j+1,n}}\left(u_{{j+2,n}}-u_{{j+1,n}}\right)\right|

At this point we observe that
C_{{j-1,n}}\geq 0
 D_{{j+1,n}}\geq 0
D_{{j,n}}+C_{{j,n}}\leq 1
1-D_{{j,n}}-C_{{j,n}}\geq 0

We can thus apply the triangle inequality, pulling the nonnegative coefficients out of the absolute values, and write the expression as follows:
\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right|\leq\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|\left(1-D_{{j,n}}-C_{{j,n}}\right)+\left|\left(u_{{j,n}}-u_{{j-1,n}}\right)\right|C_{{j-1,n}}+\left|\left(u_{{j+2,n}}-u_{{j+1,n}}\right)\right|D_{{j+1,n}}

Taking the summation for both sides,
\sum_{j=-\infty}^{\infty}\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right|\leq \sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|\left(1-D_{{j,n}}-C_{{j,n}}\right)+\sum_{j=-\infty}^{\infty}\left|\left(u_{{j,n}}-u_{{j-1,n}}\right)\right|C_{{j-1,n}}+\sum_{j=-\infty}^{\infty}\left|\left(u_{{j+2,n}}-u_{{j+1,n}}\right)\right|D_{{j+1,n}}

Since, as we observed before, we can shift the indices (as the “centre” of the system is arbitrary with infinite boundaries) we can rewrite the above as follows
\sum_{j=-\infty}^{\infty}\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right| \leq \sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|\left(1-D_{{j,n}}-C_{{j,n}}\right)+\sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|C_{{j,n}}+\sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|D_{{j,n}}

in which case
\sum_{j=-\infty}^{\infty}\left|u_{{j+1,n+1}}-u_{{j,n+1}}\right| \leq \sum_{j=-\infty}^{\infty}\left|\left(u_{{j+1,n}}-u_{{j,n}}\right)\right|

Substituting, we have at last
TV\left(u^{n+1}\right)\leq TV\left(u^{n}\right)

Citation: Harten, A. (1997) “High Resolution Schemes for Hyperbolic Conservation Laws.” Journal of Computational Physics, Vol. 135, pp. 260-278.