Introduction to Linear Analysis

Post on 02-May-2017


INTRODUCTION TO LINEAR ANALYSIS

Nicholas J. Rose

Mathematics Department, North Carolina State University

REVISED EDITION

I Difference Equations
1.1 Introduction
1.2 Sequences and Difference Equations
1.3 Compound Interest
1.4 Amortization of a Mortgage
1.5 First Order Linear Difference Equations
1.6 Method of Undetermined Coefficients
1.7 Complex Numbers
1.8 Fibonacci Numbers
1.9 Second Order Linear Difference Equations
1.10 Homogeneous Second Order Difference Equations
1.11 The Method of Undetermined Coefficients
1.12 A Simple Model of National Income
1.13 The Gambler's Ruin

II Differential Equations and the Laplace Transform
2.1 Introduction
2.2 Separation of Variables
2.3 Linear Differential Equations
2.4 Laplace Transforms
2.5 Properties of Laplace Transforms
2.6 Partial Fractions and Use of Tables
2.7 Solution of Differential Equations
2.8 Convolutions
2.9 Discontinuous Forcing Functions
2.10 The Weighting Function and Impulse Functions

III Matrices and Systems of Equations
3.0 Introduction
3.1 Algebra of n-vectors
3.2 Matrix Notation for Linear Systems
3.3 Properties of Solutions
3.4 Elementary Operations, Equivalent Systems
3.5 Row Echelon Form and Reduced Row Echelon Form
3.6 Solutions of Systems
3.7 More on Consistency of Systems
3.8 Matrix Algebra
3.9 Transposes, Symmetric Matrices and Powers of Matrices
3.10 The Inverse of a Square Matrix
3.11 Linear Independence and Dependence
3.12 Determinants
3.13 Eigenvalues and Eigenvectors, an Introductory Example
3.14 Eigenvalues and Eigenvectors
3.15 Solution of Systems of Differential Equations
3.16 Solution of Systems of Difference Equations

Answers to Exercises

CHAPTER I

DIFFERENCE EQUATIONS

1.1 Introduction

Much of this book is devoted to the analysis of dynamical systems, that is, systems that change with time. The motion of a body under known forces, the flow of current in a circuit and the decay of a radioactive substance are examples of dynamical systems. If the quantity of interest in a dynamical system is considered to vary continuously with time, the system is called a continuous dynamical system. The physical laws that govern how continuous dynamical systems evolve in time are often given by equations involving time derivatives of the desired quantities; such equations are called differential equations. For example, the decay of a radioactive substance is governed by the differential equation

$$\frac{dm(t)}{dt} = -k\,m(t), \quad t \ge 0, \tag{1}$$

where $m(t)$ is the mass of the substance at time $t$ and the parameter $k$ is a positive constant which depends on the particular substance involved. If the initial mass $m(0)$ is known, $m(t)$ can be found by solving the differential equation. Methods for solving differential equations will be discussed in Chapter 2.

In this chapter we shall discuss discrete dynamical systems, where the quantity of interest is defined, or desired, only at discrete points in time. Often these discrete time points are uniformly spaced. Economic data, for instance, is usually obtained from periodic reports: daily, weekly, monthly, or yearly. A firm takes inventory perhaps monthly or quarterly. The quantity of interest in a discrete dynamical system is a sequence of values denoted by $\{x_k\}$ or, written out in full,

$$x_0, x_1, x_2, \ldots, x_k, \ldots, \tag{2}$$

where $x_0$ represents the value at the beginning of the initial time period, $x_1$ the value at the end of the first time period, and $x_k$ the value at the end of the $k$th time period.

The laws which govern the evolution of a discrete dynamical system are often expressed by equations which involve two or more general terms of the sequence of values of the desired quantity. Some examples of such equations are

$$\begin{aligned}
x_{k+1} &= 2x_k, & k &= 0, 1, 2, \ldots \\
x_{k+2} - 2x_{k+1} + x_k &= 0, & k &= 0, 1, 2, \ldots \\
x_k &= x_{k-1} + 100, & k &= 1, 2, 3, \ldots
\end{aligned} \tag{3}$$

Equations of this type, which indicate how the terms of a sequence recur, are called recurrence relations or, more commonly, difference equations. The methods of solution of certain types of difference equations will be studied in this chapter.

Example 1. A population of fish is observed to double every month. If there are 100 fish to start with, what is the population of fish after $k$ months?

Solution Let $x_k$ represent the number of fish after $k$ months; then $x_{k+1}$ is the number of fish after $k + 1$ months. Since the number of fish doubles each month, we must have

$$x_{k+1} = 2x_k, \quad k = 0, 1, 2, \ldots \tag{i}$$


This is the equation governing the growth of the fish population.

Since the initial population of fish is 100, we have $x_0 = 100$. Successively substituting $k = 0, 1, 2$ into the difference equation we obtain

$$\begin{aligned}
x_1 &= 2x_0, \\
x_2 &= 2x_1 = 2(2x_0) = 2^2 x_0, \\
x_3 &= 2x_2 = 2(2^2 x_0) = 2^3 x_0.
\end{aligned}$$

By continuing in this manner the number of fish at the end of any particular month could be found. However, what is really desired is an explicit formula for $x_k$ as a function of $k$. Looking at the expressions for $x_1$, $x_2$, and $x_3$ given above, a good guess for $x_k$ is

$$x_k = 2^k x_0 = 2^k(100).$$

To verify this, we first put $k = 0$ into the above formula to find $x_0 = 100$. Next we note that $x_{k+1} = 2^{k+1}(100)$. Substituting $x_k$ and $x_{k+1}$ into the difference equation (i) we find

$$x_{k+1} - 2x_k = 2^{k+1}(100) - 2 \cdot 2^k(100) = 0 \quad \text{for all } k.$$

Thus $x_k = 2^k \cdot 100$ is the desired solution.
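The guess-and-verify procedure of Example 1 can also be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `fish_population` is a hypothetical helper introduced only for illustration. It iterates the difference equation (i) from $x_0 = 100$ and compares the result with the closed form $x_k = 2^k(100)$.

```python
def fish_population(months, x0=100):
    """Iterate x_{k+1} = 2 * x_k and return [x_0, x_1, ..., x_months]."""
    values = [x0]
    for _ in range(months):
        values.append(2 * values[-1])
    return values


iterated = fish_population(6)
closed_form = [2**k * 100 for k in range(7)]
print(iterated)
print(iterated == closed_form)  # True when the closed form matches the iteration
```

Agreement over a finite range of $k$ is only a sanity check; the substitution argument in the text is what establishes the formula for all $k$.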

Example 2. Suppose that every year the growth of a fish population is twice the growth in the preceding year. Write a difference equation for the number of fish after $k$ years.

Solution Let $x_k$ denote the number of fish after $k$ years. The growth of fish during the $k$th year is $x_k - x_{k-1}$, and the growth of fish during the $(k+1)$st year is $x_{k+1} - x_k$. According to the law of growth stated we must have

$$x_{k+1} - x_k = 2(x_k - x_{k-1}), \quad k = 1, 2, \ldots$$

or

$$x_{k+1} - 3x_k + 2x_{k-1} = 0, \quad k = 1, 2, \ldots$$

Suppose that the number of fish is initially 100 and that the number at the end of one year is 120; this means that $x_0 = 100$ and $x_1 = 120$. To find $x_2$, the number of fish at the end of the 2nd year, we put $k = 1$ in the difference equation to get

$$x_2 = 3x_1 - 2x_0 = 3(120) - 2(100) = 160.$$

In a similar manner we could compute $x_3, x_4, \ldots$. However, a formula for $x_k$ as an explicit function of $k$ is not as easy to come by as in the previous example. We shall find out how to do this later in this chapter.
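Although no explicit formula is available yet, the terms of Example 2 are easy to generate by machine. The sketch below is an illustration under the example's assumptions ($x_0 = 100$, $x_1 = 120$); the name `fish_counts` is hypothetical. It iterates $x_{k+1} = 3x_k - 2x_{k-1}$ and also lists the yearly growth, which should double from year to year.

```python
def fish_counts(n_years, x0=100, x1=120):
    """Iterate x_{k+1} = 3*x_k - 2*x_{k-1} and return [x_0, ..., x_{n_years}]."""
    values = [x0, x1]
    while len(values) <= n_years:
        values.append(3 * values[-1] - 2 * values[-2])
    return values[: n_years + 1]


counts = fish_counts(5)
growth = [later - earlier for earlier, later in zip(counts, counts[1:])]
print(counts)  # the entry for k = 2 should agree with the value 160 found above
print(growth)  # each year's growth should be twice the previous year's
```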

Exercises 1.1

1. For each of the following difference equations find x1, x2, and x3 in terms of x0.a. xn+1 = 2xn, n = 0, 1 , 2, . . . b. xk = 3xk1 + 2k, k = 1, 2, 3, . . .c. xk+1 = (xk)2 2k, k = 0, 1, 2, . . .

2. For the following difference equations find x2 and x3 in terms of x0 and x1.a. xk+2 + 3xk+1 + 2xk = 2k, k = 0, 1, 2, . . . b. xn+1 2xn + xn1 = 0, n = 1, 2 , 3, . . .

3. For each of the following find $x_k$ as a function of $k$ and $x_0$ and verify that the difference equation is satisfied.

a. xk+1 = 2xk, k = 0, 1, 2, . . . b. xk+1 = xk, k = 0, 1, 2, . . .c. 2xk = 3xk1, k = 1, 2, 3, . . .


4. Suppose that you get a job with a starting salary of 20,000 dollars and you receive a raise of 10% each year. Write a difference equation for your salary during the $k$th year.

5. Redo problem 4 if the raise you receive each year is 1000 dollars plus 10%.

6. Initially a population of fish is 1000 and grows to 1200 at the end of one year. If the growth of fish in any year is proportional to the number of fish at the end of the previous year, set up a difference equation for the number of fish at the end of the $n$th year. Be sure to evaluate the constant of proportionality.

7. The first two terms in a sequence are $x_1 = 1$ and $x_2 = 1$. Thereafter, each term is the sum of the preceding two terms. Write out the first 5 terms of the sequence and set up a difference equation, with proper initial conditions, for the $n$th term.

1.2 Sequences and Difference Equations

In the previous section the terms sequence, difference equation, and solution of a difference equation were introduced. Before proceeding, it is important to have a clear understanding of what these terms mean.

Definition 1. A sequence is a function whose domain of definition is the set of nonnegative integers $0, 1, 2, \ldots$. If the function is $f$, the terms of the sequence are

$$x_0 = f(0), \quad x_1 = f(1), \quad x_2 = f(2), \ldots.$$

The sequence may be denoted by $\{x_k\}$ or by writing out the terms of the sequence in order:

$$x_0, x_1, x_2, \ldots, x_k, \ldots.$$

For example, the sequence

$$1, 1, 1, 1, \ldots, 1, \ldots \tag{1}$$

is a constant sequence all of whose terms are equal to 1, that is, $x_k = 1$ for all $k$. The sequence defined by $x_k = k$ for $k = 0, 1, 2, \ldots$, when written out, is

$$0, 1, 2, 3, 4, \ldots \tag{2}$$

This is an arithmetic sequence since the difference between successive terms is a constant; in this case the difference is 1. This sequence could also be defined implicitly by the equation

$$x_k = x_{k-1} + 1, \quad k = 1, 2, 3, \ldots \tag{3}$$

provided we impose the initial condition $x_0 = 0$. The sequence

$$a, a + d, a + 2d, \ldots, a + kd, \ldots \tag{4}$$

is a general arithmetic sequence with an initial term of $a$ and a difference of $d$. The $k$th term of the sequence is $x_k = a + kd$. This sequence can be defined by the equation

$$x_k = x_{k-1} + d, \quad k = 1, 2, 3, \ldots \tag{5}$$

where $x_0 = a$.

A geometric sequence is one where each term is obtained by multiplying the preceding term by the same factor. If the initial term is $q$ and the factor is $r$, the terms are

$$q, qr, qr^2, qr^3, \ldots, qr^k, \ldots \tag{6}$$


The general term of the sequence is $x_k = qr^k$, $k = 0, 1, 2, \ldots$. The sequence can also be defined implicitly by the equation

$$x_k = r x_{k-1}, \quad k = 1, 2, \ldots \tag{7}$$

together with the initial condition $x_0 = q$.

Finally, let us consider the sequence whose first few terms are

$$1, 1, 2, 3, 5, 8, 13, 21, 34. \tag{8}$$

What is the next term of the sequence? This is a common type of problem on so-called intelligence tests. There is no one correct answer. The next number could be any number at all, since no general rule or function is given to determine the next term. What is expected on these intelligence tests is to try to determine some pattern or rule that the given terms satisfy and then to use this rule to get additional terms of the sequence. You may notice by looking at the terms in (8) that each term, after the first two terms, is the sum of the preceding two terms. If we assume that this is the pattern that the terms satisfy, we get the so-called Fibonacci numbers. That is, the Fibonacci numbers are defined implicitly by

$$x_k = x_{k-1} + x_{k-2}, \quad k = 3, 4, 5, \ldots \tag{9}$$

together with the initial conditions $x_1 = 1$ and $x_2 = 1$. It is now easy to write down additional terms of the sequence: the next three terms are 55, 89, 144. It is not so easy to write an explicit formula for the $k$th term; this will be done later in this chapter. Notice also that we have chosen to call the initial term of the sequence $x_1$ instead of $x_0$.

It is often useful to sketch the graph of a given sequence. If the horizontal axis is the $k$-axis and the vertical axis is the $x_k$-axis, the graph consists of the points whose coordinates are $(0, x_0)$, $(1, x_1)$, and so on. The sequences $x_k = 3 - 1/2^k$ and $y_k = 2\cos(k/4)$ are sketched below.

[Figure 1: graphs of the sequences $x_k = 3 - 1/2^k$ and $y_k = 2\cos(k/4)$ plotted against $k$.]

These graphs provide a picture of what happens to the terms of the sequence as $k$ increases. For example, the graphs illustrate that $\lim_{k \to \infty} x_k = 3$ while $\lim_{k \to \infty} y_k$ does not exist. We have drawn curves through the points of the graphs of the sequences so that the changes in the terms of the sequence can be more easily seen; these curves, however, are not part of the graphs.

The equation defining an arithmetic sequence, $x_k = x_{k-1} + d$, expresses the $k$th term of a sequence in terms of the one immediately preceding term; this equation is called a difference equation of the first order. The equation defining a geometric sequence, $x_k = r x_{k-1}$, is also a first order difference equation. The equation which defines the Fibonacci sequence, $x_k = x_{k-1} + x_{k-2}$, expresses the $k$th term in terms of the two immediately preceding terms; it is called a difference equation of the second order.

Definition 2. A difference equation of the $p$th order is an equation expressing the $k$th term of an unknown sequence in terms of the immediately preceding $p$ terms, that is, a $p$th order equation is an equation of the form

$$x_k = F(k, x_{k-1}, x_{k-2}, \ldots, x_{k-p}), \quad k = p, p + 1, p + 2, \ldots,$$

where $F$ is some well defined function of its several arguments. The order of the equation is therefore equal to the difference between the largest and the smallest subscripts that appear in the equation.

A difference equation expresses the value of the $k$th term of a sequence as a function of a certain number of preceding terms. It does not directly give the value of the $k$th term. In other words, the difference equation gives the law of formation of terms of a sequence, not the terms themselves. In fact there are usually many different sequences that have the same law of formation. For example, let us look at the equation $x_k = x_{k-1} + 2$, $k = 1, 2, 3, \ldots$. Any sequence whose successive terms differ by 2 satisfies this equation, for instance $0, 2, 4, 6, 8, \ldots$, or $3, 5, 7, 9, \ldots$. Each of these sequences is called a solution of the difference equation.

Definition 3. Let $x_k$ be the unknown sequence in a difference equation. By a solution of the difference equation we mean a sequence given as an explicit function of $k$, $x_k = f(k)$, which satisfies the difference equation for all values of $k$ under consideration. By the general solution of a difference equation we mean the set of all sequences that satisfy the equation.

Example 1. Is $x_k = 2^k$ a solution of $x_k = 2x_{k-1}$, $k = 1, 2, 3, \ldots$?

Since $x_k = 2^k$, we have

$$x_k - 2x_{k-1} = 2^k - 2(2^{k-1}) = 2^k - 2^k = 0, \quad \text{for all } k.$$

Thus the given sequence is a solution. Similarly it can be shown that the sequence $x_k = a \cdot 2^k$ satisfies the difference equation for all values of the parameter $a$. The difference equation therefore has infinitely many solutions. We shall see shortly that every solution of the difference equation can be represented in the form $x_k = a \cdot 2^k$ for some value of $a$; this is the general solution of the difference equation.

Example 2. Show that $x_k = 1 + 2k$ satisfies the difference equation

$$x_k = 2x_{k-1} - 2k + 3, \quad k = 1, 2, 3, \ldots.$$

We find that $x_{k-1} = 1 + 2(k - 1) = 2k - 1$, therefore

$$x_k - 2x_{k-1} + 2k - 3 = 1 + 2k - 2(2k - 1) + 2k - 3 = 0, \quad \text{for all } k.$$

Example 3. Is $x_k = 2^k$ a solution of the difference equation

$$x_k = x_{k-1} + x_{k-2}, \quad k = 2, 3, 4, \ldots?$$

We have $x_{k-1} = 2^{k-1}$ and $x_{k-2} = 2^{k-2}$, so that

$$x_k - x_{k-1} - x_{k-2} = 2^k - 2^{k-1} - 2^{k-2} = 2^{k-2}.$$

This is not zero for any $k$; therefore the given sequence is not a solution.

Example 4. Show that $x_k = \cos(k\pi/2)$ is a solution of $x_{k+2} + x_k = 0$, $k = 0, 1, 2, \ldots$.

Solution Since $x_{k+2} = \cos((k + 2)\pi/2) = \cos(\pi + k\pi/2) = -\cos(k\pi/2)$, we have

$$x_{k+2} + x_k = -\cos(k\pi/2) + \cos(k\pi/2) = 0, \quad k = 0, 1, 2, \ldots.$$
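The substitution checks in Examples 1, 3 and 4 can be mirrored numerically. The following Python sketch is an illustration only: the helper `satisfies` and its tolerance are assumptions, and Example 4's sequence is taken to be $\cos(k\pi/2)$. It evaluates the residual (left side minus right side) of each equation over a finite range of $k$.

```python
import math


def satisfies(x, residual, ks, tol=1e-9):
    """Return True if |residual(x, k)| <= tol for every k in ks.

    x        : candidate sequence, given as a function k -> x_k
    residual : function (x, k) -> left side minus right side of the equation
    """
    return all(abs(residual(x, k)) <= tol for k in ks)


# Example 1: x_k = 2^k in x_k = 2 x_{k-1}
ex1 = satisfies(lambda k: 2**k, lambda x, k: x(k) - 2 * x(k - 1), range(1, 20))

# Example 3: x_k = 2^k in x_k = x_{k-1} + x_{k-2}
ex3 = satisfies(lambda k: 2**k, lambda x, k: x(k) - x(k - 1) - x(k - 2), range(2, 20))

# Example 4: x_k = cos(k*pi/2) in x_{k+2} + x_k = 0
ex4 = satisfies(lambda k: math.cos(k * math.pi / 2), lambda x, k: x(k + 2) + x(k), range(0, 20))

print(ex1, ex3, ex4)
```

A passing check over finitely many $k$ does not prove that a sequence is a solution, but a single failing $k$ does show that it is not.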

At the risk of beating a simple example to death, let us look again at the simple first order equation $x_k = 2x_{k-1}$, $k = 1, 2, 3, \ldots$. We have seen in Example 1 that this equation has infinitely many solutions. In order to obtain a unique solution the value of some one term, usually $x_0$, must be prescribed. This is called an initial value problem, which we write in the form

$$\begin{aligned}
\Delta E:\quad & x_k = 2x_{k-1}, \quad k = 1, 2, 3, \ldots \\
IC:\quad & x_0 = 3
\end{aligned} \tag{10}$$

where $\Delta E$ stands for difference equation and $IC$ for initial condition. It is clear that there is only one sequence which satisfies the difference equation and the initial condition. For, if $x_0$ is known, the difference equation uniquely determines $x_1$, and once any value $x_{k-1}$ is known, the difference equation uniquely determines $x_k$. By the principle of mathematical induction, $x_k$ is uniquely determined for all $k$. It is easy to show that this unique solution is $x_k = 3 \cdot 2^k$, but the point is that we know beforehand that there is one and only one solution.

In order to determine a unique solution for a second order difference equation, two successive values, say $x_0$ and $x_1$, need to be known. For a $p$th order equation we have the following theorem.

Theorem 1. The initial value problem

$$\begin{aligned}
\Delta E:\quad & x_k = F(k, x_{k-1}, \ldots, x_{k-p}), \quad k = p, p + 1, \ldots \\
IC:\quad & x_0 = a_0, \ x_1 = a_1, \ \ldots, \ x_{p-1} = a_{p-1}
\end{aligned}$$

has a unique solution $x_k$ for each choice of the initial values $a_0, \ldots, a_{p-1}$.

Proof Substituting $k = p$ into the difference equation we find that $x_p$ is uniquely determined by the initial conditions. Once the values of $x_0$ up to $x_{k-1}$ are known, the value $x_k$ is uniquely determined by the equation. By the principle of mathematical induction, $x_k$ is uniquely determined for all $k$.
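The proof of Theorem 1 is constructive: each new term is computed from the $p$ terms before it. A minimal Python sketch of that construction follows; the function `solve_ivp` and the convention of passing the previous terms as a list are assumptions made for illustration, not notation from the text.

```python
def solve_ivp(F, initial, n_terms):
    """Return [x_0, ..., x_{n_terms-1}] for x_k = F(k, x_{k-1}, ..., x_{k-p}).

    F       : function F(k, prev) with prev = [x_{k-1}, x_{k-2}, ..., x_{k-p}]
    initial : the p initial values [a_0, ..., a_{p-1}]
    """
    p = len(initial)
    x = list(initial)
    for k in range(p, n_terms):
        prev = [x[k - j] for j in range(1, p + 1)]
        x.append(F(k, prev))
    return x[:n_terms]


# Second order: x_k = x_{k-1} + x_{k-2} with x_0 = x_1 = 1
# (the Fibonacci numbers, indexed from 0 here rather than from 1 as in the text)
fib = solve_ivp(lambda k, prev: prev[0] + prev[1], [1, 1], 12)

# First order: x_k = 2 x_{k-1} with x_0 = 3, the initial value problem (10)
doubling = solve_ivp(lambda k, prev: 2 * prev[0], [3], 6)

print(fib)
print(doubling)
```

Because every step is forced by the previous $p$ values, two runs with the same initial values can never diverge, which is the uniqueness statement of the theorem.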

Finally we mention that a difference equation can be written in many different but equivalent forms. For example, consider

$$x_k = 2x_{k-1} + 2, \quad k = 1, 2, 3, \ldots \tag{11}$$

By replacing $k$ everywhere it appears with $k + 1$, we obtain

$$x_{k+1} = 2x_k + 2, \quad k + 1 = 1, 2, 3, \ldots$$

or

$$x_{k+1} = 2x_k + 2, \quad k = 0, 1, 2, \ldots \tag{12}$$

Equations (11) and (12) are just two different ways of saying the same thing; they differ only in notation and not in content. The three equations below are also equivalent.

$$x_k = x_{k-1} + x_{k-2}, \quad k = 2, 3, 4, \ldots \tag{13}$$

$$x_{k+1} = x_k + x_{k-1}, \quad k = 1, 2, 3, \ldots \tag{14}$$

$$x_{k+2} = x_{k+1} + x_k, \quad k = 0, 1, 2, \ldots \tag{15}$$

Equation (14) is obtained from (13) by replacing $k$ by $k + 1$, and (15) is obtained from (14) by again replacing $k$ by $k + 1$. Note that it is necessary to change the range of $k$ for which each equation holds.

Exercises 1.2

1. Determine whether or not the given sequence is a solution of the indicated difference equation:a. xk = 5(2)k, xk = 2xk1, k = 1, 2, . . . b. xk = 3 2k + 1, xk+1 = 2xk + 1, k 0c. xk = 2kc + 3k 2k, xk+1 = 2xk + 3k, k = 0, 1, . . .d. xk = sin(k/2), xk+2 + 3xk+1 + xk = 0, k = 0, 1, . . .


2. Find the values of the constant $A$, if any, such that the constant sequence $x_k \equiv A$ is a solution of the difference equations:

a. xk = 3xk1 1, k = 1, 2, . . . b. xk+1 = xk, k = 0, 1, 2, . . .c. xk+1 = 2xk + k, k = 0, 1, 2, . . .

3. Find the values of the constant , if any, so that xk = k is a solution ofa. xk+1 = 3xk, k = 0, 1, 2, . . . b. xk+1 = kxk, k = 0, 1, 2, . . .c. xk+1 = 5xk + 6xk1, k = 1, 2, . . . d. xk+2 + 6xk+1 + 5xk = 0, k = 0, 1, 2, . . .e. xk+1 = 3xk + 5k, k = 0, 1, 2, . . . f. xk+2 xk = 1, k = 0, 1, 2, . . .

4. Rewrite each of the following difference equations in terms of $x_k$, $x_{k+1}$, and $x_{k+2}$. Also indicate the appropriate range of values of $k$.

a. xk = 2xk1, k = 2, 3, . . . b. xk+1 2xk + 3xk1 = 2k3, k = 1, 2, . . .

1.3 Compound Interest

An understanding of the cumulative growth of money under compound interest is a necessity for financial survival in today's world. Fortunately, compound interest is not difficult to understand and it provides a simple but useful example of difference equations. Here is the central problem.

Suppose an initial deposit of $A$ dollars is placed in a bank which pays interest at the rate of $r_p$ per conversion period. How much money is accumulated after $n$ periods?

The conversion period is commonly a year, a quarter, a month or a day. Interest rates are usually given as nominal annual rates. A monthly interest rate of one-half of one percent, or $r_p = .005$, is equivalent to a nominal annual rate of $.005 \times 12 = .06$, or 6%. In general, if $r$ is the nominal annual rate and there are $m$ conversion periods in a year, then $r_p = r/m$.

Let $x_n$ denote the amount in the bank at the end of the $n$th period. The interest earned during the $(n+1)$st period is $r_p x_n$, thus the amount of money accumulated at the end of the $(n+1)$st period is given by

$$x_{n+1} = x_n + r_p x_n = (1 + r_p)x_n. \tag{1}$$

Since $x_0 = A$, the amount accumulated must satisfy the initial value problem

$$\begin{aligned}
\Delta E:\quad & x_{n+1} = (1 + r_p)x_n, \quad n = 0, 1, 2, \ldots \\
IC:\quad & x_0 = A
\end{aligned} \tag{2}$$

There is only one sequence which is a solution of this problem. It is rather easy to find this solution. Successively substituting $n = 0, 1, 2$ into the difference equation we find

$$\begin{aligned}
x_1 &= (1 + r_p)x_0 = (1 + r_p)A, \\
x_2 &= (1 + r_p)x_1 = (1 + r_p)^2 A, \\
x_3 &= (1 + r_p)x_2 = (1 + r_p)^3 A.
\end{aligned}$$

The obvious guess for $x_n$ is

$$x_n = (1 + r_p)^n A, \quad n = 0, 1, 2, \ldots \tag{3}$$

It is easily verified that this is indeed the solution, for

$$x_{n+1} - (1 + r_p)x_n = (1 + r_p)^{n+1}A - (1 + r_p)^{n+1}A = 0.$$

Equation (3) is the fundamental formula for compound interest calculations. Tables of $(1 + r_p)^n$ for various values of $r_p$ and $n$ are available. However, it is very easy to perform the necessary calculations on a modern scientific calculator.


Example 1. If \$1000 is invested in a bank that pays 7.5% interest compounded quarterly, how much money will be accumulated after 5 years?

Solution We have $r_p = .075/4$, $A = 1000$ and, since there are 4 conversion periods in each year, $n = 5 \times 4 = 20$. Thus, using equation (3) yields

$$x_{20} = (1 + .075/4)^{20}(1000) = \$1449.95$$

to the nearest cent.
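The calculation in Example 1 is a direct evaluation of formula (3). As a hedged illustration (the function name `compound_amount` and its parameter names are assumptions, not part of the text), it can be reproduced in Python as follows.

```python
def compound_amount(principal, annual_rate, periods_per_year, years):
    """Amount accumulated under formula (3): x_n = (1 + r_p)^n * A."""
    r_p = annual_rate / periods_per_year  # interest rate per conversion period
    n = periods_per_year * years          # total number of conversion periods
    return (1 + r_p) ** n * principal


amount = compound_amount(1000, 0.075, 4, 5)
print(f"{amount:.2f}")  # compare with the value reported in Example 1
```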

It is instructive to compare compound interest with simple interest. In simple interest the amount of interest earned in each period is based on the initial amount deposited. Thus for simple interest we have the difference equation

$$x_{n+1} = x_n + rA, \quad n = 0, 1, 2, \ldots \tag{4}$$

with $x_0 = A$. The solution of this difference equation is

$$x_n = A + rnA, \quad n = 0, 1, 2, \ldots \tag{5}$$

Therefore for simple interest the difference $x_{n+1} - x_n$ is constant, while for compound interest the ratio $x_{n+1}/x_n$ is constant; simple interest yields an arithmetic sequence and compound interest yields a geometric sequence. The graphs of these sequences are shown in Figure 1.

[Figure 1: graphs of $x_n$ against $n$ for simple interest, $x_n = A + nrA$, and for compound interest, $x_n = (1 + r_p)^n A$.]

Modern banks offer a variety of annual interest rates and frequencies of compounding. Without some analysis it is difficult to tell, for instance, whether 6% compounded daily is a better deal than 6.25% compounded annually. We shall analyze the general situation using equation (3). Suppose the annual nominal interest rate is $r$ and that there are $m$ compounding periods in a year. The interest per period is $r_p = r/m$, and the number of periods in $k$ years is $n = km$. If we let $y_k$ denote the amount accumulated after $k$ years we have

$$y_k = (1 + r/m)^{km} y_0, \quad k = 0, 1, 2, \ldots \tag{6}$$

where $y_0 = x_0$ is the initial amount deposited. To compare various interest policies, it is convenient to define the effective annual interest rate, $r_E$, to be

$$r_E = (1 + r/m)^m - 1, \tag{7}$$

that is, $r_E$ represents the increase in a one dollar investment in one year. Equation (6) can now be written as

$$y_k = (1 + r_E)^k y_0, \quad k = 0, 1, 2, \ldots. \tag{8}$$


Table 1

Compounding    Interest rate per      Amount after               Effective
frequency      conversion period      one year                   annual rate
m              r_p = r/m              (1 + r/m)^m                r_E

1              .06/1 = .06            (1.06)^1 = 1.06            .06
4              .06/4 = .015           (1.015)^4 = 1.0614         .0614
12             .06/12 = .005          (1.005)^12 = 1.0617        .0617
365            .06/365 = .000164      (1.000164)^365 = 1.0618    .0618

Table 1 shows the effective annual interest rate for various compounding frequencies, assuming an annual interest rate of $r = .06$.

A glance at this table shows that the effective annual rate does not show a great increase as the number of compounding periods increases. We see that the effective annual rate under daily compounding is .0618, while 6.25% compounded annually has an effective annual rate of .0625. Thus you would be better off with 6.25% compounded annually than with 6% compounded daily.
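The entries of Table 1 follow directly from equation (7). The short Python sketch below is an illustration (the function name `effective_rate` is an assumption); it recomputes the effective annual rates for $r = .06$ and repeats the comparison between 6% compounded daily and 6.25% compounded annually.

```python
def effective_rate(r, m):
    """Effective annual rate r_E = (1 + r/m)^m - 1 from equation (7)."""
    return (1 + r / m) ** m - 1


for m in (1, 4, 12, 365):
    print(m, round(effective_rate(0.06, m), 4))

# Compare 6% compounded daily with 6.25% compounded annually.
daily_6 = effective_rate(0.06, 365)
annual_625 = effective_rate(0.0625, 1)
print(annual_625 > daily_6)
```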

Example 2. A bank offers compound interest and advertises that it will triple your money in 15 years. (a) What is the effective annual interest rate? (b) What is the nominal annual rate if interest is compounded monthly? (c) What is the nominal annual rate if interest is compounded daily?

Solution (a) Putting $k = 15$ and $y_{15} = 3y_0$ into equation (8) we find that

$$3y_0 = (1 + r_E)^{15} y_0, \quad \text{thus } (1 + r_E)^{15} = 3 \quad \text{or } r_E = 3^{1/15} - 1 = .07599.$$

(b) Since $m = 12$ for monthly compounding we have

$$1 + r_E = (1 + r/12)^{12} = 3^{1/15}, \quad \text{or } r = 12\left(3^{1/((12)(15))} - 1\right) = .07346.$$

(c) For daily compounding

$$r = 365\left(3^{1/((15)(365))} - 1\right) = .07325.$$
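The three rates in Example 2 come from solving $(1 + r/m)^{15m} = 3$ for $r$. The following Python sketch is an illustration under that reading of the example; both function names are hypothetical helpers.

```python
def effective_rate_from_growth(factor, years):
    """Solve (1 + r_E)^years = factor for r_E, as in part (a)."""
    return factor ** (1 / years) - 1


def nominal_rate_from_growth(factor, years, m):
    """Solve (1 + r/m)^(m*years) = factor for the nominal annual rate r."""
    return m * (factor ** (1 / (m * years)) - 1)


r_e = effective_rate_from_growth(3, 15)
r_monthly = nominal_rate_from_growth(3, 15, 12)
r_daily = nominal_rate_from_growth(3, 15, 365)
print(f"{r_e:.5f} {r_monthly:.5f} {r_daily:.5f}")  # compare with parts (a), (b), (c)
```

Note that the nominal rate decreases as the compounding frequency increases, since more frequent compounding needs a smaller nominal rate to reach the same growth.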

Finally we discuss continuous compound interest, which is not often offered by banks, probably because of the difficulty of explaining it to the general public; also there is little difference in return over daily compounding for usual interest rates. For continuous compounding we let $m$, the number of compounding periods in a year, approach infinity in equation (6):

$$y_k = \lim_{m \to \infty} (1 + r/m)^{mk} y_0 = \left\{ \lim_{m \to \infty} (1 + r/m)^m \right\}^k y_0. \tag{9}$$

In order to evaluate this limit we recall the definition of $e$, the base of natural logarithms:

$$e = \lim_{h \to \infty} (1 + 1/h)^h = 2.7182818\ldots. \tag{10}$$

Now let $r/m = 1/h$, or $h = m/r$, in equation (9). This yields

$$\lim_{m \to \infty} (1 + r/m)^m = \lim_{h \to \infty} (1 + 1/h)^{hr} = \lim_{h \to \infty} \left\{ (1 + 1/h)^h \right\}^r = e^r.$$

Using the above fact in equation (9) gives us

$$y_k = e^{rk} y_0. \tag{11}$$


This is the amount accumulated at the end of $k$ years with interest compounded continuously. The effective interest rate for continuous compounding, that is, the increase in a one dollar investment in one year, is

$$r_E = e^r - 1. \tag{12}$$

For $r = .06$, we get $r_E = .0618$, which, to four decimal places, is the same as the effective interest rate for daily compounding shown in Table 1.
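The limit behind equation (12) can be observed numerically. The sketch below is an illustration, with hypothetical function names; it evaluates $(1 + r/m)^m - 1$ for increasing $m$ and compares the result with $e^r - 1$ for $r = .06$.

```python
import math


def effective_rate(r, m):
    """Effective annual rate with m compounding periods per year."""
    return (1 + r / m) ** m - 1


def effective_rate_continuous(r):
    """Effective annual rate under continuous compounding, equation (12)."""
    return math.exp(r) - 1


r = 0.06
for m in (1, 12, 365, 100_000):
    print(m, effective_rate(r, m))
print("continuous", effective_rate_continuous(r))
```

The printed values increase with $m$ and approach the continuous rate from below, which is why daily and continuous compounding agree to four decimal places at this interest rate.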

In formula (11) there is no reason to restrict $k$ to integer values. We replace $k$ by $t$, allowing $t$ to be any nonnegative real number, and replace $y_k$ by $y(t)$ to obtain

$$y(t) = e^{rt} y(0) \tag{13}$$

as the amount accumulated after $t$ years. Differentiating (13) we find that $\dfrac{dy}{dt} = r e^{rt} y(0)$, so that $y(t)$ satisfies the differential equation

$$\frac{dy}{dt} = r y(t). \tag{14}$$

This is one of the differential equations we will study later.

Exercises 1.3

1. If \$1000 is invested at 8% compounded quarterly, (a) what is the accumulated amount after 10 years? (b) What is the effective annual interest rate? (c) How many conversion periods are needed to double the original amount?

2. If money, at compound interest, will double in ten years, answer the following questions doing all calculations mentally: (a) by what factor will the money increase in 30 years? (b) In 5 years? (c) How many years will it take for the money to quadruple?

3. If money, at compound interest, doubles in 10 years, by what factor will it increase in 13 years? How long will it take for the money to triple?

4. If \$500 is invested at 9% compounded continuously, (a) how much is accumulated after 5.2 years? (b) What is the effective annual interest rate? (c) How long will it take for your money to double?

5. The Taylor series for $e^r$ about $r = 0$ is

$$e^r = 1 + r + r^2/2! + r^3/3! + \ldots.$$

For small values of $r$, $e^r \approx 1 + r + r^2/2$ is a good approximation. Use this to determine whether 5% compounded continuously is better than 5.15% compounded annually.

6. a. If the nominal annual interest rate is $r$ and interest is compounded $m$ times a year, show that the doubling time in years is

$$k = \frac{\ln 2}{m \ln(1 + r/m)}.$$

b. If interest is compounded continuously, show that the doubling time in years is

$$t = \frac{\ln 2}{r}.$$

c. If $r$ is small, show that the result in (a) is approximately $(\ln 2)/r$. (Since $\ln 2$ is about .7, a rough formula for the doubling time is $.7/r$; for example, 7 years at 10% or 12 years at 6%.)


7. Show that the amount accumulated after $k$ years using continuous compounding is

$$y_k = (1 + r_E)^k y_0,$$

where $r_E$ is given by (12). (Thus formula (8) can be used for any frequency of compounding.)

1.4 Amortization of a Mortgage

To amortize a debt is to pay it off in periodic payments, often equal in size. Monthly payments on a house mortgage or an automobile loan are familiar examples. The problem considered in this section is:

Suppose $A$ dollars is borrowed from a bank which charges interest at the rate of $r_p$ per period on the unpaid balance. The debt is to be paid off in equal payments of $b$ dollars at the end of each period. If the debt is to be completely paid off in $N$ periods, what is the size of each payment?

Let $x_n$ be the amount owed to the bank at the end of the $n$th period, that is, just after the $n$th payment. Since the amount owed after the $(n+1)$st period is equal to the amount owed after the $n$th payment plus the interest charged during the $(n+1)$st period minus the payment made at the end of the $(n+1)$st period, we can write the following difference equation:

$$x_{n+1} = x_n + r_p x_n - b = (1 + r_p)x_n - b, \quad n = 0, 1, 2, \ldots \tag{1}$$

If we let $a = 1 + r_p$, the amount $x_n$ at the end of the $n$th period satisfies

$$\begin{aligned}
\Delta E:\quad & x_{n+1} = a x_n - b, \quad n = 0, 1, 2, \ldots \\
IC:\quad & x_0 = A.
\end{aligned} \tag{2}$$

We shall solve this problem for $x_n$ and then choose $b$ so that $x_N = 0$. By successively substituting $n = 0, 1, 2$ into the difference equation we find

$$\begin{aligned}
x_1 &= a x_0 - b = aA - b, \\
x_2 &= a x_1 - b = a(a x_0 - b) - b = a^2 A - (1 + a)b, \\
x_3 &= a x_2 - b = a^3 A - (1 + a + a^2)b.
\end{aligned}$$

It is easy to guess that the solution must be

$$x_n = a^n A - (1 + a + a^2 + \ldots + a^{n-1})b. \tag{3}$$

Recall the formula for the sum of a geometric series:

$$1 + a + a^2 + \ldots + a^{n-1} = \frac{1 - a^n}{1 - a}, \quad a \ne 1. \tag{4}$$

Since $a = 1 + r_p$ and $r_p > 0$, we have $a > 1$ so that (4) holds. Thus (3) becomes

$$x_n = a^n A - \frac{1 - a^n}{1 - a}\, b. \tag{5}$$

By direct substitution it can be verified that the difference equation and the initial condition are satisfied. To find the payment $b$, we set $x_N = 0$ and solve for $b$:

$$0 = a^N A - \frac{1 - a^N}{1 - a}\, b \quad \text{or} \quad b = \frac{(a - 1)A}{1 - a^{-N}}.$$


Since $a = 1 + r_p$, we obtain the following result:

If $A$ dollars is paid off in $N$ periods with equal payments of $b$ dollars per period with an interest rate of $r_p$ per period, then

$$b = \frac{r_p A}{1 - (1 + r_p)^{-N}}. \tag{6}$$

Example 1. A \$50,000 mortgage is to be paid off in 30 years in equal monthly payments with an interest rate of 10%. What are the monthly payments?

Solution We have $A = 50000$, $r_p = .10/12$, and $N = 30 \times 12 = 360$, thus

$$b = \frac{(.10/12)(50000)}{1 - (1 + .10/12)^{-360}} = \$438.79.$$
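Formula (6) and the underlying difference equation can be checked against each other. The Python sketch below is an illustration with hypothetical function names: it computes the payment for Example 1 and then iterates $x_{n+1} = (1 + r_p)x_n - b$ to confirm that the balance after $N$ payments is essentially zero.

```python
def payment(A, r_p, N):
    """Payment per period from formula (6): b = r_p*A / (1 - (1 + r_p)^(-N))."""
    return r_p * A / (1 - (1 + r_p) ** (-N))


def balance_after(A, r_p, b, n):
    """Iterate x_{n+1} = (1 + r_p)*x_n - b starting from x_0 = A."""
    x = A
    for _ in range(n):
        x = (1 + r_p) * x - b
    return x


b = payment(50000, 0.10 / 12, 360)
print(f"monthly payment: {b:.2f}")  # compare with the value in Example 1
print(f"balance after 360 payments: {balance_after(50000, 0.10 / 12, b, 360):.6f}")
```

The final balance is not exactly zero in floating point arithmetic; any residue is rounding error rather than a defect in the formula.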

Exercises 1.4

1. Verify that (5) is the solution of (2).

2. What is the monthly payment necessary to pay off a debt of \$20,000 in twenty years with an interest charge of 11.5%? What is the total amount paid?

3. If the monthly payment is \$200 and the interest rate is 10%, how much money can be borrowed and be paid off in 20 years?

4. Find and check the solution of the following initial value problems.
a. $\Delta E$: $x_{n+1} = x_n + b$, $IC$: $x_0 = c$
b. $\Delta E$: $x_{n+1} = a x_n + b$, $a \ne 1$, $IC$: $x_0 = c$

5. Using the results of problem 4, find and check the solution to:
a. $\Delta E$: $2x_{n+1} = 2x_n - 1$, $IC$: $x_0 = 2$
b. $\Delta E$: $2x_{n+1} = 3x_n + 4$, $IC$: $x_0 = 0$

6. Suppose $A$ dollars is deposited in a bank that pays interest at the rate of $r_p$ per period and that at the end of each period additional deposits of $b$ dollars are made. Write a difference equation for the amount in the bank at the end of the $n$th period, just after the $n$th deposit, and show that the solution is

$$x_n = a^n A + \frac{1 - a^n}{1 - a}\, b, \quad \text{where } a = 1 + r_p.$$

7. In problem 6, how much would be accumulated after 10 years if an initial deposit of \$100 is made and 10 dollars is deposited each month? Assume that the bank pays 8% interest compounded monthly.

8. How much money would need to be deposited initially in a bank which pays 6% interest compounded quarterly in order to withdraw \$500 a month for 20 years? The entire amount is to be consumed at the end of 20 years.

9. How much would you need to deposit quarterly over a period of 30 years in order to accumulate the fund needed for the subsequent 20 year withdrawals described in problem 8?


1.5First Order Linear Difference Equations

A first order linear difference equation is one of the form

$$a_k x_{k+1} = b_k x_k + c_k, \quad k = 0, 1, 2, \ldots \tag{1}$$

where $a_k$, $b_k$ and $c_k$ are given sequences and $x_k$ is the unknown sequence. If $a_k$ is equal to zero for a certain value of $k$, then, for this value of $k$, $x_{k+1}$ cannot be determined from knowledge of $x_k$. To avoid this unpleasant situation, we assume that $a_k$ is never zero, and divide through by $a_k$ to obtain the normal form

$$x_{k+1} = p_k x_k + q_k, \quad k = 0, 1, 2, \ldots \tag{2}$$

where we have renamed the coefficients: $p_k = b_k/a_k$ and $q_k = c_k/a_k$.

We shall first consider the special case of (2) when the sequence $p_k$ is a constant sequence, $p_k = a$ for all $k$. Thus our problem is to solve

$$\text{E: } x_{k+1} = a x_k + q_k, \quad k = 0, 1, 2, \ldots \qquad \text{IC: } x_0 = c. \tag{3}$$

The procedure is the same as we have used previously, namely, write the first few terms, guess at the general term, and check. For $k = 0, 1, 2$ we find

$$\begin{aligned}
x_1 &= a x_0 + q_0 = ac + q_0,\\
x_2 &= a x_1 + q_1 = a(ac + q_0) + q_1 = a^2 c + a q_0 + q_1,\\
x_3 &= a x_2 + q_2 = a^3 c + a^2 q_0 + a q_1 + q_2.
\end{aligned}$$

The pattern is clear. For $x_k$ we expect

$$x_k = a^k c + a^{k-1} q_0 + a^{k-2} q_1 + \cdots + a q_{k-2} + q_{k-1}. \tag{4}$$

It can be easily checked that (4) satisfies the difference equation. Using summation notation, equation (4) may be written in either of the forms

$$x_k = a^k c + \sum_{i=0}^{k-1} q_i a^{k-1-i} = a^k c + \sum_{i=0}^{k-1} a^i q_{k-1-i}. \tag{5}$$

Since $x_0 = c$, the summations in (5) should be interpreted as 0 when $k = 0$.
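Formula (5) can be checked numerically against the recurrence itself. The sketch below is only an illustration: the function names and the sample data ($a = 2$, $q_k = 3^k$, $c = 7$) are assumptions chosen for the check, not part of the text.

```python
def iterate(a, q, c, k):
    """Return x_k for x_{j+1} = a*x_j + q(j), x_0 = c, by direct iteration."""
    x = c
    for j in range(k):
        x = a * x + q(j)
    return x


def closed_form(a, q, c, k):
    """Return x_k from the first form of (5): a^k c + sum_{i<k} q_i a^(k-1-i)."""
    # For k = 0 the sum is empty and evaluates to 0, as the text requires.
    return a**k * c + sum(q(i) * a ** (k - 1 - i) for i in range(k))


# Illustrative data (an assumption): a = 2, q_k = 3^k, c = 7.
a, c = 2, 7
q = lambda k: 3**k
values = [(iterate(a, q, c, k), closed_form(a, q, c, k)) for k in range(8)]
```

Each pair in `values` holds the iterated and the closed-form value of $x_k$; integer arithmetic keeps the comparison exact.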

Example 1.

$$\text{E: } x_{k+1} = 2x_k + 3^k, \quad k = 0, 1, 2, \ldots \qquad \text{IC: } x_0 = c.$$

Substituting $a = 2$ and $q_k = 3^k$ into the first formula in (5) we get

$$x_k = 2^k c + \sum_{i=0}^{k-1} 3^i\, 2^{k-1-i} = 2^k c + 2^{k-1} \sum_{i=0}^{k-1} (3/2)^i.$$

Summing the geometric series (see equation (4) of Section 1-4) gives the solution

$$x_k = 2^k c + 2^{k-1}\, \frac{1 - (3/2)^k}{1 - (3/2)} = 2^k c + 3^k - 2^k.$$

Example 2.

$$\text{E: } x_{k+1} = 2x_k + 2^k, \quad k = 0, 1, 2, \ldots \qquad \text{IC: } x_0 = 0.$$

Using formula (5) we obtain

$$x_k = \sum_{i=0}^{k-1} 2^i\, 2^{k-1-i} = \sum_{i=0}^{k-1} 2^{k-1} = k\, 2^{k-1}.$$

In the last step notice that each term in the sum is the same, namely $2^{k-1}$, and there are $k$ terms.

We now turn to the general linear equation

$$x_{k+1} = p_k x_k + q_k, \quad k \ge 0.$$

If $q_k = 0$ for all $k$, the equation is called homogeneous; otherwise it is called nonhomogeneous. Let us consider the solution of the initial value problem for the homogeneous equation

$$\text{E: } x_{k+1} = p_k x_k, \quad k \ge 0 \qquad \text{IC: } x_0 = c. \tag{6}$$

Proceeding by successive substitutions we find

$$\begin{aligned}
x_1 &= p_0 x_0 = p_0 c,\\
x_2 &= p_1 x_1 = (p_0 p_1) c,\\
&\ \ \vdots\\
x_k &= p_{k-1} x_{k-1} = (p_0 p_1 \cdots p_{k-1}) c, \quad k > 0.
\end{aligned} \tag{7}$$

It is convenient to introduce the product notation

$$\prod_{i=0}^{n} a_i = a_0 a_1 a_2 \cdots a_n.$$

Using this notation the solution (7) of equation (6) can be written

$$x_k = \left( \prod_{i=0}^{k-1} p_i \right) c, \quad k > 0. \tag{8}$$

Example 3.

$$\text{E: } x_{k+1} = (k + 1) x_k, \quad k \ge 0 \qquad \text{IC: } x_0 = 1.$$

Using equation (8) above the solution is

$$x_k = \prod_{i=0}^{k-1} (i + 1) \cdot 1,$$

or $x_k = (1 \cdot 2 \cdots k) = k!$.

Example 4.

$$\text{E: } y_{k+1} = k\, y_k, \quad k \ge 0 \qquad \text{IC: } y_0 = 1.$$

Since $y_1 = 0 \cdot y_0 = 0$, it follows that $y_2 = 1 \cdot y_1 = 0$ and, in general, that $y_k = 0$ for $k > 0$. The solution is therefore

$$y_k = \begin{cases} 1, & k = 0,\\ 0, & k > 0. \end{cases}$$

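Finally we consider the nonhomogeneous problem

$$\text{E: } x_{k+1} = p_k x_k + q_k, \quad k \ge 0 \qquad \text{IC: } x_0 = c.$$

We proceed by successive substitutions:

$$\begin{aligned}
x_1 &= p_0 x_0 + q_0,\\
x_2 &= p_1 x_1 + q_1 = (p_1 p_0) x_0 + p_1 q_0 + q_1,\\
x_3 &= p_2 x_2 + q_2 = (p_2 p_1 p_0) x_0 + p_2 p_1 q_0 + p_2 q_1 + q_2,\\
&\ \ \vdots\\
x_k &= p_{k-1} x_{k-1} + q_{k-1},\\
x_k &= x_0 \prod_{i=0}^{k-1} p_i + q_0 \prod_{i=1}^{k-1} p_i + q_1 \prod_{i=2}^{k-1} p_i + \cdots + q_{k-2}\, p_{k-1} + q_{k-1}.
\end{aligned}$$

The solution can be written

$$x_k = c \prod_{i=0}^{k-1} p_i + \sum_{j=0}^{k-2} q_j \prod_{i=j+1}^{k-1} p_i + q_{k-1}, \quad k \ge 1, \qquad x_0 = c. \tag{9}$$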

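Example 5.

$$\text{E: } x_{k+1} = (k + 1) x_k + (k + 1)!, \quad k \ge 0 \qquad \text{IC: } x_0 = c.$$

From equation (9) we have

$$\begin{aligned}
x_k &= c \prod_{i=0}^{k-1} (i + 1) + \sum_{j=0}^{k-2} (j + 1)! \prod_{i=j+1}^{k-1} (i + 1) + k!\\
&= c\, k! + \sum_{j=0}^{k-2} k! + k!\\
&= c\, k! + (k!)\, k\\
&= k!\,(c + k), \quad k \ge 0.
\end{aligned}$$

In the second line each term of the sum equals $k!$, because $(j+1)!\,(j+2)(j+3)\cdots k = k!$.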
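As a sanity check, formula (9) can be evaluated directly and compared with both the recurrence and the closed form $k!\,(c + k)$ of Example 5. This is a sketch only: the function names are hypothetical and the starting value $c = 3$ is an arbitrary choice.

```python
from math import factorial, prod


def general_solution(p, q, c, k):
    """Evaluate formula (9) for x_{j+1} = p(j)*x_j + q(j), x_0 = c, with k >= 1."""
    homogeneous = c * prod(p(i) for i in range(k))
    # prod over an empty range is 1, matching the term q_{k-2} * p_{k-1}.
    forced = sum(q(j) * prod(p(i) for i in range(j + 1, k)) for j in range(k - 1))
    return homogeneous + forced + q(k - 1)


def iterate(p, q, c, k):
    """Compute x_k by applying the recurrence k times."""
    x = c
    for j in range(k):
        x = p(j) * x + q(j)
    return x


# Data of Example 5; the starting value c = 3 is an arbitrary choice.
p = lambda k: k + 1
q = lambda k: factorial(k + 1)
c = 3
checks = [
    general_solution(p, q, c, k) == iterate(p, q, c, k) == factorial(k) * (c + k)
    for k in range(1, 9)
]
```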

Exercises 1.5

Solve the following initial value problems.

1. $x_{k+1} + 5x_k = 0$, $k \ge 0$; $x_0 = 5$

2. $x_{k+1} = 3x_k + 2^k$, $k \ge 0$; $x_0 = 2$

3. $x_{k+1} = 3x_k + 5 + k$, $k \ge 0$; $x_0 = 2$

4. $(k + 1)\, x_{k+1} = (k + 2)\, x_k$, $k \ge 0$; $x_0 = 1$

5. $y_{n+2} = n\, y_{n+1} + 1$, $n \ge 0$; $y_1 = 3$

1.6 The Method of Undetermined Coefficients

The following theorems give simple but important relationships between solutions of a nonhomogeneous linear difference equation and the corresponding homogeneous equation.

Theorem 1. If $u_k$ and $v_k$ are solutions of the nonhomogeneous equation

$$x_{k+1} = p_k x_k + q_k, \quad k \ge 0, \tag{1}$$

then $x_k = u_k - v_k$ is a solution of the corresponding homogeneous equation

$$x_{k+1} = p_k x_k. \tag{2}$$

Proof. Since $u_k$, $v_k$ are solutions of (1) we know that $u_{k+1} = p_k u_k + q_k$ and $v_{k+1} = p_k v_k + q_k$. Therefore

$$x_{k+1} = u_{k+1} - v_{k+1} = p_k u_k + q_k - (p_k v_k + q_k) = p_k (u_k - v_k) = p_k x_k,$$

so that $x_k$ satisfies the homogeneous equation (2).

Theorem 2. The general solution of the nonhomogeneous equation $x_{k+1} = p_k x_k + q_k$ can be written in the form

$$x_k = x_k^{(h)} + x_k^{(p)}, \tag{3}$$

where $x_k^{(h)}$ is the general solution of the homogeneous equation $x_{k+1} = p_k x_k$ and $x_k^{(p)}$ is any one (or particular) solution of the nonhomogeneous equation.

Proof. Let $x_k$ be any solution of the nonhomogeneous equation (1) and let $x_k^{(p)}$ be a known particular solution. Then by Theorem 1, $x_k - x_k^{(p)}$ is a solution of the homogeneous equation (2). Thus $x_k - x_k^{(p)} = x_k^{(h)}$, or $x_k = x_k^{(h)} + x_k^{(p)}$.

The main purpose of this section is to describe a simple method for solving the nonhomogeneous equation

$$x_{k+1} = a x_k + q_k, \quad k \ge 0, \tag{4}$$

when $q_k$ is a polynomial in $k$, or an exponential. According to Theorem 2 we need to find the general solution of the homogeneous equation

$$x_{k+1} = a x_k, \quad k \ge 0. \tag{5}$$

The general solution of equation (5) is simply

$$x_k^{(h)} = a^k c,$$

where $c$ is an arbitrary constant (see problem 1). Now it is only necessary to find any one solution of equation (4). It is easy to guess the form of a solution when $q_k$ is a polynomial in $k$ or an exponential. The technique is best illustrated through examples.

Example 1. Solve

$$\text{E: } x_{k+1} = 3x_k - 4, \quad k \ge 0 \qquad \text{IC: } x_0 = 5.$$

The general solution of the homogeneous equation, $x_{k+1} = 3x_k$, is $x_k^{(h)} = 3^k c$. We need to find a particular solution of the nonhomogeneous equation. Because $q_k = -4$ is a constant sequence, we guess that $x_k^{(p)} \equiv A$, where $A$ is a constant which must be determined to satisfy the equation. Since $x_k^{(p)} \equiv A$, also $x_{k+1}^{(p)} \equiv A$; thus substituting in the difference equation we find

$$A = 3A - 4, \quad \text{or } A = 2.$$

Thus $x_k^{(p)} = 2$ is a particular solution and the general solution is

$$x_k = 3^k c + 2.$$

All that remains is to determine $c$ so that the initial condition is satisfied. Putting $k = 0$ in the solution yields

$$x_0 = 5 = 3^0 c + 2 = c + 2,$$

or $c = 3$. The solution is therefore

$$x_k = 3^k \cdot 3 + 2 = 3^{k+1} + 2,$$

which can easily be checked.

Example 2.

$$\text{E: } x_{k+1} = 2x_k + 3^k, \quad k \ge 0 \qquad \text{IC: } x_0 = \alpha.$$

The solution of the homogeneous equation is $x_k^{(h)} = 2^k c$. A good guess for a particular solution is $x_k^{(p)} = A\, 3^k$. Substituting in the difference equation we get

$$A\, 3^{k+1} = 2A\, 3^k + 3^k.$$

Dividing by $3^k$ we find $3A = 2A + 1$, or $A = 1$. Thus $x_k^{(p)} = 3^k$ is a particular solution and

$$x_k = 2^k c + 3^k$$

is the general solution. To satisfy the initial condition set $k = 0$ and $x_0 = \alpha$ to find that $c = \alpha - 1$. Thus the final solution is

$$x_k = 2^k (\alpha - 1) + 3^k.$$

Example 3.

$$\text{E: } x_{k+1} = 2x_k + 2^k, \quad k \ge 0 \qquad \text{IC: } x_0 = 0.$$

The homogeneous solution is $x_k^{(h)} = 2^k c$. For a particular solution we might be tempted to try $x_k^{(p)} = A\, 2^k$. However, this cannot work, because any constant times $2^k$ is a solution of the homogeneous equation $x_{k+1} = 2x_k$ and therefore cannot also be a solution of the nonhomogeneous equation $x_{k+1} = 2x_k + 2^k$. We must therefore modify our first attempt. It is clear that $x_k$ must contain a factor of $2^k$; the simplest modification is to try $x_k^{(p)} = A k\, 2^k$. Substituting in the difference equation we find

$$A (k + 1)\, 2^{k+1} = 2 A k\, 2^k + 2^k.$$

It is easy to solve for $A$ to obtain $A = 1/2$. The particular solution is $x_k^{(p)} = k\, 2^k / 2 = k\, 2^{k-1}$ and the general solution is

$$x_k = 2^k c + k\, 2^{k-1}.$$

Using the initial condition $x_0 = 0$ we get $c = 0$ and the solution is

$$x_k = k\, 2^{k-1}.$$

On the basis of these examples we can formulate a rule which gives the correct form for a particular solution of

$$x_{k+1} = a x_k + q_k, \tag{6}$$

when $q_k$ is an exponential function.

Rule 1. If $q_k = c\, b^k$, a particular solution of equation (6) can be found in the form $x_k^{(p)} = A b^k$, unless $b^k$ is a solution of the homogeneous equation (i.e., unless $b = a$), in which case a particular solution of the form $x_k^{(p)} = A k\, b^k$ can be found.

A similar rule exists in case $q_k$ is a polynomial in $k$.

Rule 2. If $q_k = c_0 + c_1 k + \cdots + c_m k^m$, a particular solution of equation (6) can be found in the form $x_k^{(p)} = A_0 + A_1 k + \cdots + A_m k^m$, unless $a = 1$, in which case a particular solution exists of the form

$$x_k^{(p)} = (A_0 + A_1 k + \cdots + A_m k^m)\, k.$$

Example 4. Find the general solution of $2x_{k+1} = 3x_k + 2k$.

The homogeneous equation is $2x_{k+1} = 3x_k$, or $x_{k+1} = \tfrac{3}{2} x_k$, which has $x_k^{(h)} = \left(\tfrac{3}{2}\right)^k c$ for its general solution. For a particular solution we use (according to Rule 2)

$$x_k^{(p)} = A + Bk.$$

Substituting into the difference equation produces

$$2(A + B(k + 1)) = 3(A + Bk) + 2k,$$

or

$$(-A + 2B) + (-B - 2)k = 0.$$

Thus we must have $-B - 2 = 0$, or $B = -2$, and $-A + 2B = 0$, or $A = -4$. A particular solution is therefore

$$x_k^{(p)} = -4 - 2k,$$

and the general solution is

$$x_k = \left(\tfrac{3}{2}\right)^k c - 4 - 2k.$$

Example 5. Find the general solution of $x_{k+1} = x_k - 1 + k$.

The general solution of the homogeneous equation, $x_{k+1} = x_k$, is $x_k^{(h)} = c$. According to Rule 2 the form $x_k^{(p)} = A + Bk$ will not work. The correct form is $x_k^{(p)} = (A + Bk)k = Ak + Bk^2$. Substituting, we find

$$A(k + 1) + B(k + 1)^2 = Ak + Bk^2 - 1 + k.$$

Simplifying we obtain

$$(A + B + 1) + (2B - 1)k = 0.$$

Thus we must have $A + B + 1 = 0$ and $2B - 1 = 0$; this produces $B = 1/2$ and $A = -\tfrac{3}{2}$. A particular solution is $x_k^{(p)} = -\tfrac{3}{2} k + \tfrac{1}{2} k^2$ and the general solution is

$$x_k = c - \tfrac{3}{2} k + \tfrac{1}{2} k^2.$$
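The particular solutions found in Examples 4 and 5 can be verified by substituting them back into the recurrence. The sketch below uses exact rational arithmetic; the helper name `satisfies` and the number of terms checked are assumptions made for illustration.

```python
from fractions import Fraction


def satisfies(x, a, q, terms=20):
    """Check x(k+1) == a*x(k) + q(k) exactly for k = 0, ..., terms-1."""
    return all(x(k + 1) == a * x(k) + q(k) for k in range(terms))


# Example 4 in normal form: x_{k+1} = (3/2) x_k + k, with x_k^(p) = -4 - 2k.
ex4 = satisfies(lambda k: Fraction(-4 - 2 * k), Fraction(3, 2), lambda k: k)

# Example 5: x_{k+1} = x_k - 1 + k, with x_k^(p) = -(3/2)k + (1/2)k^2.
ex5 = satisfies(
    lambda k: Fraction(-3, 2) * k + Fraction(1, 2) * k**2,
    1,
    lambda k: k - 1,
)

# One instance of the rejected form A + Bk (here 1 + 2k) fails in Example 5:
# with a = 1 its differences are constant, while q_k = k - 1 is not.
linear_fails = not satisfies(lambda k: Fraction(1) + 2 * k, 1, lambda k: k - 1)
```

A finite check of this kind does not prove the formulas, but it catches sign errors in the coefficients quickly.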

Example 6. Find a formula for the sum

$$s_k = 1^2 + 2^2 + \cdots + k^2$$

of the squares of the first $k$ integers.

Solution. We first write the problem in the form of a difference equation. It is easy to see that $s_k$ satisfies

$$\text{E: } s_{k+1} = s_k + (k + 1)^2 = s_k + 1 + 2k + k^2 \qquad \text{IC: } s_1 = 1.$$

The homogeneous solution is $s_k^{(h)} = c$. To find a particular solution, we assume

$$s_k^{(p)} = Ak + Bk^2 + Ck^3.$$

Substituting into the E: we obtain

$$A(k + 1) + B(k + 1)^2 + C(k + 1)^3 = Ak + Bk^2 + Ck^3 + 1 + 2k + k^2.$$

Equating the constant terms, and the coefficients of $k$ and $k^2$, leads to the values

$$A = 1/6, \quad B = 1/2, \quad C = 1/3,$$

and hence to the general solution

$$s_k = c + k/6 + k^2/2 + k^3/3.$$

Requiring $s_1 = 1$ leads to $c = 0$, so that

$$s_k = \tfrac{1}{6} k + \tfrac{1}{2} k^2 + \tfrac{1}{3} k^3 = \tfrac{1}{6} k (k + 1)(2k + 1)$$

is the desired formula.
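The formula of Example 6 is easy to confirm by brute force. The following sketch compares it with a direct summation; the function name and the range of $k$ tested are arbitrary choices.

```python
def sum_of_squares(k):
    """Closed form k(k+1)(2k+1)/6 from Example 6 (always an integer)."""
    return k * (k + 1) * (2 * k + 1) // 6


direct = [sum(j * j for j in range(1, k + 1)) for k in range(1, 50)]
formula = [sum_of_squares(k) for k in range(1, 50)]
```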

Exercises 1.6

1. Show that the general solution of $x_{k+1} = a x_k$ is $x_k = a^k c$. That is, first show $x_k = a^k c$ is a solution, and then let $x_k$ be any solution and show it can be written in the form $x_k = a^k c$ for some $c$.

Solve each of the following initial value problems2. E: 2xk+1 = 5xk 6, k 0 3. E: 2xk+1 = 5xk + 3(3)k1, k 0

IC: x0 = 0 IC: x0 = 14. E: xk+1 = 2xk + 3 2k1, k 0 5. E: xk+1 = 2xk 3 2k, k 1

IC: x0 = 0 IC: x1 = 06. E: xk+1 = xk + 2k, k 1 7. E: xk+1 + 2xk = 3k, k 0

IC: x1 = 0 IC: x0 = 1

For each of the following write down the correct form for a particular solution. Do not evaluate the constants.

8. xk xk1 = k k2 9. xk+1 + 2xk = 3(2)k5

10. xk+1 2xk = 3(2)k2 11. 2xk+1 = xk + 3 21k

12. a. Find a formula for the sum $s_k = 1^3 + 2^3 + \cdots + k^3$.

b. Use the results of part (a) and of Example 6 to evaluate $\sum_{k=1}^{10} (k^3 - 6k^2 + 7)$.

13. Consider a community having $R$ residents. At time $n$, $x_n$ residents favor, and $R - x_n$ residents oppose, a local community issue. In each time period $100b\%$ of those who previously favored the issue, and $100a\%$ of those who previously opposed the issue, change their position.

a. Write the difference equation describing this process. Then solve this equation, subject to the IC: $x_0 = A$.

b. Describe the limiting behavior of $x_n$. That is, tell what happens as $n$ goes to infinity.

1.7 Complex Numbers

Definition and Arithmetic of Complex Numbers.

In many problems in engineering and science, it is necessary to use complex numbers. In this section we develop the background necessary to deal with complex numbers.

Complex numbers were invented so that all quadratic equations would have roots. The equation $x^2 = -1$ has no real roots, since, if $x$ is real, $x^2$ is non-negative. We therefore introduce a new number, symbolized by the letter $i$, having the property that

$$i^2 = -1, \quad \text{or } i = \sqrt{-1}. \tag{1}$$

The two roots of $x^2 = -1$ are now $x = \pm i$.

Let us now consider a general quadratic equation

$$ax^2 + bx + c = 0, \quad a, b, c \text{ real and } a \neq 0. \tag{2}$$

The quadratic formula yields the roots

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \tag{3}$$

If $b^2 - 4ac \ge 0$ the roots are real. However, if $b^2 - 4ac < 0$, (3) involves the square root of a negative number. In this case we write $\sqrt{b^2 - 4ac} = \sqrt{(-1)(4ac - b^2)} = \sqrt{-1}\,\sqrt{4ac - b^2} = i\sqrt{4ac - b^2}$. Therefore (3) becomes

$$x = \frac{-b \pm i\sqrt{4ac - b^2}}{2a}. \tag{4}$$

Are these numbers really roots of equation (2)? Let us find out by substituting (4) into (2), where we shall use the ordinary rules of algebra with the one additional rule that $i^2 = -1$:

$$\begin{aligned}
&a \left( \frac{-b \pm i\sqrt{4ac - b^2}}{2a} \right)^2 + b \left( \frac{-b \pm i\sqrt{4ac - b^2}}{2a} \right) + c\\
&\qquad = \frac{b^2 \mp 2ib\sqrt{4ac - b^2} - (4ac - b^2)}{4a} + \frac{-b^2 \pm ib\sqrt{4ac - b^2}}{2a} + c = 0.
\end{aligned}$$

The equation is satisfied! We conclude that if we treat numbers of the form $x + iy$ with $x$ and $y$ real using the ordinary rules of algebra supplemented by the one additional rule that $i^2 = -1$, we are able to solve all quadratic equations.
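The same verification can be carried out numerically. The sketch below is an illustration only: the function name and the coefficients $a = 1$, $b = 2$, $c = 5$ (for which $b^2 - 4ac < 0$) are assumptions, and Python's standard `cmath` module supplies the complex square root.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a x^2 + b x + c = 0 via the quadratic formula."""
    # cmath.sqrt returns a complex result, so a negative discriminant is fine.
    root_of_disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root_of_disc) / (2 * a), (-b - root_of_disc) / (2 * a)


# Hypothetical coefficients with b^2 - 4ac = 4 - 20 < 0.
a, b, c = 1, 2, 5
roots = quadratic_roots(a, b, c)
residuals = [abs(a * x * x + b * x + c) for x in roots]
```

Here the two roots are $-1 \pm 2i$, and both residuals vanish up to rounding error.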

With this background, we define a complex number to be a number of the form

$$z = a + ib, \quad a, b \text{ real numbers}. \tag{5}$$

The number $a$ is called the real part of $z$ and we write $a = \operatorname{Re} z$, and the number $b$ is called the imaginary part of $z$ and we write $b = \operatorname{Im} z$; note that the imaginary part of $z$ is a real number, the coefficient of $i$ in (5). Numbers of the form $a + i0$ will be identified with the real numbers and we use the abbreviation $a + i0 = a$. Numbers of the form $0 + ib$ are called pure imaginary numbers and we use the abbreviation $0 + ib = ib$. The number $0 + i0$ is the complex number zero, which we simply denote by 0.

Addition, subtraction and multiplication of complex numbers are defined below.

$$\begin{aligned}
(a + ib) + (c + id) &= (a + c) + i(b + d), && (6)\\
(a + ib) - (c + id) &= (a - c) + i(b - d), && (7)\\
(a + ib)(c + id) &= (ac - bd) + i(bc + ad). && (8)
\end{aligned}$$

Note that these definitions are easy to remember since they are the usual rules of algebra with the additional rule that $i^2 = -1$.

Before considering division of complex numbers, it is convenient to define the conjugate of a complex number $z = a + ib$, denoted by $\bar{z}$, to be $\bar{z} = a - ib$. We find that

$$z\bar{z} = (a + ib)(a - ib) = (a^2 + b^2) + i0 = a^2 + b^2, \tag{6}$$

so that the product of a complex number and its conjugate is a real number. Now, to divide two complex numbers, we multiply numerator and denominator by the conjugate of the denominator so that the new denominator is real:

$$\frac{a + ib}{c + id} = \frac{a + ib}{c + id} \cdot \frac{c - id}{c - id} = \frac{(ac + bd) + i(bc - ad)}{c^2 + d^2},$$

provided $c^2 + d^2 \neq 0$. However $c^2 + d^2 = 0$ implies that $c = d = 0$, making the denominator equal to zero; thus division of complex numbers (just as for real numbers) is defined except when the denominator is zero.

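Example 1. $(2 + 3i)(1 - 2i) = 8 - i$.

Example 2.

$$\frac{1 - 2i}{3 + 4i} = \frac{1 - 2i}{3 + 4i} \cdot \frac{3 - 4i}{3 - 4i} = \frac{-5 - 10i}{25} = -\frac{1}{5} - \frac{2}{5} i.$$

Example 3. $i^4 = 1$, $\ i^3 = -i$, $\ i^{-1} = \dfrac{1}{i} = \dfrac{i}{i^2} = -i$, $\ i^{-2} = -1$, $\ i^{-3} = i$.

Example 4. $i^{75} = i^{(4 \cdot 18 + 3)} = (i^4)^{18}\, i^3 = -i$.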
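Examples 1 through 4 can be reproduced with Python's built-in `complex` type, which writes the imaginary unit as `j`. This is offered only as a quick check of the arithmetic; the variable names are arbitrary.

```python
# Python's built-in complex type writes i as j.
product = (2 + 3j) * (1 - 2j)         # Example 1
quotient = (1 - 2j) / (3 + 4j)        # Example 2
powers = [1j**n for n in (4, 3, 75)]  # Examples 3 and 4 (nonnegative powers)
conj_product = (3 + 4j) * (3 + 4j).conjugate()  # z times its conjugate is real
```

Floating-point results such as `quotient` agree with the exact values $-\tfrac{1}{5} - \tfrac{2}{5} i$ only up to rounding error.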

Addition, subtraction, multiplication, and division (except by zero) of complex numbers yield uniquely defined complex numbers. It can be verified that the same associative, commutative and distributive laws that hold for real numbers also hold for complex numbers. (In the language of modern algebra, the set of all complex numbers forms a field.)

Complex numbers were invented to allow us to solve quadratic equations. It might be expected that in order to solve cubic, quartic or higher degree equations it would be necessary to introduce more new numbers like $i$. However, this is not the case. Gauss proved what is called the Fundamental Theorem of Algebra: Every polynomial equation of degree at least one with complex coefficients has a complex root.

It is important to note that if two complex numbers are equal, $a + ib = c + id$, then their real parts must be equal, $a = c$, and their imaginary parts must be equal, $b = d$. In other words, one equation between complex numbers is equivalent to two equations between real numbers.

Example 5. Find all possible square roots of $3 + 4i$; that is, find all complex numbers $z$ such that $z^2 = 3 + 4i$.

Let $z = x + iy$; then we must have $(x + iy)^2 = 3 + 4i$, or

$$(x^2 - y^2) + i(2xy) = 3 + 4i.$$

Setting real and imaginary parts of both sides equal we find that $x^2 - y^2 = 3$ and $2xy = 4$. Thus $y = 2/x$ and $x^2 - 4/x^2 = 3$, or $x^4 - 3x^2 - 4 = (x^2 - 4)(x^2 + 1) = 0$. We must have $x^2 = 4$ or $x^2 = -1$. Since $x$ must be real, we find $x = \pm 2$. If $x = 2$, $y = 1$, while if $x = -2$, $y = -1$. Thus there are two square roots, $z_1 = 2 + i$ and $z_2 = -2 - i$. It is easily verified that $z_1^2 = z_2^2 = 3 + 4i$.

Geometric Interpretation - Polar Trigonometric Form

A complex number $z = x + iy$ is completely determined by knowing the ordered pair of real numbers $(x, y)$. Therefore we may interpret a complex number as a point in the $xy$ plane, or equally well, as a vector from the origin to the point $(x, y)$, as shown in Figure 1. If complex numbers are interpreted as vectors then addition and subtraction of complex numbers follow the usual parallelogram law for vectors, as shown in Figure 2.

[Figure 1: the point $(x, y)$ in the plane, with polar coordinates $r$ and $\theta$. Figure 2: the vectors $z_1$, $z_2$, $z_1 + z_2$ and $z_1 - z_2$.]

In order to see what happens geometrically when we multiply or divide complex numbers, it is convenient to use polar coordinates $(r, \theta)$ as shown in Figure 1. We have

$$z = x + iy = r(\cos\theta + i\sin\theta), \tag{7}$$

where $r$ is the distance from the origin to the point $(x, y)$ and $\theta$ is the angle between the positive $x$-axis and the vector $z$. We call $z = x + iy$ the rectangular form of $z$ and $z = r(\cos\theta + i\sin\theta)$ the polar trigonometric form of $z$. The distance $r$ is called the absolute value or modulus of the complex number $z$, also denoted by $|z|$, and $\theta$ is called the angle or argument of $z$. From Figure 1 we derive the relations

$$r = |z| = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x}. \tag{8}$$

The polar form allows a nice geometrical interpretation of the product of complex numbers. Let $z = r(\cos\theta + i\sin\theta)$ and $z' = r'(\cos\theta' + i\sin\theta')$ be two complex numbers. Then

$$\begin{aligned}
zz' &= r(\cos\theta + i\sin\theta)\, r'(\cos\theta' + i\sin\theta')\\
&= rr' \{ (\cos\theta\cos\theta' - \sin\theta\sin\theta') + i(\cos\theta\sin\theta' + \sin\theta\cos\theta') \}\\
&= rr' (\cos(\theta + \theta') + i\sin(\theta + \theta')).
\end{aligned} \tag{9}$$

Therefore to multiply two complex numbers we multiply their absolute values and add their angles. In particular, since $i = 1(\cos \pi/2 + i\sin \pi/2)$, multiplication of a complex number $z$ by the number $i$ rotates $z$ by $90^\circ$ in the positive (counterclockwise) direction.

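Now dividing $z$ by $z'$ we find

$$\begin{aligned}
\frac{z}{z'} &= \frac{r}{r'}\, \frac{\cos\theta + i\sin\theta}{\cos\theta' + i\sin\theta'}
= \frac{r}{r'}\, \frac{\cos\theta + i\sin\theta}{\cos\theta' + i\sin\theta'} \cdot \frac{\cos\theta' - i\sin\theta'}{\cos\theta' - i\sin\theta'}\\
&= \frac{r}{r'}\, \frac{(\cos\theta\cos\theta' + \sin\theta\sin\theta') + i(\sin\theta\cos\theta' - \cos\theta\sin\theta')}{\cos^2\theta' + \sin^2\theta'}\\
&= \frac{r}{r'} (\cos(\theta - \theta') + i\sin(\theta - \theta')).
\end{aligned} \tag{10}$$

Thus to divide one complex number by another we divide the absolute values and subtract the angles. In particular, dividing a complex number $z$ by $i$ rotates $z$ by $90^\circ$ in the negative sense.

Formula (9) may be used to find powers of complex numbers. If $z = r(\cos\theta + i\sin\theta)$, then $z^2 = r^2(\cos 2\theta + i\sin 2\theta)$, $z^3 = r^3(\cos 3\theta + i\sin 3\theta)$. By induction we may prove, for $n$ a nonnegative integer:

$$z^n = (r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta). \tag{11}$$

For negative powers we define, for $z \neq 0$, $z^{-n} = 1/z^n$, where $n$ is a positive integer. Using (10) and (11) we find

$$z^{-n} = \frac{1}{z^n} = \frac{1}{r^n(\cos n\theta + i\sin n\theta)} = r^{-n}(\cos(-n\theta) + i\sin(-n\theta)) = r^{-n}(\cos n\theta - i\sin n\theta). \tag{12}$$

Putting $r = 1$ in (11) and (12) we have the famous formula of De Moivre:

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \tag{13}$$

which holds for all integers $n$.

Example 6. To find $(1 + i)^4$, we first write $1 + i$ in polar form: $1 + i = \sqrt{2}(\cos \pi/4 + i\sin \pi/4)$.

[Figure: the point $1 + i$ in the plane, at distance $\sqrt{2}$ from the origin and angle $\pi/4$.]

Thus

$$\begin{aligned}
(1 + i)^4 &= \left( \sqrt{2}\,(\cos \pi/4 + i\sin \pi/4) \right)^4\\
&= (\sqrt{2})^4 (\cos\pi + i\sin\pi)\\
&= 4(-1 + i0) = -4.
\end{aligned}$$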
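The polar rule for powers translates directly into code. The sketch below uses `cmath.polar` and `cmath.rect` from Python's standard library to convert between rectangular and polar form; the function name `polar_power` is a hypothetical helper, not something defined in the text.

```python
import cmath
import math


def polar_power(z, n):
    """Raise z to the integer power n via r^n (cos n*theta + i sin n*theta)."""
    r, theta = cmath.polar(z)           # modulus and argument of z
    return cmath.rect(r**n, n * theta)  # back to rectangular form


w = polar_power(1 + 1j, 4)      # Example 6
r, theta = cmath.polar(1 + 1j)  # expected: sqrt(2) and pi/4
```

Because the computation is done in floating point, `w` equals $-4$ only up to rounding error.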

The Polar Exponential Form and Euler's Forms

Consider the function $E(\theta)$ defined by

$$E(\theta) = \cos\theta + i\sin\theta, \quad \theta \text{ a real number}. \tag{14}$$

For each $\theta$, $E(\theta)$ is a complex number of absolute value 1. As $\theta$ varies, $E(\theta)$ moves along the unit circle with center at the origin. Using equations (9)-(11) we deduce the following properties of this function:

$$\begin{aligned}
E(\theta)\, E(\theta') &= E(\theta + \theta'),\\
\frac{E(\theta)}{E(\theta')} &= E(\theta - \theta'),\\
(E(\theta))^n &= E(n\theta), \quad (n \text{ an integer}),\\
\frac{d}{d\theta} E(\theta) &= \frac{d}{d\theta}(\cos\theta + i\sin\theta) = -\sin\theta + i\cos\theta = i(\cos\theta + i\sin\theta) = iE(\theta).
\end{aligned}$$

The first three of these properties are shared by the real exponential function $e^{a\theta}$. The last property suggests that $a = i$. This motivates the definition

$$e^{i\theta} = \cos\theta + i\sin\theta. \tag{15}$$

This and the companion formula

$$e^{-i\theta} = \cos\theta - i\sin\theta \tag{16}$$

are known as Euler's forms.

Rewriting the formulas given above for $E(\theta)$ in terms of $e^{i\theta}$ we have

$$e^{i\theta} e^{i\theta'} = e^{i(\theta + \theta')}, \quad e^{i\theta}/e^{i\theta'} = e^{i(\theta - \theta')}, \quad (e^{i\theta})^n = e^{in\theta}, \quad \frac{d}{d\theta} e^{i\theta} = ie^{i\theta}.$$

These properties are easy to remember since they are the same as the properties of real exponentials. An important additional property of $e^{i\theta}$ is that it is periodic of period $2\pi$ in $\theta$, that is

$$e^{i(\theta + 2k\pi)} = e^{i\theta}, \tag{17}$$

where $k$ is an integer.

We now complete the definition of $e^z$ for any complex number $z$. If $z = a + ib$, we define

$$e^z = e^{a + ib} = e^a(\cos b + i\sin b). \tag{18}$$

Using this definition it is straightforward to verify that the complex exponential obeys the following for all complex $z_1$ and $z_2$:

$$e^{z_1} e^{z_2} = e^{z_1 + z_2}, \quad e^{z_1}/e^{z_2} = e^{z_1 - z_2}, \quad (e^{z_1})^n = e^{nz_1}, \tag{19}$$

$$\frac{d}{dt} e^{(a + ib)t} = (a + ib)\, e^{(a + ib)t}. \tag{20}$$

Euler's forms allow us to write a complex number in the polar exponential form

$$z = x + iy = r(\cos\theta + i\sin\theta) = re^{i\theta}. \tag{21}$$

In computations involving multiplication, division and powers of complex numbers, the polar exponential form is usually the best form to use. The expression $z = re^{i\theta}$ is compact and the rules for multiplication, division and powers are the usual laws of exponents; nothing new need be remembered.

Example 7. Evaluate $(-1 + i)^6$. From a simple diagram we find that for the number $-1 + i$, we have $r = \sqrt{2}$ and $\theta = 3\pi/4$.

[Figure: the point $-1 + i$ in the plane, at distance $\sqrt{2}$ from the origin and angle $3\pi/4$.]

Therefore

$$\begin{aligned}
(-1 + i)^6 &= \left( \sqrt{2}\, e^{i3\pi/4} \right)^6\\
&= (\sqrt{2})^6 e^{i18\pi/4}\\
&= 8e^{i\pi/2} \quad (\text{using equation (17)})\\
&= 8i.
\end{aligned}$$
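Example 7 can be replayed step by step with the complex exponential from Python's standard `cmath` module. This is a sketch for checking the arithmetic; the variable names and the choice $k = 3$ in the periodicity check are arbitrary.

```python
import cmath
import math

z = -1 + 1j
r, theta = abs(z), cmath.phase(z)         # r = sqrt(2), theta = 3*pi/4
power = r**6 * cmath.exp(1j * 6 * theta)  # (r e^{i theta})^6 = r^6 e^{i 6 theta}

# Periodicity (17): adding 2*pi*k to the angle leaves e^{i theta} unchanged.
same = cmath.isclose(cmath.exp(1j * theta), cmath.exp(1j * (theta + 2 * math.pi * 3)))
```

The value of `power` agrees with $8i$ up to floating-point rounding.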

It is helpful to think of the complex number $e^{i\theta}$ as a unit vector at an angle $\theta$. By visualizing the numbers $1$, $i$, $-1$, and $-i$ as vectors one easily finds

$$1 = e^{i0}, \quad i = e^{i\pi/2}, \quad -1 = e^{i\pi}, \quad -i = e^{i3\pi/2} = e^{-i\pi/2}.$$

Likewise one should think of $re^{i\theta}$ as a vector of length $r$ in the direction $\theta$.

Suppose $z(t)$ is a complex-valued function of the real variable $t$. If $x(t) = \operatorname{Re} z(t)$ and $y(t) = \operatorname{Im} z(t)$, then

$$z(t) = x(t) + iy(t).$$

If the functions have derivatives then

$$\frac{d}{dt} z(t) = \frac{d}{dt} x(t) + i \frac{d}{dt} y(t).$$

In other words, if $x(t) = \operatorname{Re} z(t)$ then $\frac{d}{dt} x(t) = \operatorname{Re} \frac{d}{dt} z(t)$, and if $y(t) = \operatorname{Im} z(t)$ then $\frac{d}{dt} y(t) = \operatorname{Im} \frac{d}{dt} z(t)$.

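We now make use of these simple observations to derive a compact formula for the $n$th derivative of $\cos at$ ($a$ is real). Since $\cos at = \operatorname{Re} e^{iat}$ we have

$$\begin{aligned}
\frac{d^n}{dt^n} \cos at &= \operatorname{Re}\left( \frac{d^n}{dt^n} e^{iat} \right)\\
&= \operatorname{Re}\left( (ia)^n e^{iat} \right)\\
&= \operatorname{Re}\left( i^n a^n e^{iat} \right)\\
&= \operatorname{Re}\left( (e^{i\pi/2})^n a^n e^{iat} \right)\\
&= \operatorname{Re}\left( a^n e^{in\pi/2} e^{iat} \right)\\
&= \operatorname{Re}\left( a^n e^{i(at + n\pi/2)} \right)\\
&= a^n \cos(at + n\pi/2).
\end{aligned}$$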

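As another example we compute $\int e^{ax} \sin bx\, dx$, where $a$ and $b$ are real. We have

$$\begin{aligned}
\int e^{ax} \sin bx\, dx &= \int \operatorname{Im}\left( e^{ax} e^{ibx} \right) dx = \operatorname{Im} \int e^{ax} e^{ibx}\, dx = \operatorname{Im} \int e^{(a + ib)x}\, dx\\
&= \operatorname{Im} \frac{e^{(a + ib)x}}{a + ib} = \operatorname{Im} \frac{e^{ax} e^{ibx}}{a + ib}\\
&= \operatorname{Im} \left( \frac{e^{ax}(\cos bx + i\sin bx)}{a + ib} \cdot \frac{a - ib}{a - ib} \right)\\
&= \operatorname{Im} \left( e^{ax}\, \frac{(a\cos bx + b\sin bx) + i(a\sin bx - b\cos bx)}{a^2 + b^2} \right)\\
&= \frac{e^{ax}(a\sin bx - b\cos bx)}{a^2 + b^2}.
\end{aligned}$$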
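One way to gain confidence in this result is to differentiate it numerically and compare with the integrand. The sketch below uses a central difference; the sample values of $a$, $b$, the step $h$ and the test points are all assumptions chosen for illustration.

```python
import math


def antiderivative(x, a, b):
    """F(x) = e^{ax} (a sin bx - b cos bx) / (a^2 + b^2), the result derived above."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)


def integrand(x, a, b):
    """f(x) = e^{ax} sin bx."""
    return math.exp(a * x) * math.sin(b * x)


# Hypothetical sample values; h is the step of a central difference.
a, b, h = 0.5, 3.0, 1e-6
errors = [
    abs((antiderivative(x + h, a, b) - antiderivative(x - h, a, b)) / (2 * h)
        - integrand(x, a, b))
    for x in (-1.0, 0.0, 0.7, 2.0)
]
```

Small entries in `errors` indicate that the derivative of the formula matches $e^{ax}\sin bx$ at the sampled points.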

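Roots of Complex Numbers

Let $w = a + ib \neq 0$ be a given complex number. We seek the $n$th roots of $w$, that is, all numbers $z$ such that $z^n = w$. We write $w$ in polar form: $w = Re^{i\varphi}$, where $R$ and $\varphi$ are known. Let $z = re^{i\theta}$, where $r$ and $\theta$ must be found to satisfy $z^n = w$. We have

$$z^n = (re^{i\theta})^n = Re^{i\varphi}, \quad \text{or } r^n e^{in\theta} = Re^{i\varphi}. \tag{22}$$

Therefore $r^n = R$ and $n\theta = \varphi$, or $r = \sqrt[n]{R}$ (the positive $n$th root of $R$) and $\theta = \varphi/n$. This yields one root

$$z_0 = \sqrt[n]{R}\, e^{i\varphi/n} = \sqrt[n]{R}\, (\cos \varphi/n + i\sin \varphi/n).$$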

However there are more roots. Note that, for any integer $k$, $e^{i\varphi} = e^{i(\varphi + 2k\pi)}$; in other words the angle $\varphi$ is only determined up to a multiple of $2\pi$. Now (22) becomes:

$$(re^{i\theta})^n = Re^{i(\varphi + 2k\pi)}, \quad \text{or } r^n e^{in\theta} = Re^{i(\varphi + 2k\pi)}. \tag{23}$$

We now have that $r^n = R$, or $r = \sqrt[n]{R}$ as before, but $n\theta = \varphi + 2k\pi$, or $\theta = (\varphi + 2k\pi)/n$. If we let $k = 0, 1, 2, \ldots, n - 1$, we obtain $n$ values of $\theta$ which yield $n$ distinct $n$th roots of $w$:

$$z_k = \sqrt[n]{R}\, e^{i\frac{\varphi + 2k\pi}{n}} = \sqrt[n]{R} \left( \cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n} \right), \quad k = 0, 1, 2, \ldots, n - 1. \tag{24}$$

We see that every non-zero complex number has exactly $n$ distinct $n$th roots; furthermore, these roots divide the circle of radius $\sqrt[n]{R}$ into $n$ equal parts.

Example 8. Find the cube roots of 1.

Solution. $1^{1/3} = (1 + i0)^{1/3} = (1 \cdot e^{i(0 + 2k\pi)})^{1/3} = 1 \cdot e^{i2k\pi/3}$, $k = 0, 1, 2$.

For $k = 0$, we have $z_0 = e^{i0} = 1$;

for $k = 1$, we have $z_1 = e^{i2\pi/3} = \cos 2\pi/3 + i\sin 2\pi/3 = -1/2 + i\sqrt{3}/2$;

for $k = 2$, we have $z_2 = e^{i4\pi/3} = \cos 4\pi/3 + i\sin 4\pi/3 = -1/2 - i\sqrt{3}/2$.

These roots are represented geometrically in the figure above.

Example 9. Find the cube roots of $-1$.

Solution. $(-1)^{1/3} = (-1 + i0)^{1/3} = (1 \cdot e^{i(\pi + 2k\pi)})^{1/3} = 1 \cdot e^{i(\pi + 2k\pi)/3} = e^{i(\pi/3 + 2k\pi/3)}$, $k = 0, 1, 2$.

For $k = 0$, we have $z_0 = e^{i\pi/3} = \cos \pi/3 + i\sin \pi/3 = 1/2 + i\sqrt{3}/2$;

for $k = 1$, we have $z_1 = e^{i\pi} = \cos\pi + i\sin\pi = -1$;

for $k = 2$, we have $z_2 = e^{i5\pi/3} = \cos 5\pi/3 + i\sin 5\pi/3 = 1/2 - i\sqrt{3}/2$.
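The root formula (24) can be turned into a short routine. The sketch below is illustrative: `nth_roots` is a hypothetical helper built on `cmath.polar` and `cmath.rect`, and it reproduces Examples 8 and 9 as well as the square roots of $3 + 4i$ from Example 5.

```python
import cmath
import math


def nth_roots(w, n):
    """Return the n distinct nth roots of a nonzero complex w, following (24)."""
    R, phi = cmath.polar(w)
    r = R ** (1.0 / n)  # the positive real nth root of R
    return [cmath.rect(r, (phi + 2 * math.pi * k) / n) for k in range(n)]


cube_roots_of_one = nth_roots(1, 3)         # Example 8
cube_roots_of_minus_one = nth_roots(-1, 3)  # Example 9
square_roots = nth_roots(3 + 4j, 2)         # Example 5: expect 2 + i and -2 - i
```

The roots are returned in the order $k = 0, 1, \ldots, n - 1$, so they step around the circle of radius $\sqrt[n]{R}$ in equal angular increments.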

Exercises 1.7

1. Evaluate using exact arithmetic; write all answers in the form a + ib:

a.1 ii 1 b.

(4 5i)2

2 3i c.12i

(i7 i7) d. i757 e.( 1

2+ i

3

2)3

2. Proceeding as in Example 5, find all square roots of $i$; that is, find all complex $z = x + iy$ such that $z^2 = i$.

3. Show that for complex $z_1$ and $z_2$:
a. $|z_1 z_2| = |z_1|\,|z_2|$
b. $|z_1| = \sqrt{z_1 \bar{z}_1}$
c. $|z_1 + z_2| \le |z_1| + |z_2|$.

4. Show that for complex $z_1$ and $z_2$:
a. $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$
b. $\overline{(z_1 + z_2)} = \bar{z}_1 + \bar{z}_2$.

5. Using De Moivre's theorem for $n = 2$: $(\cos\theta + i\sin\theta)^2 = \cos 2\theta + i\sin 2\theta$, find $\sin 2\theta$ and $\cos 2\theta$ in terms of $\sin\theta$ and $\cos\theta$.

6. Calculate exactly, using the polar exponential form, and put answers in the form a + ib:

a. (1 i)19 b. (12

+ i

32

)37 c. (1)n d. (i)n

7. If it is known that (a + ib)k = 2 6i, evaluatea. (a ib)k b. (a + ib)k.

8. Using a calculator evaluate $(0.791 + 0.892i)^5$, rounding the real and imaginary parts of the answer to three significant digits.

9. Using complex exponentials find

a.

ex cos 2xdx b. the nth derivative of sin 2x c.d15

dx15(e

2

2x

sin

22

x).

10. Evaluate all roots exactly in the form a + iba. (1)1/4 b. 11/4 c. i1/2 d. (3 3i)2/3.

11. Find all roots of (23i)1/3, rounding the real and imaginary parts of the answers to three significantdigits.

1.8 Fibonacci Numbers

As our first example of a second order difference equation we consider a problem that dates back to an Italian mathematician, Leonardo of Pisa, nicknamed Fibonacci (son of good nature), who lived in the thirteenth century. He proposed the following problem:

On the first day of a month we are given a newly born pair of rabbits; how many pairs will there be at the end of one year? It is assumed that no rabbits die, that rabbits begin to bear young when they are two months old, and that they produce one pair of rabbits each month.

Let $f_n$ = number of pairs of rabbits at the end of the $n$th month, $n = 1, 2, \ldots$. We know that $f_1 = 1$, $f_2 = 1$. We have the obvious relation

$f_n$ = number of pairs of rabbits at the end of the $(n - 1)$th month + births during the $n$th month.

The births during the $n$th month must be produced by pairs of rabbits that are at least two months old, and there are exactly $f_{n-2}$ such pairs of rabbits. Therefore, we have to solve

$$\text{E: } f_n = f_{n-1} + f_{n-2}, \quad n = 3, 4, 5, \ldots \qquad \text{IC: } f_1 = 1,\ f_2 = 1. \tag{1}$$

It is easy to see that $f_n$ is uniquely determined for all relevant $n$. It is also easy to write down the first few terms of the sequence $f_n$:

$$f_1 = 1, \quad f_2 = 1, \quad f_3 = 2, \quad f_4 = 3, \quad f_5 = 5.$$

Each term is the sum of the preceding two terms. These numbers are called Fibonacci numbers; the first twelve Fibonacci numbers are

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144.

The number of pairs of rabbits at the end of one year is therefore 144.

[Figure: A Fibonacci Tree, with branch levels numbered 1 through 6.]

Suppose we wish to find $f_n$ for arbitrary $n$; then we must solve the difference equation. We try to guess at a solution. After some reflection, the type of sequence most likely to produce a solution is

$$f_n = \lambda^n, \quad n = 1, 2, \ldots.$$

Substituting into $f_n = f_{n-1} + f_{n-2}$ we find

$$\lambda^n = \lambda^{n-1} + \lambda^{n-2}, \quad \text{or } \lambda^{n-2}(\lambda^2 - \lambda - 1) = 0. \tag{2}$$

Thus any $\lambda$ satisfying $\lambda^2 - \lambda - 1 = 0$ will yield a solution. The solutions of the quadratic equation are

$$\lambda_1 = \frac{1 + \sqrt{5}}{2}, \qquad \lambda_2 = \frac{1 - \sqrt{5}}{2}. \tag{3}$$

Thus, as may be easily checked, $\lambda_1^n$ and $\lambda_2^n$ are two solutions of the difference equation. Furthermore, we find that if $c_1$, $c_2$ are any constants, then

$$f_n = c_1 \lambda_1^n + c_2 \lambda_2^n, \quad n = 1, 2, \ldots \tag{4}$$

is also a solution. We now try to find $c_1$, $c_2$ so that the initial conditions $f_1 = 1$, $f_2 = 1$ are satisfied. We find that $c_1$, $c_2$ must satisfy

$$\begin{aligned}
1 &= c_1 \lambda_1 + c_2 \lambda_2,\\
1 &= c_1 \lambda_1^2 + c_2 \lambda_2^2.
\end{aligned}$$

Solving these linear equations we find $c_1 = \frac{1}{\sqrt{5}}$, $c_2 = -\frac{1}{\sqrt{5}}$; therefore our solution is

$$f_n = \frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^n - \frac{1}{\sqrt{5}} \left( \frac{1 - \sqrt{5}}{2} \right)^n, \quad n = 1, 2, \ldots \tag{5}$$

This is a general formula for the $n$th Fibonacci number.

From the initial value problem (1), it is clear that $f_n$ is always an integer; however this is not at all obvious from the general formula (5).

Curiously, a number of natural phenomena seem to follow the Fibonacci sequence, at least approximately. Consider the branching of a tree; in some trees the numbers of branches at successive levels are successive Fibonacci numbers, as in the Fibonacci tree figure.
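That the general formula for $f_n$ really produces integers can be checked by comparing it with the recurrence. In the sketch below the function names are hypothetical, and the formula is evaluated in floating point and rounded, which is reliable only for moderate $n$.

```python
from math import sqrt


def fib_recursive(n):
    """f_n from the initial value problem: f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def fib_formula(n):
    """f_n from the closed formula, rounded to absorb floating-point error."""
    s = sqrt(5)
    return round((((1 + s) / 2) ** n - ((1 - s) / 2) ** n) / s)


first_twelve = [fib_recursive(n) for n in range(1, 13)]
agree = all(fib_recursive(n) == fib_formula(n) for n in range(1, 40))
```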

[Figure: Icosahedron and Dodecahedron.]

This would happen exactly if the main trunk of the tree sent out a branch every year starting with the 2nd year and each branch sent out a branch every year starting in its second year.

Other examples of the Fibonacci sequence in nature are the number of spiral florets in a sunflower, the spiraled scales on the surface of a pineapple, the position of leaves on certain trees and the petal formations on certain flowers.

One can calculate (see problem 1) that

$$\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \frac{1 + \sqrt{5}}{2} = 1.618\ldots \tag{6}$$

This number is often called the golden mean. The golden mean was thought by the ancient Greeks to be the ratio of the sides of the rectangle that was most pleasing to the eye. The golden mean occurs in several places in geometry; for instance, the ratio of the diagonal of a regular pentagon to its edge is the golden mean. A regular icosahedron contains 20 faces, each an equilateral triangle; it can be constructed from three golden rectangles intersecting symmetrically at right angles. Connecting the vertices of the rectangles, one obtains the regular icosahedron shown in the figure. A regular dodecahedron has twelve regular pentagonal faces. The midpoints of the faces are the vertices of a regular icosahedron, as shown in the figure.
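The convergence of the ratios $f_{n+1}/f_n$ to the golden mean can be observed numerically. The short sketch below simply tabulates the first thirty ratios; the number of terms is an arbitrary choice.

```python
from math import sqrt

golden_mean = (1 + sqrt(5)) / 2

a, b = 1, 1  # f_1, f_2
ratios = []
for _ in range(30):
    ratios.append(b / a)  # f_{n+1} / f_n
    a, b = b, a + b
```

The entries of `ratios` alternate above and below the golden mean and approach it quickly.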

Exercises 1.8

1. Show $\displaystyle \lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \frac{1 + \sqrt{5}}{2}$.

2. If a line segment of unit length is divided so that the whole is to the larger as the larger is to the smaller, show these ratios are the golden mean.

3. Solve and check the solution of

$$\text{E: } x_{n+2} + 5x_{n+1} + 6x_n = 0, \quad n = 0, 1, 2, \ldots \qquad \text{IC: } x_0 = 0,\ x_1 = 1.$$

1.9 Second Order Linear Difference Equations

A second order linear difference equation is one of the form

$$a_n x_{n+2} + b_n x_{n+1} + c_n x_n = f_n, \quad n = 0, 1, 2, \ldots \tag{1}$$

where $\{a_n\}$, $\{b_n\}$, $\{c_n\}$ and $\{f_n\}$ are given sequences of real or complex numbers and $\{x_n\}$ is the sequence we wish to find. We shall also assume that $a_n \neq 0$ for $n \ge 0$. This assures that the initial value problem

$$\text{E: } a_n x_{n+2} + b_n x_{n+1} + c_n x_n = f_n, \quad n \ge 0 \qquad \text{IC: } x_0 = \alpha_0,\ x_1 = \alpha_1 \tag{2}$$

has a unique solution for each choice of $\alpha_0$, $\alpha_1$ (see Theorem 1 of Section 13).

It is convenient to introduce an abbreviation for the left hand side of (1):

$$L(x_n) = a_n x_{n+2} + b_n x_{n+1} + c_n x_n, \quad n = 0, 1, 2, \ldots \tag{3}$$

For a given sequence $\{x_n\}$, $L(x_n)$ is another sequence; $L$ is an operator which maps sequences into sequences. For example, suppose $L$ is defined by $L(x_n) = x_{n+2} + x_n$; then $L(n^2) = (n + 2)^2 + n^2 = 2n^2 + 4n + 4$ and $L$ maps the sequence $\{n^2\}$ into the sequence $\{2n^2 + 4n + 4\}$. Also $L(i^n) = L(e^{in\pi/2}) = e^{i(n+2)\pi/2} + e^{in\pi/2} = -e^{in\pi/2} + e^{in\pi/2} \equiv 0$. Thus $L$ maps the sequence $\{i^n\}$ into the zero sequence; this of course means that $x_n = i^n$ is a solution of $x_{n+2} + x_n = 0$.
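The operator in this example is easy to model as a function acting on sequences. In the sketch below a sequence is represented by a Python function of $n$; the name `apply_L` and the number of terms sampled are assumptions for illustration.

```python
def apply_L(x, n):
    """Evaluate L(x_n) = x_{n+2} + x_n for a sequence given as a function x(n)."""
    return x(n + 2) + x(n)


squares = [apply_L(lambda n: n * n, n) for n in range(6)]
expected = [2 * n * n + 4 * n + 4 for n in range(6)]
powers_of_i = [apply_L(lambda n: 1j**n, n) for n in range(6)]
```

The list `squares` matches `expected`, and every entry of `powers_of_i` is zero up to rounding, in agreement with the two computations above.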

Using the operator $L$ defined by (3), the difference equation (1) can be written in the abbreviated form

$$L(x_n) = f_n, \quad n = 0, 1, 2, \ldots \tag{4}$$

Or, if $f_n = 0$ for all $n$, we have

$$L(x_n) = 0. \tag{5}$$

If $f_n$ is not identically zero, (4) is called a nonhomogeneous difference equation and (5) is called the associated homogeneous equation.

In the next two sections we shall show how to solve equation (1) in the special case when the coefficient sequences $a_n$, $b_n$, $c_n$ are all constant. Methods for solving (2) when the coefficients are not constant are beyond the scope of this book. In this section we shall discuss properties of solutions which will be needed for the constant coefficient case but are just as easy to demonstrate for the general linear equation (2).

We start with a simple, but fundamental, property of the operator $L$.

Theorem 1. The operator $L$ is linear, that is

$$L(\alpha x_n + \beta y_n) = \alpha L(x_n) + \beta L(y_n) \tag{6}$$

for any constants $\alpha$, $\beta$ and any sequences $x_n$, $y_n$.

Proof By direct computation we find

$$\begin{aligned} L(\alpha x_n + \beta y_n) &= a_n(\alpha x_{n+2} + \beta y_{n+2}) + b_n(\alpha x_{n+1} + \beta y_{n+1}) + c_n(\alpha x_n + \beta y_n) \\ &= \alpha(a_n x_{n+2} + b_n x_{n+1} + c_n x_n) + \beta(a_n y_{n+2} + b_n y_{n+1} + c_n y_n) \\ &= \alpha L(x_n) + \beta L(y_n). \end{aligned}$$

A useful property of solutions of the homogeneous equation is given in the following theorem.

Theorem 2. If $u_n$ and $v_n$ are solutions of the homogeneous equation $L(x_n) = 0$ then $\alpha u_n + \beta v_n$ is also a solution for constant values of $\alpha$, $\beta$.


Proof By hypothesis we have $L(u_n) = 0$ and $L(v_n) = 0$. From Theorem 1 we have

$$L(\alpha u_n + \beta v_n) = \alpha L(u_n) + \beta L(v_n) = \alpha \cdot 0 + \beta \cdot 0 = 0.$$

Example 1. It can be verified that $2^n$ and $3^n$ are solutions of $x_{n+2} - 5x_{n+1} + 6x_n = 0$. Thus the theorem guarantees that $x_n = \alpha 2^n + \beta 3^n$ is also a solution. The question arises, is this the general solution or are there other solutions? To answer this question we need the notion of linear independence of two sequences.

Definition 1. Two sequences $\{u_n\}$, $\{v_n\}$, $n = 0, 1, 2, \ldots$ are linearly dependent (LD) if it is possible to find two constants $\alpha$ and $\beta$, not both zero, such that $\alpha u_n + \beta v_n \equiv 0$, $n = 0, 1, 2, \ldots$. The sequences are called linearly independent (LI) if they are not linearly dependent, i.e., if $\alpha u_n + \beta v_n \equiv 0$, $n = 0, 1, 2, \ldots$ implies $\alpha = \beta = 0$.

Saying this another way, $\{u_n\}$, $\{v_n\}$ are LD if and only if one sequence is a multiple of (or depends on) the other, and LI otherwise. One can usually tell by inspection whether two sequences are LI or LD. For example the sequences $u_n = n$, $v_n = n^2$ are LI while $u_n = 2n$, $v_n = 5n$ are LD. Also if $u_n \equiv 0$, then $u_n$, $v_n$ are LD no matter what the sequence $v_n$ is. A useful test for LI of two solutions of $L(x_n) = 0$ is given in the following theorem.

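Theorem 3. Two solutions $u_n$, $v_n$ of $L(x_n) = 0$, $n \geq 0$, are linearly independent if and only if

$$\begin{vmatrix} u_0 & v_0 \\ u_1 & v_1 \end{vmatrix} = u_0 v_1 - u_1 v_0 \neq 0.$$

Proof Assume that $u_n$, $v_n$ are LI solutions of $L(x_n) = 0$. Suppose $\begin{vmatrix} u_0 & v_0 \\ u_1 & v_1 \end{vmatrix} = 0$. Then the equations

$$\begin{aligned} \alpha u_0 + \beta v_0 &= 0 \\ \alpha u_1 + \beta v_1 &= 0 \end{aligned}$$

have a nontrivial solution. Consider

$$y_n = \alpha u_n + \beta v_n$$

where $\alpha$ and $\beta$ are determined above. We have that $y_n$ is a solution of $L(x_n) = 0$, with $y_0 = 0$ and $y_1 = 0$. Therefore by the uniqueness of solutions of the initial value problem, $y_n \equiv 0$. This means that $u_n$, $v_n$ are LD, contrary to assumption.

Conversely, assume that $\begin{vmatrix} u_0 & v_0 \\ u_1 & v_1 \end{vmatrix} \neq 0$. Suppose that $u_n$, $v_n$ are LD. Then there exist constants $\alpha$ and $\beta$, not both zero, so that

$$\alpha u_n + \beta v_n \equiv 0, \quad n \geq 0.$$

In particular this must be true for $n = 0$ and $n = 1$. Thus the system of equations

$$\begin{aligned} \alpha u_0 + \beta v_0 &= 0 \\ \alpha u_1 + \beta v_1 &= 0 \end{aligned}$$

has a nontrivial solution. This means that the determinant of the coefficients is zero,

$$\begin{vmatrix} u_0 & v_0 \\ u_1 & v_1 \end{vmatrix} = 0,$$

contrary to assumption.

We are now in a position to prove the following fundamental fact.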

Theorem 4. If $u_n$ and $v_n$ are two LI solutions of the second order linear homogeneous difference equation $L(x_n) = 0$, $n = 0, 1, \ldots$, then $x_n = \alpha u_n + \beta v_n$ is the general solution of $L(x_n) = 0$.


Proof Let $w_n$ be any solution of $L(x_n) = 0$. We must show that $w_n = \alpha u_n + \beta v_n$ for suitable constants $\alpha$ and $\beta$. Now $w_0$ and $w_1$ are definite numbers; we determine $\alpha$, $\beta$ so that

$$\begin{aligned} w_0 &= \alpha u_0 + \beta v_0 \\ w_1 &= \alpha u_1 + \beta v_1. \end{aligned}$$

Since the solutions are LI, the determinant of the coefficients is not zero, i.e.,

$$\begin{vmatrix} u_0 & v_0 \\ u_1 & v_1 \end{vmatrix} \neq 0,$$

so we can solve uniquely for $\alpha$ and $\beta$. With these values of $\alpha$, $\beta$ define $z_n = \alpha u_n + \beta v_n$. Note that $L(z_n) = 0$ and $z_0 = w_0$ and $z_1 = w_1$. Therefore $z_n$ and $w_n$ are solutions of the same difference equation and the same initial conditions. Since there is a unique solution to this initial value problem we must have $z_n \equiv w_n \equiv \alpha u_n + \beta v_n$.

Example 2. In Example 1 we found that $2^n$ and $3^n$ were solutions of $x_{n+2} - 5x_{n+1} + 6x_n = 0$. Since $2^n$, $3^n$ are LI we now know that $x_n = \alpha 2^n + \beta 3^n$ is the general solution. In particular, we may find solutions satisfying any initial conditions. Suppose $x_0 = 1$, $x_1 = 2$; then we must solve

$$\begin{aligned} 1 &= \alpha + \beta \\ 2 &= 2\alpha + 3\beta \end{aligned}$$

which yields the unique values $\alpha = 1$, $\beta = 0$, so the unique solution satisfying the initial conditions is $x_n = 2^n$.
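The step of fitting $\alpha$ and $\beta$ to initial conditions is a 2-by-2 linear solve whose determinant is exactly the one in Theorem 3. Here is a hedged sketch using Cramer's rule with exact rational arithmetic; the helper name `fit_constants` is an assumption made for illustration.

```python
from fractions import Fraction

def fit_constants(u, v, x0, x1):
    """Solve alpha*u(0) + beta*v(0) = x0 and alpha*u(1) + beta*v(1) = x1."""
    det = u(0) * v(1) - u(1) * v(0)   # determinant from Theorem 3
    if det == 0:
        raise ValueError("u and v are linearly dependent")
    alpha = Fraction(x0 * v(1) - x1 * v(0), det)
    beta = Fraction(u(0) * x1 - u(1) * x0, det)
    return alpha, beta

u = lambda n: 2 ** n
v = lambda n: 3 ** n
alpha, beta = fit_constants(u, v, 1, 2)
print(alpha, beta)  # the constants for x0 = 1, x1 = 2
```

The same function can be reused for any other pair of initial values, since the determinant is nonzero for LI solutions.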

Theorem 4 reduces the problem of finding the general solution of $L(x_n) = 0$ to finding a LI set of solutions. In the next section we will show how this is done if $L$ has constant coefficients.

We now turn to the general properties of solutions of the nonhomogeneous equation.

Theorem 5. If $y_n$ and $z_n$ are both solutions of the same nonhomogeneous equation $L(x_n) = f_n$ then $x_n = y_n - z_n$ is a solution of the homogeneous equation $L(x_n) = 0$.

Proof $L(x_n) = L(y_n - z_n) = L(y_n) - L(z_n) = f_n - f_n = 0$.

Theorem 6. If $x_n^{(h)}$ is the general solution of $L(x_n) = 0$ and $x_n^{(p)}$ is any one (or particular) solution of $L(x_n) = f_n$ then $x_n = x_n^{(h)} + x_n^{(p)}$ is the general solution of $L(x_n) = f_n$.

Proof Let $x_n$ be any solution of $L(x_n) = f_n$ and let $x_n^{(p)}$ be a particular solution of the same equation. By Theorem 5, $x_n - x_n^{(p)}$ is a solution of $L(x_n) = 0$; therefore $x_n - x_n^{(p)} = x_n^{(h)}$ for a suitable choice of the constants in $x_n^{(h)}$, as desired.

Example 3. Find the general solution of

$$x_{n+2} - 5x_{n+1} + 6x_n = 4.$$

From Example 2 we know that $x_n^{(h)} = \alpha 2^n + \beta 3^n$. According to Theorem 6 we need only find any one solution of the difference equation. We look for the simplest solution. The fact that the right hand side is a constant suggests that we try $x_n^{(p)} \equiv A$ (a constant). Then $x_{n+1}^{(p)} = A$ and $x_{n+2}^{(p)} = A$, and substituting into the equation we find

$$A - 5A + 6A = 4, \quad \text{or} \quad A = 2.$$

Thus $x_n^{(p)} = 2$ and the general solution is $x_n = \alpha 2^n + \beta 3^n + 2$.

Example 4. Find the general solution of

$$x_{n+2} - 5x_{n+1} + 6x_n = 2 \cdot 5^n.$$


The homogeneous equation is the same as in Example 3. The right hand side of the equation suggests that a particular solution of the form $x_n^{(p)} = A \cdot 5^n$ may exist. Substituting into the difference equation we find

$$A \cdot 5^{n+2} - 5(A \cdot 5^{n+1}) + 6(A \cdot 5^n) = 2 \cdot 5^n.$$

Note that $5^n$ is a common factor on both sides. Dividing both sides by this factor we find $6A = 2$, so $A = 1/3$ and $x_n^{(p)} = 5^n/3$. Adding this to the homogeneous solution provides the general solution.

Our final theorem allows us to break up the solution of $L(x_n) = \alpha f_n + \beta g_n$ into the solution of two simpler problems.

Theorem 7. If $y_n$ is a solution of $L(x_n) = f_n$ and $z_n$ is a solution of $L(x_n) = g_n$ then $\alpha y_n + \beta z_n$ is a solution of $L(x_n) = \alpha f_n + \beta g_n$.

Proof $L(\alpha y_n + \beta z_n) = \alpha L(y_n) + \beta L(z_n) = \alpha f_n + \beta g_n$.

Example 5. Find a particular solution of

$$x_{n+2} - 5x_{n+1} + 6x_n = 4 + 2 \cdot 5^n.$$

Using the results of Examples 3 and 4 and Theorem 7 with $\alpha = \beta = 1$ we find

$$x_n^{(p)} = 2 + \frac{5^n}{3}.$$

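Example 6. Find a particular solution of

$$x_{n+2} - 5x_{n+1} + 6x_n = 1 - 5^{n-1}.$$

Comparing the right hand side of this equation with Examples 3 and 4 we can find $\alpha$, $\beta$ such that

$$1 = \alpha \cdot 4 \quad \text{and} \quad -5^{n-1} = \beta \cdot 2 \cdot 5^n;$$

this yields $\alpha = 1/4$ and $\beta = -1/10$. Thus

$$x_n^{(p)} = \frac{1}{4}(2) + \left(-\frac{1}{10}\right)\frac{5^n}{3} = \frac{1}{2} - \frac{5^n}{30}.$$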
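The particular solutions in Examples 3 through 6 can be verified by substitution. The sketch below does this with exact fractions; the names `L`, `p3`, `p4`, `p5`, `p6` and `checks` are illustrative labels, not notation from the text.

```python
from fractions import Fraction

def L(x, n):
    """Left hand side x_{n+2} - 5 x_{n+1} + 6 x_n used in Examples 3-6."""
    return x(n + 2) - 5 * x(n + 1) + 6 * x(n)

p3 = lambda n: Fraction(2)                # Example 3
p4 = lambda n: Fraction(5 ** n, 3)        # Example 4
p5 = lambda n: p3(n) + p4(n)              # Example 5 (superposition)
p6 = lambda n: Fraction(1, 4) * p3(n) - Fraction(1, 10) * p4(n)  # Example 6

checks = {
    "Example 3": all(L(p3, n) == 4 for n in range(10)),
    "Example 4": all(L(p4, n) == 2 * 5 ** n for n in range(10)),
    "Example 5": all(L(p5, n) == 4 + 2 * 5 ** n for n in range(10)),
    "Example 6": all(L(p6, n) == 1 - Fraction(5) ** (n - 1) for n in range(10)),
}
print(checks)
```

Examples 5 and 6 reuse `p3` and `p4` rather than solving anything new, which is the point of the superposition theorem.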

Exercises 1.9

1. What is a solution of $L(x_n) = 0$, $x_0 = 0$, $x_1 = 0$? Is there more than one solution?

2. Let L(xn) = xn+2 4xn. Compute L(xn) in each of the following casesa. xn = 3(4)n1 b. xn = 2n

3. If $u_n$ is the solution of $L(x_n) = 0$, $x_0 = 0$, $x_1 = 1$ and $v_n$ is the solution of $L(x_n) = 0$, $x_0 = 1$, $x_1 = 0$, what is the solution of $L(x_n) = 0$, $x_0 = 5$, $x_1 = 6$?

4. If 2n1 is a solution of L(xn) = 5 2n+1 and 2 3n is a solution of L(xn) = 4 3n2 what is thesolution of L(xn) = 2n 3n.

5. If 2n 3n, 2n+1 3n, 2n+1 3n + 1 are each solutions of the same difference equation L(xn) = fna. What is the general solution of L(xn) = 0.

b. What is the general solution of L(xn) = fn.


1.10 Homogeneous Second Order Linear Difference Equations

We shall restrict ourselves to equations with constant coefficients. Consider the difference equation

$$a x_{n+2} + b x_{n+1} + c x_n = 0, \quad n = 0, 1, 2, \ldots \tag{1}$$

where $a$, $b$, $c$ are real constants and both $a \neq 0$ and $c \neq 0$ (if $a = 0$ or $c = 0$, the equation is of first order).

Obviously, one solution is $x_n = 0$ for all $n$; this is called the trivial solution. We look for non-trivial solutions of the form $x_n = \lambda^n$ for a suitable value of $\lambda \neq 0$. Substituting this into (1) we find

$$a\lambda^{n+2} + b\lambda^{n+1} + c\lambda^n = 0 \quad \text{or} \quad \lambda^n(a\lambda^2 + b\lambda + c) = 0.$$

If this is to hold for all $n$, $\lambda$ must satisfy

$$a\lambda^2 + b\lambda + c = 0 \tag{2}$$

which is called the characteristic equation associated with (1). Solving this quadratic equation we obtain

$$\lambda_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \quad \lambda_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \tag{3}$$

and the corresponding solutions $\lambda_1^n$ and $\lambda_2^n$ of the difference equation (1). Note that if $b^2 - 4ac > 0$, $\lambda_1$ and $\lambda_2$ are real and distinct; if $b^2 - 4ac = 0$ then $\lambda_1 = \lambda_2$; and if $b^2 - 4ac < 0$ the roots $\lambda_1$ and $\lambda_2$ are conjugate complex numbers. We analyze each of these situations.

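(a) Real distinct roots ($b^2 - 4ac > 0$).

We know from (3) that $\lambda_1$, $\lambda_2$ are real and $\lambda_1 \neq \lambda_2$, thus $u_n = \lambda_1^n$ and $v_n = \lambda_2^n$ are two solutions. Since these solutions are clearly LI, the general solution is

$$x_n = \alpha\lambda_1^n + \beta\lambda_2^n. \tag{4}$$

(b) Real equal roots ($b^2 - 4ac = 0$).

We have only one root $\lambda_1 = -b/2a$ and a corresponding solution $u_n = \lambda_1^n$. We must find a second solution. Of course any constant multiple $\beta\lambda_1^n$ is also a solution, but the set $\{\lambda_1^n, \beta\lambda_1^n\}$ is not a LI set. It turns out that a second LI solution is

$$v_n = n\lambda_1^n.$$

Let us verify this:

$$\begin{aligned} a v_{n+2} + b v_{n+1} + c v_n &= a(n+2)\lambda_1^{n+2} + b(n+1)\lambda_1^{n+1} + c\,n\lambda_1^n \\ &= \lambda_1^n\{n(a\lambda_1^2 + b\lambda_1 + c) + 2a\lambda_1^2 + b\lambda_1\}, \end{aligned}$$

but $a\lambda_1^2 + b\lambda_1 + c = 0$ since $\lambda_1$ is a root of this equation; also $2a\lambda_1^2 + b\lambda_1 = \lambda_1(2a\lambda_1 + b) = 0$ since $\lambda_1 = -b/2a$. Therefore the right-hand side of the above equation is 0 and $v_n = n\lambda_1^n$ is a solution. Clearly $\{\lambda_1^n, n\lambda_1^n\}$ is a LI set. Thus the general solution is

$$x_n = \alpha\lambda_1^n + \beta n\lambda_1^n. \tag{5}$$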

(c) Complex roots ($b^2 - 4ac < 0$).

We can write the roots of the characteristic equation as

$$\lambda_1 = \frac{-b + i\sqrt{4ac - b^2}}{2a}, \quad \lambda_2 = \frac{-b - i\sqrt{4ac - b^2}}{2a}.$$

Since $a$, $b$, $c$ are real numbers, $\lambda_1$, $\lambda_2$ are a complex conjugate pair:

$$\lambda_1 = \mu + i\nu, \quad \lambda_2 = \mu - i\nu$$

where $\mu = -b/2a$, $\nu = \sqrt{4ac - b^2}/2a$ and $\nu \neq 0$. Writing these roots in polar exponential form

$$\lambda_1 = re^{i\theta}, \quad \lambda_2 = re^{-i\theta}, \quad \theta \neq m\pi \ (m \text{ an integer})$$

we obtain the complex valued solutions

$$z_n^{(1)} = \lambda_1^n = r^n e^{in\theta} \quad \text{and} \quad z_n^{(2)} = \lambda_2^n = r^n e^{-in\theta}.$$

These form a LI set and the general complex valued solution is

$$x_n = c_1 r^n e^{in\theta} + c_2 r^n e^{-in\theta} \tag{6}$$

where $c_1$, $c_2$ are any complex numbers.

We usually desire real solutions. To get these we take appropriate linear combinations of the complex solutions to get the real solutions $x_n^{(1)}$ and $x_n^{(2)}$:

$$x_n^{(1)} = \frac{z_n^{(1)} + z_n^{(2)}}{2} = r^n\,\frac{e^{in\theta} + e^{-in\theta}}{2} = r^n \cos n\theta$$

$$x_n^{(2)} = \frac{z_n^{(1)} - z_n^{(2)}}{2i} = r^n\,\frac{e^{in\theta} - e^{-in\theta}}{2i} = r^n \sin n\theta$$

These solutions are clearly LI, thus the general real valued solution is

$$x_n = r^n(\alpha \cos n\theta + \beta \sin n\theta) \tag{7}$$

with $\alpha$, $\beta$ arbitrary real constants.

Example 1. $x_{n+2} + 5x_{n+1} + 6x_n = 0$, $n = 0, 1, 2, \ldots$
Assume $x_n = \lambda^n$ to find $\lambda^2 + 5\lambda + 6 = 0$, so that $\lambda = -3, -2$. The general solution is

$$x_n = \alpha(-2)^n + \beta(-3)^n.$$

Example 2. $x_{n+2} + 2x_{n+1} + x_n = 0$, $n = 0, 1, 2, \ldots$
We have $x_n = \lambda^n$, $\lambda^2 + 2\lambda + 1 = 0$, $\lambda = -1, -1$. Thus $(-1)^n$ is one solution. The other solution is $n(-1)^n$ and the general solution is $x_n = \alpha(-1)^n + \beta n(-1)^n$.

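Example 3. $x_{n+2} - 2x_{n+1} + 2x_n = 0$, $n \geq 0$.
We have $\lambda^2 - 2\lambda + 2 = 0$, or $\lambda_1 = 1 + i = \sqrt{2}\,e^{i\pi/4}$ and $\lambda_2 = 1 - i = \sqrt{2}\,e^{-i\pi/4}$. Therefore the general solution in real form is

$$x_n = 2^{n/2}\left(\alpha \cos\frac{n\pi}{4} + \beta \sin\frac{n\pi}{4}\right).$$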
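As a sanity check on the real form in Example 3, one can iterate the recurrence directly and compare with the closed form. This is a minimal sketch, assuming the particular choice $\alpha = 1$, $\beta = 0$ (so $x_0 = 1$, $x_1 = 1$); the function names `iterate` and `closed_form` are hypothetical.

```python
import math

def iterate(a, b, c, x0, x1, count):
    """Generate x_0..x_{count-1} from a*x_{n+2} + b*x_{n+1} + c*x_n = 0."""
    xs = [x0, x1]
    while len(xs) < count:
        xs.append(-(b * xs[-1] + c * xs[-2]) / a)
    return xs[:count]

def closed_form(n, alpha=1.0, beta=0.0):
    """Real general solution of Example 3 for the chosen alpha, beta."""
    angle = n * math.pi / 4
    return 2 ** (n / 2) * (alpha * math.cos(angle) + beta * math.sin(angle))

xs = iterate(1, -2, 2, 1.0, 1.0, 20)
for n, x in enumerate(xs[:8]):
    print(n, x, round(closed_form(n), 9))
```

The two columns agree up to rounding, and the growth factor $2^{n/2}$ shows up as oscillations of increasing size.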

Exercises 1.10

In problems 1–8 find the general solution in real form and check them.

1. $x_{n+2} - x_n = 0$, $n \geq 0$
2. $x_{n+2} - 6x_{n+1} + 9x_n = 0$, $n \geq 0$
3. $x_n + x_{n-2} = 0$, $n \geq 2$
4. $4y_{n+1} - 4y_n + y_{n-1} = 0$, $n \geq 1$
5. $y_n + 4y_{n-1} + 4y_{n-2} = 0$, $n \geq 2$
6. $x_{n+2} + x_{n+1} + x_n = 0$, $n \geq 0$
7. $x_{k+2} + 2x_{k+1} + 2x_k = 0$, $k \geq 0$
8. $6x_{n+1} - 7x_n + 4x_{n-1} = 0$, $n \geq 3764$


9. Solve
E: $x_{k+2} + 2x_{k+1} + 2x_k = 0$, $k \geq 0$
IC: $x_0 = 0$, $x_1 = 1$

10. Solve
E: $x_{n+2} + x_n = 0$, $n \geq 0$
IC: $x_0 = 1$, $x_1 = 1$

11. Solve
E: $x_{n+1} - 2x_n \cos(\pi/7) + x_{n-1} = 0$, $n \geq 1$
IC: $x_0 = 1$, $x_1 = \cos(\pi/7) + \sin(\pi/7)$

12. Find the second order homogeneous equations with constant coefficients that have the following as general solutions.

a. xn = (1)n + 3n b. xn = ( + n)4nc. xn = 2n/2( cos 3n4 + sin

3n4 ) d. xn = n

1.11 The Method of Undetermined Coefficients

We recall that the general solution of the non-homogeneous equation

$$a x_{n+2} + b x_{n+1} + c x_n = f_n, \quad n = 0, 1, 2, \ldots \tag{1}$$

is $x_n = x_n^{(h)} + x_n^{(p)}$ where $x_n^{(h)}$ is the general solution of the homogeneous equation and $x_n^{(p)}$ is a particular solution of the non-homogeneous equation. In the preceding section we indicated how to find $x_n^{(h)}$. We shall consider how to find a particular solution when $f_n$ has one of the following forms

(a) $f_n = k$, a constant
(b) $f_n$ = a polynomial in $n$
(c) $f_n = k\alpha^n$, with $k$, $\alpha$ constants

or a sum of terms of these types.

Example 1. $x_{n+2} + 5x_{n+1} + 6x_n = 3$.
The homogeneous solution is $x_n^{(h)} = c_1(-2)^n + c_2(-3)^n$. We assume $x_n^{(p)} = A$, a constant, where $A$ must be determined. Substituting into the difference equation we find

$$A + 5A + 6A = 3 \quad \text{or} \quad A = 1/4.$$

Thus $x_n^{(p)} = 1/4$ and

$$x_n = c_1(-2)^n + c_2(-3)^n + 1/4.$$

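Example 2. $x_{n+2} + 5x_{n+1} + 6x_n = 1 + 2n$. Assume $x_n^{(p)} = A + Bn$. Upon substitution we get

$$A + B(n+2) + 5(A + B(n+1)) + 6(A + Bn) = 1 + 2n.$$

Collecting terms gives $12A + 7B + 12Bn = 1 + 2n$, so

$$12B = 2, \quad 12A + 7B = 1 \quad \text{or} \quad B = 1/6, \quad A = -1/72.$$

Thus

$$x_n^{(p)} = -\frac{1}{72} + \frac{n}{6}.$$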

Example 3. $x_{n+2} + 5x_{n+1} + 6x_n = 2 \cdot 4^n$. Assume $x_n^{(p)} = A \cdot 4^n$. We find

$$A \cdot 4^{n+2} + 5A \cdot 4^{n+1} + 6A \cdot 4^n = 2 \cdot 4^n$$

$$(16A + 20A + 6A)\,4^n = 2 \cdot 4^n$$

$$42A = 2, \quad A = 1/21.$$

Thus $x_n^{(p)} = \dfrac{4^n}{21}$.

Example 4. $x_{n+2} + 5x_{n+1} + 6x_n = 3(-2)^n$. Recall that $x_n^{(h)} = c_1(-2)^n + c_2(-3)^n$. If we assume $x_n^{(p)} = A(-2)^n$, this will not work. The reason it will not work is that the assumed form is a solution of the homogeneous equation; it cannot also be a solution of the non-homogeneous equation. We modify our assumption slightly to $x_n^{(p)} = An(-2)^n$. We find

$$A(n+2)(-2)^{n+2} + 5A(n+1)(-2)^{n+1} + 6An(-2)^n = 3(-2)^n$$

$$A(-2)^n\{(n+2)(-2)^2 + 5(n+1)(-2) + 6n\} = 3(-2)^n$$

$$A(-2)^n\{4n + 8 - 10n - 10 + 6n\} = 3(-2)^n$$

$$-2A(-2)^n = 3(-2)^n, \quad \text{or} \quad A = -3/2.$$

Thus $x_n^{(p)} = -3n(-2)^n/2$ and $x_n = c_1(-2)^n + c_2(-3)^n - \frac{3}{2}n(-2)^n$.
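The failure of the naive guess and the success of the modified one can both be seen by substitution. The following sketch is an illustration only, with hypothetical names `L`, `naive` and `modified`; it uses exact fractions so no rounding is involved.

```python
from fractions import Fraction

def L(x, n):
    """Left hand side x_{n+2} + 5 x_{n+1} + 6 x_n of Example 4."""
    return x(n + 2) + 5 * x(n + 1) + 6 * x(n)

naive = lambda n: (-2) ** n                         # A(-2)^n with A = 1
modified = lambda n: Fraction(-3, 2) * n * (-2) ** n  # An(-2)^n with A = -3/2

for n in range(5):
    # naive guess is annihilated by L; modified guess reproduces 3(-2)^n
    print(n, L(naive, n), L(modified, n), 3 * (-2) ** n)
```

Since $L$ sends $A(-2)^n$ to zero for every $A$, no choice of $A$ can match a nonzero right hand side, which is why the extra factor of $n$ is needed.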

Example 5. $x_{n+2} - 2x_{n+1} + x_n = 1$. We find $x_n^{(h)} = c_1 + c_2 n$. We first try $x_n^{(p)} = A$, but this is a solution of the homogeneous equation. We modify it to $x_n^{(p)} = An$, but this is also a solution. We modify it again to $x_n^{(p)} = An^2$; this will work.

$$A(n+2)^2 - 2A(n+1)^2 + An^2 = 1.$$

We find that the $n^2$ terms and the $n$ terms cancel out on the left and we have $2A = 1$ or $A = 1/2$. Thus $x_n^{(p)} = n^2/2$ and

$$x_n = c_1 + c_2 n + \frac{n^2}{2}.$$

Example 6. $x_{n+2} + 5x_{n+1} + 6x_n = 3(-2)^n$. Recall that $x_n^{(h)} = c_1(-2)^n + c_2(-3)^n$. If we assume $x_n^{(p)} = A(-2)^n$, this will not work, for this assumed form is a solution of the homogeneous equation; it cannot also be a solution of the non-homogeneous equation. We modify our assumption slightly to $x_n^{(p)} = An(-2)^n$. We find

$$A(n+2)(-2)^{n+2} + 5A(n+1)(-2)^{n+1} + 6An(-2)^n = 3(-2)^n$$

$$A(-2)^n\{(n+2)(-2)^2 + 5(n+1)(-2) + 6n\} = 3(-2)^n$$

$$A(-2)^n\{4n + 8 - 10n - 10 + 6n\} = 3(-2)^n$$

$$-2A(-2)^n = 3(-2)^n \quad \text{or} \quad A = -3/2.$$

Thus $x_n^{(p)} = -3n(-2)^n/2$ and $x_n = c_1(-2)^n + c_2(-3)^n - \frac{3}{2}n(-2)^n$.

Example 7. $x_{n+2} - 2x_{n+1} + x_n = 1$.
We find $x_n^{(h)} = c_1 + c_2 n$. We first try $x_n^{(p)} = A$, but this is a solution of the homogeneous equation. We modify it to $x_n^{(p)} = An$, but this is also a solution. We modify it again to $x_n^{(p)} = An^2$; this will work:

$$A(n+2)^2 - 2A(n+1)^2 + An^2 = 1.$$

We find that the terms involving $n^2$ and $n$ cancel out and we are left with

$$2A = 1 \quad \text{or} \quad A = 1/2.$$

Thus $x_n^{(p)} = n^2/2$ and

$$x_n = c_1 + c_2 n + \frac{n^2}{2}.$$


We summarize the procedure.

Method. To find a particular solution of

$$a x_{n+2} + b x_{n+1} + c x_n = f_n$$

where $f_n$ is a polynomial of degree $d$, assume $x_n^{(p)}$ is an arbitrary polynomial of degree $d$. If any term is a solution of the homogeneous equation, multiply by $n$; if any term is still a solution, multiply by $n^2$.

If $f_n = k\alpha^n$, assume $x_n^{(p)} = A\alpha^n$. If this is a solution of the homogeneous equation use $x_n^{(p)} = An\alpha^n$; if this is also a solution of the homogeneous equation use $x_n^{(p)} = An^2\alpha^n$.

Example 8. We illustrate the above method with a few examples.

Homogeneous solution        $f_n$                Proper form for $x_n^{(p)}$
$c_1 2^n + c_2 3^n$         $5n^2$               $A + Bn + Cn^2$
$c_1 + c_2 3^n$             $3 + 5n^2$           $(A + Bn + Cn^2)n$
$c_1 + c_2 n$               $3 + 5n^2$           $(A + Bn + Cn^2)n^2$
$c_1 2^n + c_2 3^n$         $2 \cdot 5^{n-1}$    $A5^n$ or $A5^{n-1}$
$c_1 2^n + c_2 3^n$         $2 \cdot 3^{n-1}$    $An3^n$
$c_1 2^n + c_2 n2^n$        $5 \cdot 2^n$        $An^2 2^n$
$c_1 2^n + c_2 3^n$         $6 \cdot 2^{-n}$     $A2^{-n}$

Exercises 1.11

1. Find the general solutions of
a. $x_{n+2} + x_{n+1} - 2x_n = 3$
b. $x_{n+2} + x_{n+1} - 2x_n = 5 \cdot 4^{n-2}$
c. $x_{n+2} + x_{n+1} - 2x_n = (-2)^{n+1}$.

2. Using the results of problem 1, find a particular solution of
a. $x_{n+2} + x_{n+1} - 2x_n = 3 + 5 \cdot 4^{n-2} + (-2)^{n+1}$
b. $x_{n+2} + x_{n+1} - 2x_n = 1 - 5 \cdot 4^n + (-2)^{n+2}$.

3. Solve E: xn+1 + 5xn + 6xn1 = 12

IC: x0 = 0, x1 = 0.4. Solve E: xn+1 2xn + xn1 = 2 3n

IC: x0 = 0, x1 = 1.5. Find a second order difference equation whose general solution is

a. c12n + c2(3)n + 5 4n b. c1 + c2(3)n + 1 + 2nc. 2 n2 (c1 cos 3n4 + c2 sin

3n4 ) + 3.

6. A sequence starts off with $x_1 = 0$, $x_2 = 1$ and thereafter each term is the average of the two preceding terms. Find a formula for the $n$th term of the sequence.

7. Write down the proper form for a particular solution ofa. xn+1 + 5xn + 6xn1 = n + 2(3)n2 b. xn+2 2xn+1 + xn = 2 n + 3nc. xn+2 3xn+1 + 2xn = n n2 + 3n 2n.


1.12 A Simple Model of National Income

We shall study a simple mathematical model of how national income changes with time. Let $Y_n$ be the national income during the $n$th period. We assume that $Y_n$ is made up of three components:

(1) $C_n$ = consumer expenditures during the $n$th period
(2) $I_n$ = induced private expenditures during the $n$th period
(3) $G_n$ = governmental expenditures during the $n$th period.

Since we are assuming that these are the only factors contributing to national income, we have the simple accounting equation

$$Y_n = C_n + I_n + G_n. \tag{1}$$

Following Samuelson, we make three additional assumptions:

(4) Consumer expenditure in any period is proportional to the national income of the preceding period.
(5) Induced private investment in any period is proportional to the increase in consumption of that period over the preceding period (the so-called acceleration principle).
(6) Government expenditure is the same in all periods.

We restate these assumptions in mathematical terms. If we denote the constant of proportionality in (4) by $a$, we have

$$C_n = aY_{n-1}. \tag{2}$$

The positive constant $a$ is called the marginal propensity to consume. The constant of proportionality in assumption (5) we denote by $b$, and we have the equation

$$I_n = b(C_n - C_{n-1}). \tag{3}$$

The positive constant $b$ is called the relation. If consumption is decreasing, then $(C_n - C_{n-1}) < 0$ and therefore $I_n < 0$. This may be interpreted to mean a withdrawal of funds committed for investment purposes, for example, by not replacing depreciated machinery. Finally, assumption (6) states that $G_n$ is a constant, and we may as well assume that we have chosen our units so that the government expenditure is equal to 1 (these days 1 stands for about 1 trillion dollars). Thus

$$G_n = 1, \quad \text{for all } n. \tag{4}$$

Substituting equations (2), (3), and (4) into (1) we obtain a single equation for the national income

$$\begin{aligned} Y_n &= aY_{n-1} + b(C_n - C_{n-1}) + 1 \\ &= aY_{n-1} + b(aY_{n-1} - aY_{n-2}) + 1 \end{aligned} \tag{5}$$

or finally

$$Y_n - a(1+b)Y_{n-1} + abY_{n-2} = 1, \quad n = 2, 3, \ldots \tag{6}$$

Let us analyze a particular case when $a = 1/2$, $b = 1$ and assume that $Y_0 = 2$ and $Y_1 = 3$. Thus we have to solve the following initial value problem:

$$\begin{aligned} & Y_n - Y_{n-1} + Y_{n-2}/2 = 1, \quad n = 2, 3, \ldots \\ & Y_0 = 2, \quad Y_1 = 3. \end{aligned} \tag{7}$$

The general solution of (7) is

$$Y_n = (1/\sqrt{2})^n\{A\cos(n\pi/4) + B\sin(n\pi/4)\} + 2. \tag{8}$$

[Figure 1: graph of $Y_n$ against $n$ for $n = 0$ to $20$, vertical axis from 0 to 4.]

From the initial conditions we find that $A = 0$ and $B = 2$, thus the solution is

$$Y_n = 2(1/\sqrt{2})^n \sin(n\pi/4) + 2.$$

The presence of the sine term in this solution makes $Y_n$ an oscillating function of the time period $n$. Since $1/\sqrt{2} < 1$, the amplitude of the oscillations decreases as $n$ increases and the first term on the right approaches zero. The sequence $Y_n$ therefore approaches the limit 2 as $n$ approaches infinity. A graph of $Y_n$ is shown in Figure 1.

We conclude that a constant level of government expenditures results (in this special case) in damped oscillatory movements of national income which gradually approach a fixed value.
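The damped oscillation can be reproduced by iterating equation (6) directly with $a = 1/2$, $b = 1$, $Y_0 = 2$, $Y_1 = 3$ and comparing against the closed form found above. This is a minimal sketch; the names `simulate`, `closed_form` and `max_gap` are introduced here for illustration.

```python
import math

def simulate(a, b, y0, y1, count):
    """Iterate Y_n = a(1+b) Y_{n-1} - ab Y_{n-2} + 1, i.e. equation (6)."""
    ys = [y0, y1]
    while len(ys) < count:
        ys.append(a * (1 + b) * ys[-1] - a * b * ys[-2] + 1)
    return ys

def closed_form(n):
    """Solution for a = 1/2, b = 1, Y_0 = 2, Y_1 = 3."""
    return 2 * (1 / math.sqrt(2)) ** n * math.sin(n * math.pi / 4) + 2

ys = simulate(0.5, 1.0, 2.0, 3.0, 21)
max_gap = max(abs(y - closed_form(n)) for n, y in enumerate(ys))
print(ys[:6])
print(max_gap)  # difference between iteration and closed form
```

Printing more terms of `ys` shows the values settling toward 2, in line with the limit stated in the text.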

We now consider the case when $a = 0.8$ and $b = 2$. The difference equation (6) now becomes

$$Y_n - 2.4Y_{n-1} + 1.6Y_{n-2} = 1. \tag{9}$$

The general solution of this equation is

$$Y_n = (\sqrt{1.6})^n(c_1 \cos n\theta + c_2 \sin n\theta) + 5 \tag{10}$$

where $\theta = \arctan(0.4/1.2)$. We note that $Y_n$ has an oscillatory character, but since $\sqrt{1.6} > 1$, the factor $(\sqrt{1.6})^n$ causes the oscillations to increase in amplitude.

The two special cases we have considered show that we can get very different behavior of the national income for different values of the parameters $a$ and $b$. We analyze the situation in general. Consider the difference equation (6). Let $\lambda_1$ and $\lambda_2$ be roots of the characteristic equation

$$\lambda^2 - a(1+b)\lambda + ab = 0. \tag{11}$$

The general solution of the difference equation (6) has one of the following three forms.

$$Y_n = c_1(\lambda_1)^n + c_2(\lambda_2)^n + 1/(1-a), \quad \lambda_1, \lambda_2 \text{ real and distinct} \tag{12}$$

$$Y_n = c_1(\lambda_1)^n + c_2 n(\lambda_1)^n + 1/(1-a), \quad \lambda_1 = \lambda_2 \tag{13}$$

$$Y_n = r^n(c_1 \cos(n\theta) + c_2 \sin(n\theta)) + 1/(1-a), \quad \lambda_1 = re^{i\theta}. \tag{14}$$


In all cases, in order for $Y_n$ to have a limit as $n$ approaches infinity for any possible choices of $c_1$ and $c_2$, it is necessary and sufficient that $|\lambda_1| < 1$ and $|\lambda_2| < 1$. It can be shown that this will happen if and only if the positive parameters $a$ and $b$ satisfy the two conditions

$$a < 1 \quad \text{and} \quad ab < 1. \tag{15}$$

If $a$ and $b$ satisfy these conditions the national income will approach the limit $1/(1-a)$ as $n$ approaches infinity, independent of the initial conditions.
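Condition (15) is stated without proof, so a numerical spot check may be reassuring. The sketch below computes the larger root modulus of the characteristic equation (11) on a grid of positive $(a, b)$ values and compares it with the test $a < 1$, $ab < 1$. The grid and the names `spectral_radius`, `converges` and `agree` are assumptions made for this illustration, and a grid check is of course not a proof.

```python
import cmath

def spectral_radius(a, b):
    """Largest modulus among the roots of lambda^2 - a(1+b) lambda + ab = 0."""
    p, q = -a * (1 + b), a * b
    disc = cmath.sqrt(p * p - 4 * q)
    return max(abs((-p + disc) / 2), abs((-p - disc) / 2))

def converges(a, b):
    """Condition (15) from the text."""
    return a < 1 and a * b < 1

# Grid chosen to avoid the boundary cases a = 1 and ab = 1 exactly.
grid = [(0.05 + 0.1 * i, 0.05 + 0.1 * j) for i in range(20) for j in range(30)]
agree = all((spectral_radius(a, b) < 1) == converges(a, b) for a, b in grid)
print(agree)
```

For the two cases treated above, `spectral_radius(0.5, 1)` is $1/\sqrt{2}$ and `spectral_radius(0.8, 2)` is $\sqrt{1.6}$, matching the damped and growing oscillations respectively.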

Exercises 1.12

Find and sketch the solution to equation (6) under the following assumptions:

1. $a = \frac{3}{4}$, $b = \frac{1}{3}$, $Y_0 = 1$, $Y_1 = 2$.

2. $a = \frac{5}{8}$, $b = \frac{1}{5}$, $Y_0 = 1$, $Y_1 = 2$.

1.13 The Gambler's Ruin

We enter into a game of chance with an initial holding of $c$ dollars; our adversary begins with $d$ dollars. At each game we will win one dollar with probability $p$ and lose one dollar with probability $q = 1 - p$, $0 < p < 1$. The gambler's ruin problem is to determine the probability of our ultimate ruin, that is, the probability that we end up with zero if we keep playing the game.

Let $P_n$ = the probability of ruin given that we now hold $n$ dollars. In particular we are interested in determining $P_c$, that is, the probability of ruin if we start with $c$ dollars. Since if we start with zero dollars we are already ruined, we have the condition

$$P_0 = 1. \tag{1}$$

Also if we end up with all the money, namely $(c + d)$ dollars, the game is over and we have no possibility of being ruined. Thus we have the condition

$$P_{c+d} = 0. \tag{2}$$

We can set up a difference equation for $P_n$ by noting the following:

The probability of ruin, $P_n$, is equal to the probability of winning the next game followed by eventual ruin plus the probability of losing the next game followed by eventual ruin.

Translating this into mathematical language we have

$$P_n = pP_{n+1} + qP_{n-1}, \quad n = 1, 2, \ldots, c + d - 1 \tag{3}$$

together with the boundary conditions

$$P_0 = 1 \quad \text{and} \quad P_{c+d} = 0. \tag{4}$$

This is a homogeneous second order difference equation with constant coefficients. Assume $P_n = \lambda^n$ and substitute into (3) to find

$$\lambda^n = p\lambda^{n+1} + q\lambda^{n-1} \quad \text{or} \quad \lambda = p\lambda^2 + q \quad \text{or} \quad p\lambda^2 - \lambda + q = 0.$$

Recalling that $q = 1 - p$, the roots of this quadratic equation are

$$\lambda_1 = 1 \quad \text{and} \quad \lambda_2 = q/p. \tag{5}$$


If $p \neq 1/2$, the roots are distinct and the general solution is

$$P_n = A + B(q/p)^n. \tag{6}$$

Using the boundary conditions $P_0 = 1$ and $P_{c+d} = 0$, we get the following equations for the constants $A$ and $B$:

$$1 = A + B, \quad 0 = A + (q/p)^{c+d}B. \tag{7}$$

Solving for $A$ and $B$ we find

$$A = \frac{-(q/p)^{c+d}}{1 - (q/p)^{c+d}}, \quad B = \frac{1}{1 - (q/p)^{c+d}}. \tag{8}$$

Thus we have

$$P_n = \frac{(q/p)^n - (q/p)^{c+d}}{1 - (q/p)^{c+d}}. \tag{9}$$

In a typical Las Vegas slot machine we would have $p = 0.4$ and $q = 0.6$. If you had \$100 ($c = 100$) and the house had \$1000 ($d = 1000$) you would find that

$$P_{100} = \frac{(1.5)^{100} - (1.5)^{1100}}{1 - (1.5)^{1100}} = 1.00000.$$

This result is correct to many decimal places. In other words, if you play long enough you will lose!!
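Formula (9) involves very large powers, so it is convenient to evaluate it with exact rational arithmetic. The following sketch does so; the function name `ruin_probability` and its argument names are assumptions for illustration, with `total` standing for $c + d$.

```python
from fractions import Fraction

def ruin_probability(n, total, p):
    """Formula (9): probability of ruin when holding n of total = c + d dollars."""
    q = 1 - p
    if p == q:
        raise ValueError("formula (9) assumes p != 1/2")
    ratio = q / p
    return (ratio ** n - ratio ** total) / (1 - ratio ** total)

p = Fraction(2, 5)                      # p = 0.4, q = 0.6 as in the text
P100 = ruin_probability(100, 1100, p)   # exact rational value
print(float(P100))                      # rounds to 1.0 in double precision
print(float(1 - P100))                  # the tiny chance of winning everything
```

Because the value is kept as a fraction, the complementary probability can be inspected even though it is far too small to be seen in the rounded figure 1.00000.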

Exercises 1.13

1. In a fair game of chance, p = q = 1/2.

a. Find the solution of the gambler's ruin problem in this case, that is, find $P_c$.

b. If c = d, what is Pc?

c. If c = 100 and d = 1000, what is Pc? What conclusion can you draw for such a fair game?

CHAPTER II

DIFFERENTIAL EQUATIONS AND THE LAPLACE TRANSFORM

2.1 Introduction

The main aim of this chapter is to develop the method of Laplace Transforms for solving certain linear differential equations. However, first we shall review some of the elementary differential equations studied in calculus.

A differential equation (DE) is an equation involving the derivative of an unknown function. The order of a DE is the order of the highest derivative appearing in the equation. The following are examples of differential equations.

$$\frac{dy}{dx} = x^2, \quad -\infty < x < \infty \tag{1}$$

$$\frac{dy}{dt} = ry, \quad t \geq 0 \tag{2}$$

$$\ddot{x} + 5\dot{x} + 6x = 0, \quad t \geq 0 \tag{3}$$

The first two equations are first order; the third equation is of second order.

By a solution of a DE of the $n$th order we mean a function, defined in some interval $J$, which possesses $n$ derivatives and which satisfies the differential equation identically in $J$. The general solution of a DE is the set of all solutions.

Let us discuss each of the DEs above. One solution of (1) is $y = x^3/3$ since

$$\frac{dy}{dx} = \frac{d}{dx}(x^3/3) \equiv x^2.$$

From calculus we know that $y = x^3/3 + c$, where $c$ is an arbitrary constant, represents all solutions of (1); therefore this is the general solution. Here the interval $J$ is the entire real axis $-\infty < x < \infty$.

For the DE (2), we may use the method of separation of variables.

$$\frac{dy}{y} = r\,dt, \quad \text{or} \quad \ln|y| = rt + c, \quad |y| = e^{rt+c} = e^c e^{rt}, \quad y = ke^{rt}, \quad \text{where } k = \pm e^c.$$

It is easy to verify that $y = ke^{rt}$ is a solution for all $t$, where $k$ is any constant. In fact, this is the general solution.

For the DE (3), we note that since the equation is linear, homogeneous and has constant coefficients, we should look for solutions of the form $e^{\lambda t}$ for an appropriate constant $\lambda$. Substituting into the DE we find $\ddot{x} + 5\dot{x} + 6x = (\lambda^2 + 5\lambda + 6)e^{\lambda t} = 0$. Thus $\lambda^2 + 5\lambda + 6 = 0$. This yields the two values $\lambda = -3$ and $\lambda = -2$, and the two solutions $e^{-3t}$ and $e^{-2t}$. Since the equation is linear and homogeneous, it follows that $x = c_1 e^{-3t} + c_2 e^{-2t}$ is also a solution for arbitrary constants $c_1$ and $c_2$; it can be shown that this is the general solution.

Generally speaking, differential equations have infinitely many solutions, as in the examples above. Notice that the general solutions of the first order DEs (1) and (2) each contain one arbitrary constant and the general solution of the second order DE (3) contains two arbitrary constants. This is the usual situation. In order to obtain a unique solution it is necessary to require that the solution satisfy subsidiary initial or boundary conditions.

Example 1. Find the solution of the DE $\dfrac{dy}{dx} = x^2$ that satisfies the initial condition (IC) $y(0) = 5$.

We know that the general solution of the DE is $y = y(x) = \dfrac{x^3}{3} + c$. Thus $y(0) = 5 = 0 + c$, and $c = 5$. The unique solution to the initial value problem is $y = \dfrac{x^3}{3} + 5$.

Example 2. Solve
DE: $\ddot{x} + 5\dot{x} + 6x = 0$
IC: $x(0) = 2$, $\dot{x}(0) = -5$.

We have seen above that the general solution of the DE is $x = c_1 e^{-3t} + c_2 e^{-2t}$. Thus $\dot{x} = -3c_1 e^{-3t} - 2c_2 e^{-2t}$. Putting in the ICs we obtain

$$\begin{aligned} 2 &= c_1 + c_2 \\ -5 &= -3c_1 - 2c_2 \end{aligned}$$

Solving these we obtain $c_1 = c_2 = 1$ and the solution $x = e^{-3t} + e^{-2t}$.

Exercises 2.1

1. Show that $u = e^{2t}$ is a solution of $\ddot{u} - 4u = 0$.

2. Is $y = e^{2x}$ a solution of $y\dfrac{dy}{dx} + y^2 = e^{4x}$?

3. Find the value of the constant $a$, if any, so that $y = ax^3$ is a solution of
a. $x^2y'' + 6xy' + 5y = 0$  b. $x^2y'' + 6xy' + 5y = x^3$  c. $x^2y'' + 6xy' + 5y = 2x^2$.

4. Find the values of the constant $\lambda$, if any, so that $e^{\lambda x}$ is a solution of
a. $y'' - 4y = 0$  b. $y'' + 4y = 0$

5. a. Solve
DE: $y' = x$
IC: $y(0) = 1$

b. Solve
DE: $y'' - 4y = 0$
IC: $y(0) = 1$, $y'(0) = 2$

2.2 Separation of Variables

A first order differential equation is said to have its variables separated if it is in the form

$$A(x) + B(y)\frac{dy}{dx} = 0 \tag{1}$$

or in the equivalent differential form

$$A(x)\,dx + B(y)\,dy = 0. \tag{2}$$

Here $A(x)$ and $B(y)$ are assumed to be given continuous functions. Suppose $\varphi(x)$ is a solution of DE (1) for $x$ in some interval $J$. Then

$$A(x) + B(\varphi(x))\varphi'(x) \equiv 0 \quad \text{for all } x \in J.$$

This is an identity in $x$ which may be integrated to yield

$$\int A(x)\,dx + \int B(\varphi(x))\varphi'(x)\,dx = c$$

where $c$ is an arbitrary constant. In the second integral we use the substitution $y = \varphi(x)$, $dy = \varphi'(x)\,dx$ to obtain

$$\int A(x)\,dx + \int B(y)\,dy = c. \tag{3}$$

[Figure for Example 2: graph of $x$ against $t$, with $x_0$ and $2x_0$ marked on the $x$-axis and the doubling time $t_d$ marked on the $t$-axis.]

Thus any solution of the DE (1) must satisfy the implicit equation (3). Conversely, assuming that this equation determines $y$ as a differentiable function of $x$, we shall show that it is a solution of the DE. Differentiating (3) we find

$$\frac{d}{dx}\left(\int A(x)\,dx\right) + \frac{d}{dx}\left(\int B(y)\,dy\right) = 0. \tag{4}$$

However

$$\frac{d}{dx}\left(\int A(x)\,dx\right) = A(x), \quad \text{and} \quad \frac{d}{dx}\left(\int B(y)\,dy\right) = \frac{d}{dy}\left(\int B(y)\,dy\right)\frac{dy}{dx} = B(y)\frac{dy}{dx}.$$

Thus (4) becomes $A(x) + B(y)\dfrac{dy}{dx} = 0$, and any function determined by the implicit equation (3) represents a solution to the given DE.

Example 1. Find and check the general solution of

$$\frac{dy}{dx} = \frac{y\cos x}{1 + 2y^2}.$$

Separating variables we find

$$\frac{1 + 2y^2}{y}\,dy = \cos x\,dx$$

which yields

$$\ln y + y^2 = \sin x + c. \tag{i}$$

To verify that this is a solution we use implicit differentiation. If (i) determines $y$ as a differentiable function of $x$ in some interval $J$, then we have

$$\ln y(x) + (y(x))^2 \equiv \sin x + c.$$

Differentiating both sides with respect to $x$ we find

$$\frac{1}{y}\frac{dy}{dx} + 2y\frac{dy}{dx} = \cos x$$

which, when solved for $\dfrac{dy}{dx}$, yields the original DE.

Example 2. Population growth: unrestricted growth.
Let $x(t)$ be the number of individuals in a population at time $t$. Assume that the population has a constant growth rate of $r$ (net births per individual per year). The population satisfies

DE: $\dfrac{dx}{dt} = rx$, $t \geq 0$
IC: $x(0) = x_0$

We may solve the DE by separation of variables.

$$\frac{dx}{x} = r\,dt, \quad \ln x = rt + c, \quad x = e^{rt+c} = e^{rt}e^c = ke^{rt}.$$

Since $x(0) = x_0$, we have $k = x_0$ and thus $x = x_0 e^{rt}$. We see that the population increases exponentially, as shown in the figure. One measure of how fast the population is growing is the doubling time, that is, the time it takes for the population to grow from $x_0$ to $2x_0$. Solving $2x_0 = x_0 e^{rt}$ for $t$ we obtain the doubling time

$$t_d = \frac{\ln 2}{r}.$$

Finally we note that this is the same differential equation as the growth of money under continuous compound interest at the nominal interest rate of $r$.

[Figure for Example 3: graph of $x$ against $t$, with $x_0$, $c/2$ and $c$ marked on the $x$-axis.]

Example 3. Population growth with limited food supply.
Usually, populations of a given species do not continue to grow exponentially but level off due to a limited food supply or other limiting factors. The simplest model that takes this into account is

DE: $\dfrac{dx}{dt} = rx(c - x)$, $r > 0$, $c > 0$
IC: $x(0) = x_0$.

If $x$ is between 0 and $c$ then $\dfrac{dx}{dt}$ is positive so that $x$ is increasing. However when $x$ is close to $c$, $c - x$ is small so that the rate of increase of the population is small. We solve the DE by separating variables:

$$\int \frac{dx}{x(c - x)} = \int r\,dt = rt + k.$$

To integrate the left side, we use partial fractions to obtain

$$\int \frac{dx}{x(c - x)} = \frac{1}{c}\int\left(\frac{1}{x} + \frac{1}{c - x}\right)dx = \frac{1}{c}\ln\frac{x}{c - x}.$$

Therefore

$$\frac{1}{c}\ln\frac{x}{c - x} = rt + k \quad \text{or} \quad \frac{x}{c - x} = Ae^{rct}, \quad \text{where we have set } A = e^{ck}.$$

Solving for $x$ we find

$$x = \frac{Ac}{e^{-rct} + A},$$

where $A$ is an arbitrary constant. Using the initial condition $x(0) = x_0$ we find $A = \dfrac{x_0}{c - x_0}$, and the final solution can be written

$$x(t) = \frac{c}{1 + \left(\dfrac{c}{x_0} - 1\right)e^{-rct}}. \tag{5}$$

A sketch of the solution for $0 < x_0 < c$ is shown in the figure. Note that $x$ continually increases, has a point of inflection when $x$ reaches $c/2$ and approaches $c$ as $t \to \infty$. Thus $c$ is the upper limit to the population. The solution looks somewhat like an S-shaped curve and is often called the logistic curve.

There are three parameters in the solution (5): $x_0$, $r$, and $c$. In order to evaluate these parameters we must know the population at three distinct times. Some years ago Pearl and Reed used this model for the population of the United States using the data below.

Year    Population in Millions    $t$
1790    4                         0
1850    23                        60
1910    92                        120


The final result, where $x$ is measured in millions and 1790 represents $t = 0$, is

$$x(t) = \frac{210}{1 + 51.5\,e^{-0.03t}}. \tag{6}$$

This formula predicts the limiting population to be 210 million, which is below the actual population today. However, the Pearl-Reed formula gave quite accurate predictions until about 1970. This illustrates the fact that such a simple model, which ignores many important phenomena in population growth, should be used with skepticism. It may yield useful results over short time periods, but should not be expected to hold over long periods.
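Formula (6) is easy to evaluate at the tabulated times. The sketch below does this and also locates the inflection point, where $x = c/2$; the function name `pearl_reed` and its parameter names are illustrative, and the constants are the rounded ones printed in (6), so the values need not reproduce the table entries exactly.

```python
import math

def pearl_reed(t, c=210.0, k=51.5, rate=0.03):
    """Logistic formula (6): population in millions, t years after 1790."""
    return c / (1 + k * math.exp(-rate * t))

for year in (1790, 1850, 1910):
    print(year, round(pearl_reed(year - 1790), 1))

# The curve reaches c/2 when k*exp(-rate*t) = 1, i.e. t = ln(k)/rate.
t_inflection = math.log(51.5) / 0.03
print(1790 + t_inflection, pearl_reed(t_inflection))
```

Evaluating the function for large `t` shows the approach to the limiting value of 210 million mentioned above.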

Exercises 2.2

In problems 16 find the general solution, put in as simple form as possible and check.

1. ydy

dx+ x3 = 0 2. x2 +

1 + xy

dy

dx= 0 3. x + yex

dy

dx= 0

4.dx

dy=

yx2 + yx + xy2

5.dv

du=

4uvu2 + 1

6.dy

dx= xy2

For the next two problems, find the solution, put into as simple a form as possible and check.7. x2(1 + y2)dx + 2y dy = 0, y(0) = 1 8. x3(1 + y) dx + 3 dy = 0, y(0) = 1

9. Suppose a population grows according to the law $\frac{dx}{dt} = rx$. If the doubling time is 60 years, what is the birth rate (to 4 decimals)? What is the population after 30 years, if the initial population is 1000?

10. The mass $m(t)$ of a radioactive substance decreases according to the law $\frac{dm}{dt} = -km$. Find the mass at any time if the initial mass is $m_0$. Find the time for the initial mass to be reduced to half the initial amount (the half-life).

11. An island in the Pacific is contaminated by radioactive fallout. If the amount of radioactive material is 100 times that considered safe, and if the half-life of the material is 1620 years, how long will it be before the island is safe?

2.3 Linear Differential Equations

First Order Linear DE

A first order linear DE is one of the form

$$\frac{dx}{dt} + p(t)x = q(t), \tag{1}$$

where $p(t)$ and $q(t)$ are given functions which we assume are continuous. To solve this equation we proceed as follows.

1. Multiply the DE by the integrating factor $e^{\int p(t)\,dt}$ to obtain

$$e^{\int p(t)\,dt}\left(\frac{dx}{dt} + p(t)x\right) = e^{\int p(t)\,dt}\,q(t).$$

2. Rewrite the left hand side as the derivative of the product of $x$ times the integrating factor:

$$\frac{d}{dt}\left(x\,e^{\int p(t)\,dt}\right) = e^{\int p(t)\,dt}\,q(t).$$

(We leave it to the reader to verify that the left hand sides of items 1 and 2 are the same.)

3. Integrate both sides and solve for the solution $x(t)$.

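Example 1. Solve $\frac{dx}{dt} - 2tx = t$. The integrating factor is $e^{\int -2t\,dt} = e^{-t^2}$. Multiplying through by the integrating factor we find

$$e^{-t^2}\left(\frac{dx}{dt} - 2tx\right) = te^{-t^2}, \quad\text{or}\quad \frac{d}{dt}\left(xe^{-t^2}\right) = te^{-t^2}.$$

Thus

$$xe^{-t^2} = \int te^{-t^2}\,dt + c = -\frac{e^{-t^2}}{2} + c,$$

and the final solution is

$$x = -\frac{1}{2} + ce^{t^2}.$$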
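A quick numerical check of Example 1 may be helpful. The sketch below, which is not part of the original text, substitutes $x = -\frac12 + ce^{t^2}$ into $x' - 2tx = t$ for a few arbitrarily chosen values of $c$, approximating $x'$ by a central difference. The names `x` and `residual` and the step size `h` are illustrative assumptions.

```python
import math

def x(t, c):
    """General solution from Example 1: x = -1/2 + c*exp(t^2)."""
    return -0.5 + c * math.exp(t * t)

def residual(t, c, h=1e-5):
    """x' - 2 t x - t, with x' approximated by a central difference."""
    dxdt = (x(t + h, c) - x(t - h, c)) / (2 * h)
    return dxdt - 2 * t * x(t, c) - t

for c in (-1.0, 0.0, 2.5):
    worst = max(abs(residual(k / 10, c)) for k in range(-10, 11))
    print(f"c = {c:5.1f}: largest residual on [-1, 1] = {worst:.2e}")
```

The residuals are at the level of the finite-difference error, which is consistent with the formula being a solution for every constant $c$.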

Example 2. Solve $\frac{dx}{dt} = ax$. Rewriting in the form (1) we have $\frac{dx}{dt} - ax = 0$. The integrating factor is $e^{-at}$. Proceeding as above we see that the general solution is $x = ce^{at}$. This is worthwhile remembering: it is easy to see that one function whose derivative equals $a$ times itself is $e^{at}$, and since the equation is linear and homogeneous, any constant times a solution is a solution.

Second Order Linear DEs

Consider a second order linear homogeneous DE with constant coefficients, that is, one of the form

$$ax'' + bx' + cx = 0, \tag{2}$$

where $a$, $b$ and $c$ are real constants and $a \ne 0$. Recall that if $x_1(t)$ and $x_2(t)$ are two solutions then $c_1x_1(t) + c_2x_2(t)$ is also a solution for arbitrary $c_1$ and $c_2$. Furthermore, if the two solutions are linearly independent (LI) (one is not identically equal to a constant times the other) then $c_1x_1(t) + c_2x_2(t)$ represents the general solution.

To solve the DE (2) we look for solutions of the form $x = e^{\lambda t}$. Substituting into the differential equation we find $(a\lambda^2 + b\lambda + c)e^{\lambda t} \equiv 0$, thus $\lambda$ must satisfy the characteristic equation $a\lambda^2 + b\lambda + c = 0$. There are three cases:

Case 1: Real unequal roots. If the roots of the characteristic equation are $\lambda_1 \ne \lambda_2$, then $e^{\lambda_1 t}$ and $e^{\lambda_2 t}$ are two LI solutions and the general solution is

$$x = c_1e^{\lambda_1 t} + c_2e^{\lambda_2 t}.$$

Case 2: Real equal roots. If the two roots are equal, $\lambda_1 = \lambda_2$, then $e^{\lambda_1 t}$ is the only solution of the assumed form. A second LI solution is $te^{\lambda_1 t}$ and the general solution is

$$x = c_1e^{\lambda_1 t} + c_2te^{\lambda_1 t}.$$

Case 3: Complex conjugate roots. Let the roots be $\lambda_1 = \alpha + i\beta$ and $\lambda_2 = \alpha - i\beta$; then $z_1 = e^{(\alpha + i\beta)t} = e^{\alpha t}e^{i\beta t}$ and $z_2 = e^{(\alpha - i\beta)t} = e^{\alpha t}e^{-i\beta t}$ are two complex valued solutions. We would like real solutions. Recall Euler's forms

$$e^{\pm i\theta} = \cos\theta \pm i\sin\theta$$

or equivalently

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad\text{and}\quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$

Then $x_1 = \frac{z_1 + z_2}{2} = e^{\alpha t}\cos\beta t$ and $x_2 = \frac{z_1 - z_2}{2i} = e^{\alpha t}\sin\beta t$ are two LI real solutions and the general solution is

$$x = c_1e^{\alpha t}\cos\beta t + c_2e^{\alpha t}\sin\beta t.$$

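Example 3. Solve (a) $x'' - 4x' = 0$ (b) $4x'' + 4x' + x = 0$ (c) $x'' + 4x' + 5x = 0$.

(a) Let $x = e^{\lambda t}$ to find $\lambda^2 - 4\lambda = 0$, $\lambda = 0, 4$; the general solution is $x = c_1e^{0t} + c_2e^{4t} = c_1 + c_2e^{4t}$.

(b) Let $x = e^{\lambda t}$ to find $4\lambda^2 + 4\lambda + 1 = 0 = (2\lambda + 1)^2$. Thus $\lambda = -1/2, -1/2$ and the general solution is $x = c_1e^{-t/2} + c_2te^{-t/2}$.

(c) Let $x = e^{\lambda t}$ to find $\lambda^2 + 4\lambda + 5 = 0$. Thus $\lambda = -2 \pm i$ and the general solution is $x = e^{-2t}(c_1\cos t + c_2\sin t)$.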
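The three cases can be sketched in a few lines of code. The function below is an illustration only, not part of the original text: the name `general_solution`, the tolerance `tol`, and the string output format are assumptions made for this example. It classifies the roots of $a\lambda^2 + b\lambda + c = 0$ by the sign of the discriminant and returns the matching form of the general solution.

```python
import cmath

def general_solution(a, b, c, tol=1e-12):
    """Describe the general solution of a x'' + b x' + c x = 0 (a != 0)."""
    disc = b * b - 4 * a * c
    if disc > tol:                       # Case 1: real unequal roots
        r = disc ** 0.5
        l1, l2 = (-b + r) / (2 * a), (-b - r) / (2 * a)
        return f"c1*exp({l1:g}*t) + c2*exp({l2:g}*t)"
    if abs(disc) <= tol:                 # Case 2: real equal roots
        l1 = -b / (2 * a)
        return f"(c1 + c2*t)*exp({l1:g}*t)"
    root = (-b + cmath.sqrt(disc)) / (2 * a)   # Case 3: alpha + i*beta
    alpha, beta = root.real, abs(root.imag)
    return f"exp({alpha:g}*t)*(c1*cos({beta:g}*t) + c2*sin({beta:g}*t))"

# The three parts of Example 3.
print(general_solution(1, -4, 0))
print(general_solution(4, 4, 1))
print(general_solution(1, 4, 5))
```

Comparing the discriminant against a small tolerance rather than exactly zero is a design choice for floating-point coefficients; with exact integer coefficients, as here, it makes no difference.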

Finally we consider the nonhomogeneous DE

$$ax'' + bx' + cx = f(t). \tag{3}$$

Recall that the general solution of (3) is given by

$$x = x_h(t) + x_p(t),$$

where $x_h(t)$ is the general solution of the associated homogeneous equation $ax'' + bx' + cx = 0$, and $x_p(t)$ is any one (or particular) solution of (3). When $f(t)$ is an exponential, a sinusoid, a polynomial or a product of these, one may use undetermined coefficients to find a particular solution. The following examples illustrate this.

Example 4. Solve $x'' + 5x' + 6x = 4e^{-4t}$. We first solve the homogeneous equation to obtain

$$x_h = c_1e^{-2t} + c_2e^{-3t}.$$

Since the right hand side is an exponential we expect a particular solution of the form $x_p = Ae^{-4t}$. Substituting into the DE we find

$$\left((-4)^2A + 5(-4A) + 6A\right)e^{-4t} = 4e^{-4t}.$$

Thus $2A = 4$, $A = 2$, $x_p = 2e^{-4t}$, and the general solution is

$$x = c_1e^{-2t} + c_2e^{-3t} + 2e^{-4t}.$$

Example 5. Solve $x'' + 5x' + 6x = \cos 2t$. For a particular solution we look for a solution of the form

$$x_p = A\cos 2t + B\sin 2t.$$

Substituting into the DE we find

$$x'' + 5x' + 6x = (-4A\cos 2t - 4B\sin 2t) + 5(-2A\sin 2t + 2B\cos 2t) + 6(A\cos 2t + B\sin 2t) = \cos 2t.$$

This simplifies to

$$(2A + 10B)\cos 2t + (2B - 10A)\sin 2t = \cos 2t.$$

Thus $2A + 10B = 1$ and $2B - 10A = 0$. This yields $A = 1/52$, $B = 5/52$, and the general solution is

$$x = c_1e^{-2t} + c_2e^{-3t} + \frac{1}{52}\cos 2t + \frac{5}{52}\sin 2t.$$
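The two linear equations for $A$ and $B$ in Example 5 can be solved exactly with rational arithmetic. The sketch below is illustrative only and not part of the original text; the helper name `solve_2x2` is hypothetical and uses Cramer's rule, which is adequate for a system this small.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 exactly by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular system")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det

# cos 2t terms:  2A + 10B = 1
# sin 2t terms: -10A + 2B = 0
A, B = solve_2x2(2, 10, -10, 2, 1, 0)
print(A, B)   # prints: 1/52 5/52
```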

Exercises 2.3

Find the general solutions.

1.dx

dt 2t = e2t 2. dx

dt x

t= t2 3. 2

dx

dt+ 3x = et/2

4.dy

dx y

x= x2 + 2 5. x

dy

dx+ 2y = ex/2 6. (2 x)dy

dx+ y = 2x x2

7. x + 4x = 0 8. x + 4x + 4x = 0 9. x x = 010. 6x 5x + x = 0 11. x 2x + 5x = 0 12. x x + x = 013. x + 4x + 4x = e3t 14. 6x 5x + x = 2 sin t 15. x + 4x = cos 2t


2.4 The Laplace Transform

The Laplace transform of a function $f(t)$, $0 \le t < \infty$, is defined to be the improper integral

$$\mathcal{L}\{f(t)\} = \hat{f}(s) = \int_0^\infty e^{-st}f(t)\,dt \tag{1}$$

provided this integral converges for at least one value of $s$. We see that functions $f(t)$ are transformed into new functions of $s$, denoted by $\hat{f}(s) = \mathcal{L}\{f(t)\}$. We can consider that equation (1) defines an operator or transformation $\mathcal{L}$ which transforms object functions $f(t)$ into image functions or transforms $\hat{f}(s)$.

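Example 1. Find the Laplace transform of $e^{at}$.

$$\mathcal{L}\{e^{at}\} = \int_0^\infty e^{-st}e^{at}\,dt = \lim_{R\to\infty}\int_0^R e^{-(s-a)t}\,dt = \lim_{R\to\infty}\frac{e^{-(s-a)R} - 1}{-(s-a)}.$$

If $s > a$, $\lim_{R\to\infty} e^{-(s-a)R} = 0$, therefore

$$\mathcal{L}\{e^{at}\} = \frac{1}{s-a}, \quad s > a. \tag{2}$$

Thus the transform of the transcendental function $e^{at}$ is the simpler rational function $\frac{1}{s-a}$. Putting $a = 0$ we find

$$\mathcal{L}\{1\} = \frac{1}{s}, \quad s > 0.$$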

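The operator $\mathcal{L}$ is a linear operator, that is,

$$\mathcal{L}\{af(t) + bg(t)\} = a\mathcal{L}\{f(t)\} + b\mathcal{L}\{g(t)\} = a\hat{f}(s) + b\hat{g}(s) \tag{3}$$

for those values of $s$ for which both $\hat{f}(s)$ and $\hat{g}(s)$ exist. For example, using equation (2),

$$\mathcal{L}\{1 + 3e^{-2t}\} = \mathcal{L}\{1\} + 3\mathcal{L}\{e^{-2t}\} = \frac{1}{s} + \frac{3}{s+2}.$$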

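Example 2.

$$\mathcal{L}\{\cos bt\} = \mathcal{L}\left\{\frac{e^{ibt} + e^{-ibt}}{2}\right\} = \frac{1}{2}\,\frac{1}{s - ib} + \frac{1}{2}\,\frac{1}{s + ib} = \frac{s}{s^2 + b^2},$$

$$\mathcal{L}\{\sin bt\} = \mathcal{L}\left\{\frac{e^{ibt} - e^{-ibt}}{2i}\right\} = \frac{1}{2i}\,\frac{1}{s - ib} - \frac{1}{2i}\,\frac{1}{s + ib} = \frac{b}{s^2 + b^2}.$$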
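The closed forms in Examples 1 and 2 can be compared with a direct numerical evaluation of the defining integral (1). The sketch below is not from the original text: `laplace_numeric` is a hypothetical helper that truncates the integral at an assumed upper limit `T` and applies Simpson's rule with `n` subintervals, so it is only meaningful when $e^{-st}f(t)$ is negligible beyond `T`.

```python
import math

def laplace_numeric(f, s, T=40.0, n=40000):
    """Approximate the integral of exp(-s t) f(t) over [0, T] by Simpson's rule (n even)."""
    h = T / n
    g = lambda t: math.exp(-s * t) * f(t)
    total = g(0.0) + g(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

a, b, s = 1.0, 2.0, 3.0   # arbitrary sample values with s > a
print(laplace_numeric(lambda t: math.exp(a * t), s), 1 / (s - a))
print(laplace_numeric(lambda t: math.cos(b * t), s), s / (s * s + b * b))
print(laplace_numeric(lambda t: math.sin(b * t), s), b / (s * s + b * b))
```

Each line prints the numerical value next to the formula value; the two should agree to many decimal places for these sample parameters.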

To each object function $f(t)$ there exists a unique transform $\hat{f}(s)$, assuming the transform exists. In order to guarantee that to each transform there exists a unique object function, we shall restrict ourselves to continuous object functions. This is the content of the following theorem.

Theorem 1. If $\mathcal{L}\{f(t)\} \equiv \mathcal{L}\{g(t)\}$ and $f(t)$ and $g(t)$ are continuous for $0 \le t < \infty$, then $f(t) \equiv g(t)$.

If $F(s)$ is a given transform and $f(t)$ is a continuous function such that $\mathcal{L}\{f(t)\} = F(s)$, then no other continuous function has $F(s)$ for its transform; in this case we shall write

$$f(t) = \mathcal{L}^{-1}\{F(s)\}$$

and call $f(t)$ the inverse Laplace transform of $F(s)$. For example if $F(s) = 1/(s-a)$ then

$$f(t) = \mathcal{L}^{-1}\left\{\frac{1}{s-a}\right\} = e^{at}. \tag{4}$$

This formula (or equivalently Equation (2)) is worth remembering. Corresponding to the linearity property of $\mathcal{L}$ we have the linearity property of $\mathcal{L}^{-1}$:

$$\mathcal{L}^{-1}\{aF(s) + bG(s)\} = a\mathcal{L}^{-1}\{F(s)\} + b\mathcal{L}^{-1}\{G(s)\}.$$

The Laplace transform of a function will not exist if the function grows too rapidly as $t \to \infty$; for example $\mathcal{L}\{e^{t^2}\}$ does not exist. To describe a useful set of functions for which the transform exists, we need the following definition.

Definition 1. $f(t)$ is of exponential order $\alpha$ if $|f(t)| \le Me^{\alpha t}$, $0 \le t < \infty$, where $M$ and $\alpha$ are constants.

In other words a continuous function is of exponential order if it does not grow more rapidly than an exponential as $t \to \infty$.

Example 3. $\sin bt$ is of exponential order 0, since $|\sin bt| \le 1 = 1 \cdot e^{0t}$.

Example 4. Clearly $e^{at}$ is of exponential order $a$ (take $M = 1$, $\alpha = a$ in the definition).

Example 5. $t^2$ is of exponential order 1, since, from $e^t = 1 + t + t^2/2! + \cdots$, we have $t^2 \le 2!\,e^t$. Similarly $t^n$, $n$ a positive integer, is of exponential order 1.

We now show that continuous functions of exponential order have Laplace transforms.

Theorem 2. If $f(t)$ is continuous and of exponential order $\alpha$, then $\mathcal{L}\{f(t)\}$ exists for $s > \alpha$.

Proof.

$$\left|\int_0^\infty e^{-st}f(t)\,dt\right| \le \int_0^\infty e^{-st}|f(t)|\,dt \le M\int_0^\infty e^{-st}e^{\alpha t}\,dt = M\int_0^\infty e^{-(s-\alpha)t}\,dt.$$

If $s > \alpha$ the last integral converges, therefore the original integral converges absolutely.

It can be shown that if $\hat{f}(s) = \mathcal{L}\{f(t)\}$ exists for some value of $s$, say $s = a$, then it must also exist for $s > a$. The exact interval for which the transform exists is not important. The only important thing is that we know that the transform of a given function exists. Of course, to show that the transform of a given function exists it is necessary to find at least one value of $s$ for which it exists. However, after this has been done, the values of $s$ for which the transform exists can be ignored.

Before proceeding with the development of the properties of Laplace transforms, we indicate how they are used to solve differential equations by means of a simple example.

Example 6. Solve

DE: $\frac{dy}{dt} - 4y = e^t$
IC: $y(0) = 1$

We take the Laplace transform of both sides of the equation, assuming $y$ and $dy/dt$ possess transforms:

$$\mathcal{L}\left\{\frac{dy}{dt} - 4y\right\} = \mathcal{L}\{e^t\} \quad\text{or}\quad \mathcal{L}\left\{\frac{dy}{dt}\right\} - 4\mathcal{L}\{y\} = \frac{1}{s-1},$$

where we have used Equation (2) for the right hand side. Consider $\mathcal{L}\left\{\frac{dy}{dt}\right\} = \int_0^\infty e^{-st}\frac{dy}{dt}\,dt$. Integrating by parts we find

$$\mathcal{L}\left\{\frac{dy}{dt}\right\} = \int_0^\infty e^{-st}\frac{dy}{dt}\,dt = e^{-st}y(t)\Big|_0^\infty + s\int_0^\infty e^{-st}y(t)\,dt = \lim_{R\to\infty}e^{-sR}y(R) - y(0) + s\mathcal{L}\{y(t)\}.$$

In Problem 4 below we show that $\lim_{R\to\infty}e^{-sR}y(R) = 0$, therefore

$$\mathcal{L}\left\{\frac{dy}{dt}\right\} = s\mathcal{L}\{y(t)\} - y(0) = s\hat{y}(s) - y(0) = s\hat{y}(s) - 1.$$

Thus the transformed differential equation now becomes

$$s\hat{y}(s) - 1 - 4\hat{y}(s) = \frac{1}{s-1},$$

or

$$\hat{y}(s) = \frac{1}{s-4} + \frac{1}{(s-1)(s-4)}.$$

In order to find the solution $y(t)$, which is the inverse transform of $\hat{y}(s)$, we replace the second term on the right above with its partial fraction expansion

$$\frac{1}{(s-1)(s-4)} = \frac{1}{3}\,\frac{1}{s-4} - \frac{1}{3}\,\frac{1}{s-1}.$$

Therefore

$$\hat{y}(s) = \frac{4}{3}\,\frac{1}{s-4} - \frac{1}{3}\,\frac{1}{s-1}.$$

Using Equation (4) and the linearity of the inverse transform we find

$$y(t) = \mathcal{L}^{-1}\{\hat{y}(s)\} = \frac{4}{3}e^{4t} - \frac{1}{3}e^t.$$

By direct substitution into the DE, it can be verified that the above is the solution.
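The direct substitution mentioned above can be spot-checked numerically. This is a minimal sketch, not part of the original text: the names `y` and `dy` are illustrative, and the derivative is differentiated by hand from the solution formula rather than computed symbolically.

```python
import math

def y(t):
    """Solution from Example 6: y = (4/3) e^{4t} - (1/3) e^{t}."""
    return 4.0 / 3.0 * math.exp(4 * t) - math.exp(t) / 3.0

def dy(t):
    """Derivative of y, differentiated by hand from the formula above."""
    return 16.0 / 3.0 * math.exp(4 * t) - math.exp(t) / 3.0

print("y(0) =", y(0.0))
for t in (0.0, 0.5, 1.0):
    lhs = dy(t) - 4 * y(t)
    print(f"t = {t}: y' - 4y = {lhs:.10f}, e^t = {math.exp(t):.10f}")
```

Both columns agree at each sample time, and $y(0) = 1$ as the initial condition requires.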

Exercises 2.4

1. Starting from the definition, find $\mathcal{L}\{te^{at}\}$.

2. If $f(t) = 0$, $0 \le t \le 1$, and $f(t) = 1$, $t > 1$, find $\mathcal{L}\{f(t)\}$.

3. Using the method of Example 2, find $\mathcal{L}\{e^{at}\cos bt\}$.

4. If $f(t)$ is of exponential order $\alpha$ ($|f(t)| \le Me^{\alpha t}$), show that $\lim_{R\to\infty}e^{-sR}f(R) = 0$, $s > \alpha$.

5. If $f(t)$ is of exponential order $\alpha$, show that $F(t) = \int_0^t f(\tau)\,d\tau$ is also of exponential order $\alpha$.

6. If $f$ and $g$ are of exponential order $\alpha$, show that

a. $af(t) + bg(t)$ is also of exponential order $\alpha$

b. $f(t)g(t)$ is of exponential order $2\alpha$.

7. Prove that if $f(t)$ is continuous of exponential order and $y$ is a solution of

$$y' + ay = f(t)$$

then $y$ and $y'$ are also continuous and of exponential order and therefore possess Laplace transforms.

8. Prove that if $f(t)$ is continuous of exponential order and $y$ is a solution of

$$ay'' + by' + cy = f(t)$$

then $y$, $y'$, $y''$ are also continuous and of exponential order and therefore possess Laplace transforms.

9. Using Laplace transforms, solve

DE: $x' - 5x = e^{3t} + 4$
IC: $x(0) = 0$


2.5 Properties of the Laplace Transform

We shall develop those properties of Laplace transforms which will permit us to find the transforms and inverse transforms of many functions and to solve linear differential equations with constant coefficients. The first theorem concerns the transform of a derivative; this is the key in the use of transforms to solve differential equations.

Theorem 1. If $f(t)$ and $f'(t)$ are continuous and $f(t)$ is of exponential order $\alpha$, then $\mathcal{L}\{f'(t)\}$ exists for $s > \alpha$ and

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f(t)\} - f(0). \tag{1}$$

Proof. $\mathcal{L}\{f'(t)\} = \int_0^\infty e^{-st}f'(t)\,dt$. Integrating by parts, we obtain

$$\mathcal{L}\{f'(t)\} = e^{-st}f(t)\Big|_0^\infty + s\int_0^\infty e^{-st}f(t)\,dt = \lim_{R\to\infty}\left\{e^{-sR}f(R)\right\} - f(0) + s\mathcal{L}\{f(t)\}.$$

Since $f(t)$ is of exponential order, it follows (Problem 4 of Section 2.4) that $\lim_{R\to\infty}\left\{e^{-sR}f(R)\right\} = 0$, therefore

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f(t)\} - f(0). \tag{2}$$

By repeated applications of Theorem 1 we may find the transforms of derivatives of any order.

Theorem 2. If $f, f', \ldots, f^{(n-1)}$ are continuous and of exponential order and $f^{(n)}$ is continuous, then $\mathcal{L}\{f^{(n)}(t)\}$ exists and

$$\mathcal{L}\{f^{(n)}(t)\} = s^n\hat{f}(s) - s^{n-1}f(0) - s^{n-2}f'(0) - \cdots - f^{(n-1)}(0). \tag{3}$$

In particular for $n = 2$ we have

$$\mathcal{L}\{f''(t)\} = s^2\hat{f}(s) - sf(0) - f'(0). \tag{4}$$

Example 1. Solve

DE: $x'' + 5x' + 6x = 0$
IC: $x(0) = 1$, $x'(0) = 0$

From Problem 8 of Section 2.4 we know $x$, $x'$, $x''$ all possess Laplace transforms. Taking the transform of both sides of the DE, we find

$$s^2\hat{x} - s + 5(s\hat{x} - 1) + 6\hat{x} = 0.$$

Solving for $\hat{x}$, and using partial fractions, we obtain

$$\hat{x} = \frac{s+5}{s^2 + 5s + 6} = \frac{s+5}{(s+3)(s+2)} = \frac{3}{s+2} - \frac{2}{s+3}.$$

Taking inverse Laplace transforms we have

$$x(t) = 3e^{-2t} - 2e^{-3t}.$$

Next we consider the transform of an integral.

Theorem 3. If $f$ is continuous and of exponential order $\alpha$ and $F(t) = \int_0^t f(\tau)\,d\tau$, then $F(t)$ is of exponential order, $\mathcal{L}\left\{\int_0^t f(\tau)\,d\tau\right\}$ exists, and

$$\mathcal{L}\left\{\int_0^t f(\tau)\,d\tau\right\} = \frac{1}{s}\hat{f}(s), \quad s > \alpha.$$

Proof. From Problem 5 of Section 2.4 we know that $F(t)$ is of exponential order. Since $F'(t) = f(t)$ and $F(0) = 0$, we have from (2), $\mathcal{L}\{F'(t)\} = \mathcal{L}\{f(t)\} = s\mathcal{L}\{F(t)\}$, from which the desired result follows.

Example 2. We have seen that $\mathcal{L}\{1\} = 1/s$, thus

$$\mathcal{L}\{t\} = \mathcal{L}\left\{\int_0^t 1\,d\tau\right\} = \frac{1}{s}\mathcal{L}\{1\} = \frac{1}{s^2}.$$

The next theorem tells us what happens to the transform if we multiply the object function by $e^{at}$.

Theorem 4. If $\mathcal{L}\{f(t)\} = \hat{f}(s)$ exists for $s > \alpha$, then $\mathcal{L}\{e^{at}f(t)\}$ exists for $s > \alpha + a$ and

$$\mathcal{L}\{e^{at}f(t)\} = \hat{f}(s - a). \tag{5}$$

Proof. $\mathcal{L}\{e^{at}f(t)\} = \int_0^\infty e^{-st}e^{at}f(t)\,dt = \int_0^\infty e^{-(s-a)t}f(t)\,dt = \hat{f}(s - a)$.

This is sometimes called the first shifting theorem, since multiplication of $f(t)$ by $e^{at}$ has the effect of shifting the transform $\hat{f}(s)$ to the right by $a$ units to get $\hat{f}(s - a)$.

Example 3. Since $\mathcal{L}\{\sin bt\} = \frac{b}{s^2 + b^2}$, we have from (5) that

$$\mathcal{L}\{e^{at}\sin bt\} = \frac{b}{(s-a)^2 + b^2}.$$

Example 4. Since $\mathcal{L}\{t\} = \frac{1}{s^2}$, we have $\mathcal{L}\{te^{at}\} = \frac{1}{(s-a)^2}$.

We now consider what happens to the object function when we differentiate the transform function. We have

$$\frac{d}{ds}\hat{f}(s) = \frac{d}{ds}\int_0^\infty e^{-st}f(t)\,dt.$$

Assuming it is possible to differentiate under the integral sign, we obtain

$$\hat{f}'(s) = \int_0^\infty -te^{-st}f(t)\,dt = \int_0^\infty e^{-st}\{-tf(t)\}\,dt = \mathcal{L}\{-tf(t)\}.$$

The following theorem provides the conditions for which the above results hold.

Theorem 5. If $f(t)$ is continuous and of exponential order $\alpha$, then $\hat{f}(s)$ has derivatives of all orders and for $s > \alpha$

$$\hat{f}'(s) = \mathcal{L}\{-tf(t)\}, \quad \hat{f}''(s) = \mathcal{L}\{t^2f(t)\}, \quad \ldots, \quad \hat{f}^{(n)}(s) = \mathcal{L}\{(-t)^nf(t)\}. \tag{6}$$

Example 5. Since $\mathcal{L}\{e^{at}\} = \frac{1}{s-a}$, we have

$$\mathcal{L}\{te^{at}\} = -\frac{d}{ds}\,\frac{1}{s-a} = \frac{1}{(s-a)^2},$$

$$\mathcal{L}\{t^2e^{at}\} = \frac{d^2}{ds^2}\,\frac{1}{s-a} = \frac{2}{(s-a)^3},$$

$$\vdots$$

$$\mathcal{L}\{t^ne^{at}\} = (-1)^n\frac{d^n}{ds^n}\,\frac{1}{s-a} = \frac{n!}{(s-a)^{n+1}}, \tag{7}$$

where $n$ is a positive integer. In particular we may put $a = 0$ to get

$$\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}. \tag{8}$$

If $F(s)$ is a given function of $s$, we can ask if it is the transform of some function $f(t)$. It turns out that there are certain conditions that must hold for $F(s)$ to be the transform of some function. The following theorem gives us some information.

Theorem 6. If $f$ is continuous and of exponential order $\alpha$, then $\lim_{s\to\infty}\hat{f}(s) = 0$. Furthermore $|s\hat{f}(s)|$ is bounded as $s \to \infty$.

Proof. Since $f$ is of exponential order $\alpha$, we have for $s > \alpha$

$$|f(t)| \le Me^{\alpha t} \quad\text{and}\quad |e^{-st}f(t)| \le Me^{-(s-\alpha)t}.$$

Therefore

$$|\hat{f}(s)| = \left|\int_0^\infty e^{-st}f(t)\,dt\right| \le \int_0^\infty e^{-st}|f(t)|\,dt \le M\int_0^\infty e^{-(s-\alpha)t}\,dt = \frac{M}{s-\alpha},$$

so that $\hat{f}(s) \to 0$ as $s \to \infty$. We also have

$$|s\hat{f}(s)| \le \frac{sM}{s-\alpha} \le 2M$$

if $s$ is big enough; thus $|s\hat{f}(s)|$ is bounded.

From this theorem it follows that the functions $1$, $s$, $\frac{s}{s-1}$ cannot be transforms of continuous functions of exponential order, since they do not approach 0 as $s$ approaches $\infty$. Also $1/\sqrt{s}$ cannot be the transform of a continuous function of exponential order, since $s/\sqrt{s} = \sqrt{s}$ is unbounded as $s \to \infty$.

Exercises 2.5

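1. Using $\mathcal{L}\{\sin bt\} = \frac{b}{s^2 + b^2}$, find $\mathcal{L}\{\cos bt\}$.

2. Using $\mathcal{L}\{\cos bt\} = \frac{s}{s^2 + b^2}$, find $\mathcal{L}\{e^{at}\cos bt\}$.

3. Find $\mathcal{L}\{t\sin 3t\}$ and $\mathcal{L}\{t^2\cos 3t\}$.

4. Find $\mathcal{L}\{te^{at}\sin bt\}$.

5. Find $\mathcal{L}^{-1}\left\{\frac{2}{(s-3)^5}\right\}$.

6. Find $\mathcal{L}^{-1}\left\{\frac{4}{s^2 + 5}\right\}$ and $\mathcal{L}^{-1}\left\{\frac{4s}{s^2 + 5}\right\}$.

7. Find $\mathcal{L}^{-1}\left\{\frac{2s}{s^2 + 2s + 5}\right\}$.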

2.6 Partial Fractions and Use of Table

Solving differential equations using Laplace transforms involves three steps: (1) taking the Laplace transform of both sides of the equation, (2) solving for the transform of the solution, and (3) finding the inverse transform of the result in (2) to get the solution. The first two steps are done without any difficulty, but the third step can be onerous and often involves a partial fraction expansion and a considerable amount of algebra.

It is worthwhile getting familiar with the table on Page 57. The left hand half of the page is convenient for finding transforms, while the right hand half of the page is more convenient for finding inverse transforms. Lines 1-14 give the transforms and inverse transforms of specific functions. Lines 15-18 deal with discontinuous functions and will be studied in Section 2.9. Lines 19-20 give the transforms of $e^{at}f(t)$ and $t^nf(t)$ in terms of the transform of $f(t)$. Lines 21-24 give the transforms of integrals and derivatives, and Line 25 gives the transform of a convolution, which will be discussed in Section 2.8. We start with a couple of simple examples of finding transforms.

Example 1. Find $\mathcal{L}\{e^t\cos 2t\}$.

Looking at Line 10 of the table, we see that $a = 1$ and $b = 2$, thus $\mathcal{L}\{e^t\cos 2t\} = \frac{s-1}{(s-1)^2 + 4}$.

Example 2. Find $\mathcal{L}\{e^{t+2}\}$.

Line 2 contains the transform of $e^t$, but not of $e^{t+2}$. However, noting that $e^{t+2} = e^te^2$, we find, using linearity and Line 2, that

$$\mathcal{L}\{e^{t+2}\} = \mathcal{L}\{e^2e^t\} = e^2\mathcal{L}\{e^t\} = \frac{e^2}{s-1}.$$

We now give some simple examples illustrating the use of the table to find inverse transforms.

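Example 3. Find $\mathcal{L}^{-1}\left\{\frac{s+6}{(s+2)^3}\right\}$.

A simple trick expresses this as a sum of terms that appear in the table:

$$\mathcal{L}^{-1}\left\{\frac{s+6}{(s+2)^3}\right\} = \mathcal{L}^{-1}\left\{\frac{(s+2)+4}{(s+2)^3}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{(s+2)^2} + \frac{4}{(s+2)^3}\right\} = \frac{te^{-2t}}{1!} + 4\,\frac{t^2e^{-2t}}{2!},$$

using Line 8 of the table.

Example 4.

$$\mathcal{L}^{-1}\left\{\frac{2}{(s+4)^3} + \frac{3}{s^4}\right\} = 2\mathcal{L}^{-1}\left\{\frac{1}{(s+4)^3}\right\} + 3\mathcal{L}^{-1}\left\{\frac{1}{s^4}\right\} = 2\,\frac{t^2e^{-4t}}{2!} + 3\,\frac{t^3}{3!} = t^2e^{-4t} + \frac{t^3}{2}.$$

The first step uses the linearity of the inverse transform. The next step uses Line 8 of the table for the first term and Line 2 for the second.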

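Example 5. Find $\mathcal{L}^{-1}\left\{\frac{s}{s^2 + 2s + 5}\right\}$. Note that the denominator does not have real factors. Therefore we complete the square to be able to use Lines 9 or 10 in the table. We have

$$\mathcal{L}^{-1}\left\{\frac{s}{s^2 + 2s + 5}\right\} = \mathcal{L}^{-1}\left\{\frac{s}{(s+1)^2 + 4}\right\} = \mathcal{L}^{-1}\left\{\frac{(s+1) - 1}{(s+1)^2 + 4}\right\}$$

$$= \mathcal{L}^{-1}\left\{\frac{s+1}{(s+1)^2 + 4}\right\} - \mathcal{L}^{-1}\left\{\frac{1}{(s+1)^2 + 4}\right\} = e^{-t}\cos 2t - \frac{1}{2}e^{-t}\sin 2t.$$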

It is often necessary to find the inverse transform of a rational function $p(s)/q(s)$ where $p$, $q$ are polynomials and the degree of the numerator is less than the degree of the denominator. This is done by using partial fractions. The first step is to factor the denominator into real linear or irreducible quadratic factors:

$$\frac{p(s)}{q(s)} = \frac{p(s)}{c_0(s-a_1)^{m_1}\cdots(s-a_k)^{m_k}(s^2 + b_1s + c_1)^{n_1}\cdots(s^2 + b_ls + c_l)^{n_l}}. \tag{1}$$

The partial fraction expansion of Equation (1) consists of a sum of terms obtained as follows.

1. For each factor $(s-a)^m$ in (1), we have the terms

$$\frac{A_1}{s-a} + \frac{A_2}{(s-a)^2} + \cdots + \frac{A_m}{(s-a)^m}.$$

2. For each factor $(s^2 + bs + c)^n$ in (1), we have the terms

$$\frac{B_1 + C_1s}{s^2 + bs + c} + \frac{B_2 + C_2s}{(s^2 + bs + c)^2} + \cdots + \frac{B_n + C_ns}{(s^2 + bs + c)^n}.$$

The coefficients $A_i$, $B_i$, $C_i$ may be obtained by multiplying both sides by $q(s)$ and equating coefficients of like powers of $s$.

Example 6. Find the partial fraction expansion of $\frac{s^2 + 1}{s^2(s^2 + 5s + 6)}$.

First we factor the denominator:

$$\frac{s^2 + 1}{s^2(s^2 + 5s + 6)} = \frac{s^2 + 1}{s^2(s+3)(s+2)} = \frac{A}{s} + \frac{B}{s^2} + \frac{C}{s+3} + \frac{D}{s+2}.$$

Clearing fractions,

$$s^2 + 1 = As(s+3)(s+2) + B(s+3)(s+2) + Cs^2(s+2) + Ds^2(s+3) \tag{$*$}$$

$$= s^3(A + C + D) + s^2(5A + B + 2C + 3D) + s(6A + 5B) + 6B.$$

Equating coefficients of like powers,

$$A + C + D = 0, \quad 5A + B + 2C + 3D = 1, \quad 6A + 5B = 0, \quad 6B = 1.$$

Solving this system yields $A = -\frac{5}{36}$, $B = \frac{1}{6}$, $C = -\frac{10}{9}$, $D = \frac{5}{4}$. Thus

$$\frac{s^2 + 1}{s^2(s^2 + 5s + 6)} = -\frac{5}{36}\,\frac{1}{s} + \frac{1}{6}\,\frac{1}{s^2} - \frac{10}{9}\,\frac{1}{s+3} + \frac{5}{4}\,\frac{1}{s+2}.$$

If the inverse transform is desired, it can now be easily obtained from the table.

Perhaps a better way of obtaining the coefficients is to substitute strategically chosen values of $s$ into ($*$). Putting $s = 0$ yields $B = \frac{1}{6}$, $s = -3$ yields $C = -\frac{10}{9}$, and $s = -2$ produces $D = \frac{5}{4}$. This leaves only $A$ to determine. Substituting some small value, say $s = 1$, we find, after a little arithmetic, $A = -\frac{5}{36}$.
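The substitution method of Example 6 can be reproduced with exact rational arithmetic. The sketch below is illustrative and not part of the original text; the helper names `original` and `expansion` are hypothetical. It computes $B$, $C$, $D$ from the three strategic substitutions, obtains $A$ from $s = 1$, and then compares both sides of the expansion at several other integer points.

```python
from fractions import Fraction as Fr

# Substitutions into (*):
#   s^2 + 1 = A s(s+3)(s+2) + B(s+3)(s+2) + C s^2(s+2) + D s^2(s+3)
B = Fr(0**2 + 1, (0 + 3) * (0 + 2))            # s = 0 kills the A, C, D terms
C = Fr((-3)**2 + 1, (-3)**2 * (-3 + 2))        # s = -3 kills the A, B, D terms
D = Fr((-2)**2 + 1, (-2)**2 * (-2 + 3))        # s = -2 kills the A, B, C terms
A = (2 - 12 * B - 3 * C - 4 * D) / 12          # s = 1: 2 = 12A + 12B + 3C + 4D

def original(s):
    return Fr(s * s + 1, s * s * (s + 3) * (s + 2))

def expansion(s):
    return A / s + B / (s * s) + C / (s + 3) + D / (s + 2)

print(A, B, C, D)
print(all(original(s) == expansion(s) for s in range(1, 8)))
```

Since both sides are rational functions with the same denominator and numerators of degree at most three, agreement at more than four points is enough to confirm the identity.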

Example 7. Find the inverse transform of $\frac{s+2}{(s-2)(s^2+2)^2}$.

The proper form for the partial fraction expansion is

$$\frac{s+2}{(s-2)(s^2+2)^2} = \frac{A}{s-2} + \frac{Bs + C}{s^2+2} + \frac{Ds + E}{(s^2+2)^2}.$$

Cross-multiplying, we get

$$s + 2 = A(s^2+2)^2 + (Bs + C)(s-2)(s^2+2) + (Ds + E)(s-2) \tag{$*$}$$

$$= s^4(A + B) + s^3(-2B + C) + s^2(4A + 2B - 2C + D) + s(-4B + 2C - 2D + E) + (4A - 4C - 2E).$$

Thus $A + B = 0$, $-2B + C = 0$, $4A + 2B - 2C + D = 0$, $-4B + 2C - 2D + E = 1$, $4A - 4C - 2E = 2$. Note that we can obtain $A$ by setting $s = 2$ in ($*$), yielding $A = 1/9$. The other coefficients are now easily obtained. The results are $B = -1/9$, $C = -2/9$, $D = -2/3$, $E = -1/3$, and the expansion is

$$\frac{s+2}{(s-2)(s^2+2)^2} = \frac{1}{9}\,\frac{1}{s-2} - \frac{1}{9}\,\frac{s+2}{s^2+2} - \frac{1}{3}\,\frac{2s+1}{(s^2+2)^2}.$$

For the inverse transform we use Line 3 of the table for the first term, Lines 3, 4 for the second term and Lines 13, 14 for the last term. The result is

$$\mathcal{L}^{-1}\left\{\frac{s+2}{(s-2)(s^2+2)^2}\right\} = \frac{1}{9}e^{2t} - \frac{1}{9}\cos\sqrt{2}\,t - \frac{\sqrt{2}}{9}\sin\sqrt{2}\,t - \frac{1}{3\sqrt{2}}\,t\sin\sqrt{2}\,t - \frac{1}{12\sqrt{2}}\left(\sin\sqrt{2}\,t - \sqrt{2}\,t\cos\sqrt{2}\,t\right).$$

Example 8. Find the inverse transform of $\frac{5s+3}{(s-1)(s^2+2s+5)}$.

$$\frac{5s+3}{(s-1)(s^2+2s+5)} = \frac{A}{s-1} + \frac{Bs + C}{s^2+2s+5}.$$

Clearing fractions, we write

$$5s + 3 = A(s^2 + 2s + 5) + (Bs + C)(s - 1) \tag{$*$}$$

$$= s^2(A + B) + s(2A - B + C) + (5A - C).$$

Thus $A + B = 0$, $2A - B + C = 5$, $5A - C = 3$. Putting $s = 1$ in ($*$) we find $A = 1$; we then find $B = -1$, $C = 2$. The partial fraction expansion is

$$\frac{5s+3}{(s-1)(s^2+2s+5)} = \frac{1}{s-1} - \frac{s-2}{s^2+2s+5} = \frac{1}{s-1} - \frac{s-2}{(s+1)^2+4}$$

$$= \frac{1}{s-1} - \frac{(s+1) - 3}{(s+1)^2+4} = \frac{1}{s-1} - \frac{s+1}{(s+1)^2+4} + \frac{3}{(s+1)^2+4}.$$

The inverse transform is

$$\mathcal{L}^{-1}\left\{\frac{5s+3}{(s-1)(s^2+2s+5)}\right\} = e^t - e^{-t}\cos 2t + \frac{3}{2}e^{-t}\sin 2t.$$

Exercises 2.6

Find transforms of1. (t 1)2. 2. sin(2t 1). 3. t

e2t.

Find the inverse transforms of4.

1s2 3s + 2 . 5.

1s(s + 1)2

. 6.1

(s + 2)(s2 + 9).

7.3s

(2s 4)2 8.s

s2 4s + 6 9.3s

2s2 2s + 1

10.1

s(s2 4s + 13) . 11.a2

s(s2 + a2). 12.

s

(s + 2)2(s2 + 9).


13. Find the correct form (do not evaluate the coefficients) for the partial fraction expansion of

$$\frac{3s^2 - 5s + 2}{s^2(s-3)^3(s^2 - 2s + 3)^3}.$$

2.7 Solution of Differential Equations

As we have seen, Laplace transforms may be used to solve initial value problems for linear differential equations with constant coefficients. The second order case is

DE: $ay'' + by' + cy = f(t)$
IC: $y(0) = \alpha$, $y'(0) = \beta$

where $f(t)$ is of exponential order. The procedure consists of three steps:

(1) Take the Laplace transform of both sides of the equation.
(2) Solve for $\hat{y}$, the Laplace transform of the solution.
(3) Find the inverse transform of $\hat{y}$ to find the solution.

We illustrate the procedure with several examples.

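Example 1. Solve

DE: $y'' + 5y' + 6y = e^{7t}$
IC: $y(0) = 0$, $y'(0) = 2$

Taking transforms and solving for $\hat{y}$, we obtain

$$\hat{y} = \frac{2}{s^2 + 5s + 6} + \frac{1}{(s-7)(s^2 + 5s + 6)} = \frac{2}{(s+3)(s+2)} + \frac{1}{(s-7)(s+3)(s+2)}.$$

The inverse transform of the first term appears directly in the table:

$$\mathcal{L}^{-1}\left\{\frac{2}{(s+3)(s+2)}\right\} = 2\left(e^{-2t} - e^{-3t}\right). \tag{i}$$

We split the second term into its partial fraction expansion

$$\frac{1}{(s-7)(s+3)(s+2)} = \frac{A}{s-7} + \frac{B}{s+3} + \frac{C}{s+2}.$$

Clearing fractions, we find

$$1 = (s+3)(s+2)A + (s-7)(s+2)B + (s-7)(s+3)C.$$

Setting $s = 7$, we obtain $A = 1/90$; setting $s = -3$, we find $B = 1/10$; and setting $s = -2$, $C = -1/9$. Therefore

$$\frac{1}{(s-7)(s+3)(s+2)} = \frac{1}{90}\,\frac{1}{s-7} + \frac{1}{10}\,\frac{1}{s+3} - \frac{1}{9}\,\frac{1}{s+2}$$

and

$$\mathcal{L}^{-1}\left\{\frac{1}{(s-7)(s+3)(s+2)}\right\} = \frac{1}{90}e^{7t} + \frac{1}{10}e^{-3t} - \frac{1}{9}e^{-2t}. \tag{ii}$$

The solution $y(t)$ is the sum of (i) and (ii):

$$y(t) = \frac{1}{90}e^{7t} - \frac{19}{10}e^{-3t} + \frac{17}{9}e^{-2t}.$$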
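Because the solution of Example 1 is a sum of exponentials, it can be checked exactly. The sketch below is an illustration, not part of the original text; the names `terms` and `char_poly` are hypothetical. It adds (i) and (ii) term by term and uses the fact that substituting $ce^{rt}$ into $y'' + 5y' + 6y$ gives $c(r^2 + 5r + 6)e^{rt}$.

```python
from fractions import Fraction as Fr

# y(t) = sum of coeff * exp(rate * t): (ii) plus (i) = 2 e^{-2t} - 2 e^{-3t}.
terms = {7: Fr(1, 90), -3: Fr(1, 10) - 2, -2: 2 - Fr(1, 9)}

def char_poly(r):
    """r^2 + 5r + 6, the factor produced when exp(r t) is put into y'' + 5y' + 6y."""
    return r * r + 5 * r + 6

y0 = sum(terms.values())                      # y(0)
dy0 = sum(r * c for r, c in terms.items())    # y'(0)
forcing = {r: c * char_poly(r) for r, c in terms.items()}

print("coefficients:", terms)
print("y(0) =", y0, " y'(0) =", dy0)
print("y'' + 5y' + 6y has coefficients", forcing)
```

The forcing dictionary shows a coefficient of 1 on $e^{7t}$ and 0 on the two homogeneous exponentials, so the left side of the DE reduces to $e^{7t}$, and both initial conditions are met.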


Example 2. Solve

DE: $y'' + y = 3$
IC: $y(0) = 1$, $y'(0) = 2$

Taking transforms we have $(s^2\hat{y} - s - 2) + \hat{y} = 3/s$. Solving for $\hat{y}$ yields

$$\hat{y} = \frac{3}{s(s^2+1)} + \frac{s+2}{s^2+1}.$$

Splitting the first term into partial fractions produces

$$\frac{3}{s(s^2+1)} = \frac{A}{s} + \frac{Bs + C}{s^2+1},$$

or

$$3 = A(s^2+1) + (Bs + C)s.$$

Equating coefficients of like powers of $s$ on both sides, we find $A = 3$, $B = -3$, $C = 0$. Therefore

$$\hat{y} = \frac{3}{s} - \frac{2s}{s^2+1} + \frac{2}{s^2+1}.$$

From the table we find

$$y(t) = 3 - 2\cos t + 2\sin t.$$

Example 3. Solve

DE: $y'' + y = e^t\sin 2t$
IC: $y(0) = 1$, $y'(0) = 3$.

Taking transforms, we find

$$(s^2\hat{y} - s - 3) + \hat{y} = \frac{2}{(s-1)^2 + 4},$$

$$\hat{y} = \frac{s+3}{s^2+1} + \frac{2}{(s^2+1)[(s-1)^2 + 4]}.$$

We must find the partial fraction expansion for the second term:

$$\frac{2}{(s^2+1)[(s-1)^2 + 4]} = \frac{As + B}{s^2+1} + \frac{Cs + D}{(s-1)^2 + 4},$$

or

$$2 = (As + B)[(s-1)^2 + 4] + (Cs + D)(s^2+1).$$

Equating coefficients of like powers of $s$ on both sides, we find $A = 1/5$, $B = 2/5$, $C = -1/5$, $D = 0$. Therefore

$$\hat{y} = \frac{6}{5}\,\frac{s}{s^2+1} + \frac{17}{5}\,\frac{1}{s^2+1} - \frac{1}{5}\,\frac{(s-1) + 1}{(s-1)^2 + 4}.$$

From the table we obtain the solution

$$y(t) = \frac{6}{5}\cos t + \frac{17}{5}\sin t - \frac{1}{10}e^t\sin 2t - \frac{1}{5}e^t\cos 2t.$$

Exercises 2.7

Solve using Laplace Transforms.1. y + 3y + 2y = t, y(0) = 1, y(0) = 1. 2. y + 4y + 4y = e2t, y(0) = y(0) = 0.3. 2y 2y + y = 0, y(0) = 0, y(0) = 1 4. y + 5y + 6y = e2t, y(0) = 1, y(0) = 05. y + 2y + 2y = sin t, y(0) = 0, y(0) = 1. 6. y y = eat, y(0) = 1, for a (= 1 and a = 1.7. y + y = 0, y(0) = 1, y(0) = y(0) = 0. 8. yiv y = 0, y(0) = y(0) = y(0) = 0, y(0) = 1.9. y 4y + 5y = e2t cos t, y(0) = y(0) = 1. 10. Find the general solution of y + y = t.


11. Solve the harmonic oscillator equation $mx'' + kx = F(t)$, $x(0) = x_0$, $x'(0) = v_0$ for the cases
a. $F(t) \equiv 0$ b. $F(t) \equiv 1$ c. $F(t) = F_0\cos\omega t$, $\omega \ne \sqrt{k/m}$ d. $F(t) = F_0\sin\omega t$, $\omega = \sqrt{k/m}$.

2.8 Product of Transforms; Convolutions

When Laplace transforms are used to solve differential equations it is often necessary to find the inverse transform of the product of two transform functions $\hat{f}(s)\hat{g}(s)$. If $\hat{f}(s)$ and $\hat{g}(s)$ are both rational functions, the inverse transform may be found by the method of partial fractions. However, it is useful to know whether or not the product of transform functions is itself a transform function and, if so, what its inverse transform is.

Let $f(t)$ and $g(t)$ be continuous functions that possess the transforms $\hat{f}(s)$ and $\hat{g}(s)$, respectively. Let $h(t)$ be a continuous function such that

$$\mathcal{L}\{h(t)\} = \hat{f}(s)\hat{g}(s) = \int_0^\infty e^{-su}f(u)\,du\int_0^\infty e^{-sx}g(x)\,dx, \tag{1}$$

if such a function exists. We assume that the right hand side can be written as an iterated integral

$$\mathcal{L}\{h(t)\} = \int_0^\infty\left[\int_0^\infty e^{-s(u+x)}f(u)g(x)\,du\right]dx.$$

Making the transformation $u + x = t$, $x = x$, we obtain

$$\mathcal{L}\{h(t)\} = \int_0^\infty\left[\int_x^\infty e^{-st}f(t-x)g(x)\,dt\right]dx.$$

[Figure 1: the $t$-$x$ plane with the line $t = x$]

Assuming it is possible to change the order of integration, we find (see Figure 1)

$$\mathcal{L}\{h(t)\} = \int_0^\infty e^{-st}\left[\int_0^t f(t-x)g(x)\,dx\right]dt.$$

From the uniqueness theorem we obtain

$$h(t) = \int_0^t f(t-x)g(x)\,dx.$$

The expression on the right is called the convolution of $f(t)$ and $g(t)$ and is symbolized by $f(t) * g(t)$, that is,

$$f(t) * g(t) = \int_0^t f(t-x)g(x)\,dx.$$

Thus we have that the multiplication of transform functions $\hat{f}(s)\hat{g}(s)$ corresponds to the convolution of the object functions $f(t) * g(t)$.

The above derivation was purely formal. However, in more advanced treatments, the following theorem can be proven.

Theorem 1. If $f$ and $g$ are continuous and of exponential order, then $f(t) * g(t) = \int_0^t f(t-x)g(x)\,dx$ exists and

$$\mathcal{L}\{f(t) * g(t)\} = \mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} = \hat{f}(s)\hat{g}(s), \tag{2}$$

or in terms of inverse transforms

$$\mathcal{L}^{-1}\{\hat{f}(s)\hat{g}(s)\} = f(t) * g(t) = \int_0^t f(t-x)g(x)\,dx. \tag{3}$$

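Example 1. Verify Equation (2) for $f(t) = \sin t$, $g(t) = 1$.

According to Equation (2) we have

$$\mathcal{L}\{\sin t * 1\} = \mathcal{L}\{\sin t\}\,\mathcal{L}\{1\} = \frac{1}{s^2+1}\cdot\frac{1}{s}.$$

Now $(\sin t) * 1 = \int_0^t \sin(t-x)\cdot 1\,dx = 1 - \cos t$. Thus, we also have

$$\mathcal{L}\{\sin t * 1\} = \mathcal{L}\{1 - \cos t\} = \frac{1}{s} - \frac{s}{s^2+1} = \frac{1}{s(s^2+1)}.$$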

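The inverse form of the convolution given in Equation (3) is very helpful in finding inverse transformations. The following two examples show that it may be simpler than partial fractions.

Example 2. Find $\mathcal{L}^{-1}\left\{\frac{1}{s(s^2+a^2)}\right\}$. We have

$$\mathcal{L}^{-1}\left\{\frac{1}{s(s^2+a^2)}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s}\cdot\frac{1}{s^2+a^2}\right\} = 1 * \left(\frac{1}{a}\sin at\right) = \int_0^t (1)\left(\frac{1}{a}\sin ax\right)dx = \frac{1}{a^2}(1 - \cos at).$$

Example 3. Find $\mathcal{L}^{-1}\left\{\frac{1}{(s^2+a^2)^2}\right\}$.

We have

$$\mathcal{L}^{-1}\left\{\frac{1}{(s^2+a^2)^2}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s^2+a^2}\cdot\frac{1}{s^2+a^2}\right\} = \frac{1}{a}\sin at * \frac{1}{a}\sin at$$

$$= \int_0^t \frac{1}{a}\sin a(t-x)\cdot\frac{1}{a}\sin ax\,dx = \frac{1}{2a^3}(\sin at - at\cos at).$$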
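The two convolution integrals in Examples 2 and 3 can be checked numerically. The sketch below is illustrative and not part of the original text: `convolve` is a hypothetical helper that evaluates $\int_0^t f(t-x)g(x)\,dx$ by Simpson's rule, and the value $a = 3$ is an arbitrary choice.

```python
import math

def convolve(f, g, t, n=2000):
    """(f * g)(t) = integral of f(t - x) g(x) over 0 <= x <= t, by Simpson's rule (n even)."""
    if t == 0:
        return 0.0
    h = t / n
    total = f(t) * g(0.0) + f(0.0) * g(t)      # integrand at x = 0 and x = t
    for k in range(1, n):
        x = k * h
        total += (4 if k % 2 else 2) * f(t - x) * g(x)
    return total * h / 3

a = 3.0
kernel = lambda u: math.sin(a * u) / a         # inverse transform of 1/(s^2 + a^2)
one = lambda u: 1.0                            # inverse transform of 1/s

for t in (0.5, 1.0, 2.0):
    ex2 = (1 - math.cos(a * t)) / a**2
    ex3 = (math.sin(a * t) - a * t * math.cos(a * t)) / (2 * a**3)
    print(t, convolve(one, kernel, t), ex2, convolve(kernel, kernel, t), ex3)
```

In each printed row the numerical convolution sits next to the closed form from the corresponding example.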


The notation $f * g$ for convolution suggests that convolution can be thought of as a new kind of multiplication of functions. In fact, convolution has some properties similar to ordinary multiplication:

$f * g = g * f$, commutative law
$(f * g) * h = f * (g * h)$, associative law
$f * (g + h) = f * g + f * h$, distributive law
$(cf) * g = c(f * g) = c(g * f)$, $c$ = constant

where $f$, $g$, $h$ are assumed to possess transforms. These properties are easily proven using Equation (2). In particular the commutative law above states that $f * g$ can be written in either of the equivalent forms

$$f * g = \int_0^t f(t-x)g(x)\,dx = \int_0^t g(t-x)f(x)\,dx.$$

Convolution is really necessary when we deal with differential equations with arbitrary forcing terms.

Example 4. Solve

DE: $y'' + y = f(t)$
IC: $y(0) = y'(0) = 0$

where $f(t)$ is assumed to have a Laplace transform. Taking transforms we have

$$(s^2+1)\hat{y}(s) = \hat{f}(s), \quad\text{or}\quad \hat{y}(s) = \frac{1}{s^2+1}\hat{f}(s).$$

Therefore

$$y = \sin t * f(t) = \int_0^t \sin(t-x)f(x)\,dx,$$

or, since convolution is commutative, we have the alternative form of the solution

$$y = f(t) * \sin t = \int_0^t f(t-x)\sin x\,dx.$$

Finally we illustrate how to solve certain special types of integral equations where the unknown function is under the integral sign.

Example 5. Solve $y(t) = t^2 + \int_0^t y(\tau)\sin(t-\tau)\,d\tau$.

Since the term with the integral is a convolution we may rewrite this equation as

$$y(t) = t^2 + y(t) * \sin t.$$

Assuming $y$ has a transform, we take the transform of both sides to get

$$\hat{y}(s) = \frac{2}{s^3} + \hat{y}(s)\frac{1}{s^2+1}, \quad\text{or}\quad \hat{y}(s) = \frac{2}{s^3} + \frac{2}{s^5},$$

therefore

$$y(t) = t^2 + \frac{1}{12}t^4.$$

To see that this is the solution, we substitute into the integral equation and find that, indeed, it does satisfy the equation.
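The substitution into the integral equation can also be carried out numerically. The sketch below is an illustration only, not part of the original text; `rhs` is a hypothetical helper that evaluates the right side of the integral equation with Simpson's rule for the candidate $y(t) = t^2 + t^4/12$.

```python
import math

def y(t):
    """Candidate solution from Example 5."""
    return t**2 + t**4 / 12

def rhs(t, n=2000):
    """t^2 + integral of y(tau) sin(t - tau) over 0 <= tau <= t, by Simpson's rule (n even)."""
    if t == 0:
        return 0.0
    h = t / n
    total = y(0.0) * math.sin(t) + y(t) * math.sin(0.0)   # integrand at tau = 0 and tau = t
    for k in range(1, n):
        tau = k * h
        total += (4 if k % 2 else 2) * y(tau) * math.sin(t - tau)
    return t**2 + total * h / 3

for t in (0.5, 1.0, 2.0, 3.0):
    print(f"t = {t}: y = {y(t):.10f}, right side = {rhs(t):.10f}")
```

The two columns agree at each sample time, which is what the integral equation requires.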

Exercises 2.8

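1. Using convolutions find:

a. $\mathcal{L}^{-1}\left\{\frac{1}{s(s-2)}\right\}$ b. $\mathcal{L}^{-1}\left\{\frac{1}{s^2(s-2)}\right\}$

c. $\mathcal{L}^{-1}\left\{\frac{s^2}{(s^2+4)^2}\right\}$ d. $\mathcal{L}^{-1}\left\{\frac{1}{s^2(s^2+2)}\right\}$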


2. Solve DE: y y = f(t)IC: y(0) = y(0) = 0

3. Solve the following integral equations

a. $f(t) = 1 + \int_0^t f(\tau)\sin(t-\tau)\,d\tau$ b. $f(t) = \sin t + \int_0^t f(\tau)\cos(t-\tau)\,d\tau$

2.9 Discontinuous Forcing Functions

In the preceding sections we assumed, for simplicity, that the functions considered were continuous in $0 \le t < \infty$. There is no particular difficulty in taking Laplace transforms of functions with finite or jump discontinuities, and the transform method is very efficient in solving linear differential equations with a discontinuous forcing term. Such discontinuous forcing terms appear often in practical applications. The simple matter of throwing a switch in an electrical circuit causes the voltage to jump, practically instantaneously, from one value to another. We shall generalize our treatment of transforms to piecewise continuous functions, defined below.

Figure 1

Definition 1. f(t) is said to be piecewise continuous on every finite interval, if, for every A, f(t)is continuous on 0 t A except for a finite number of points ti, at which f(t) posseses a right- andleft-hand limit. The difference between the right-hand and left-hand limits, f(ti +0)f(ti0) is calledthe jump of f(t) at t = ti.

A graph of a typical piecewise continuous function is shown in Figure 1. A very simple and very useful discontinuous function is the Heaviside function or unit step function shown in Figure 2 and defined by

$$H(t - t_0) = \begin{cases} 0, & t < t_0 \\ 1, & t > t_0. \end{cases} \tag{1}$$

This function is continuous except at $t = t_0$ where it has a jump of 1. It is easy to compute the Laplace transform of $H(t - t_0)$:

$$\mathcal{L}\{H(t - t_0)\} = \int_0^\infty H(t - t_0)\,e^{-st}\,dt = \int_{t_0}^\infty e^{-st}\,dt = \frac{e^{-st_0}}{s}.$$

We can use the Heaviside function to express discontinuous functions in a convenient form. Consider the function defined by different analytic expressions in different intervals, as shown in Figure 3, i.e.,

$$F(t) = \begin{cases} f(t), & t < t_0 \\ g(t), & t > t_0. \end{cases}$$

Figure 2 [graph of $H(t - t_0)$]

Figure 3 [graph of $F(t)$: $f(t)$ for $t < t_0$, $g(t)$ for $t > t_0$]

This can be expressed in terms of the Heaviside function as follows

$$F(t) = f(t) + (g(t) - f(t))\,H(t - t_0). \tag{2}$$

Equation (2) can be thought of as starting with $f(t)$ and then, at $t = t_0$, switching on $g(t)$ and switching off $f(t)$, or jumping to $g(t)$ from $f(t)$. This can be extended to any number of jumps. For example

$$G(t) = \begin{cases} f(t), & t < t_0 \\ g(t), & t_0 < t < t_1 \\ h(t), & t > t_1 \end{cases} \tag{3}$$

can be expressed as

$$G(t) = f(t) + (g(t) - f(t))\,H(t - t_0) + (h(t) - g(t))\,H(t - t_1). \tag{4}$$

Example 1. Write the function in Figure 4 in terms of Heaviside functions. First we write the function in the usual way

$$f(t) = \begin{cases} t, & 0 \le t \le 1 \\ -t + 2, & 1 \le t \le 2 \\ 0, & t \ge 2. \end{cases}$$

Now, using the method illustrated in Equation (4), we have

$$\begin{aligned} f(t) &= t + (-t + 2 - t)\,H(t - 1) + (0 - (-t + 2))\,H(t - 2) \\ &= t + (2 - 2t)\,H(t - 1) + (t - 2)\,H(t - 2). \end{aligned}$$

Figure 4
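As a quick sanity check, the piecewise form and the Heaviside form of this example can be compared at sample points. This is an illustrative sketch only; the function names are hypothetical, and the value of the step function at the jump itself (taken as 1 here) is an arbitrary choice that the definition leaves open.

```python
def H(t, t0=0.0):
    """Unit step H(t - t0): 0 for t < t0, 1 for t > t0 (value at t0 chosen as 1)."""
    return 1.0 if t >= t0 else 0.0

def f_piecewise(t):
    """The function of Example 1 written branch by branch."""
    if t <= 1:
        return t
    if t <= 2:
        return -t + 2
    return 0.0

def f_heaviside(t):
    """The same function written with unit steps, as in Equation (4)."""
    return t + (2 - 2 * t) * H(t, 1) + (t - 2) * H(t, 2)

samples = [k / 10 for k in range(0, 41)]
max_diff = max(abs(f_piecewise(t) - f_heaviside(t)) for t in samples)
print(max_diff)
```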

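To find the Laplace transform of a function such as those defined in equations (2) or (4), we need to find the Laplace transform of a function of the form $f(t)H(t - t_0)$. The answer is given in the following theorem.

Theorem 1. If $\mathcal{L}\{f(t)\}$ exists then $\mathcal{L}\{f(t)H(t - t_0)\}$ exists and is given by

$$\mathcal{L}\{f(t)H(t - t_0)\} = e^{-t_0 s}\,\mathcal{L}\{f(t + t_0)\}. \tag{5}$$

Proof. We have

$$\mathcal{L}\{f(t)H(t - t_0)\} = \int_0^\infty e^{-st} f(t) H(t - t_0)\,dt = \int_{t_0}^\infty e^{-st} f(t)\,dt.$$

Letting $t = t_0 + \tau$, we find

$$\int_0^\infty e^{-s(\tau + t_0)} f(\tau + t_0)\,d\tau = e^{-st_0}\int_0^\infty e^{-s\tau} f(\tau + t_0)\,d\tau = e^{-st_0}\,\mathcal{L}\{f(t + t_0)\}.$$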

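Example 2. Find the Laplace transform of

$$f(t) = \begin{cases} t, & 0 < t < 2 \\ 1, & t > 2. \end{cases}$$

In terms of the Heaviside function, we have

$$f(t) = t + (1 - t)\,H(t - 2)$$

and, since $\mathcal{L}\{t\} = 1/s^2$, according to Equation (5)

$$\mathcal{L}\{f(t)\} = \frac{1}{s^2} + e^{-2s}\,\mathcal{L}\{1 - (t + 2)\} = \frac{1}{s^2} + e^{-2s}\,\mathcal{L}\{-1 - t\} = \frac{1}{s^2} - e^{-2s}\left(\frac{1}{s} + \frac{1}{s^2}\right).$$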
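The result can be checked against a direct numerical evaluation of the defining integral. The sketch below is illustrative and not from the original text: it assumes the closed form $1/s^2 - e^{-2s}(1/s + 1/s^2)$, truncates the infinite integral at a hypothetical upper limit of 60, and splits the quadrature at the jump $t = 2$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def laplace_numeric(s, upper=60.0):
    """Approximate integral_0^upper e^{-st} f(t) dt for f = t on (0,2), 1 on (2,inf)."""
    first = simpson(lambda t: math.exp(-s * t) * t, 0.0, 2.0)
    second = simpson(lambda t: math.exp(-s * t), 2.0, upper)
    return first + second

def laplace_formula(s):
    """Closed form obtained from Theorem 1."""
    return 1 / s**2 - math.exp(-2 * s) * (1 / s + 1 / s**2)

for s in (0.5, 1.0, 2.0):
    print(s, laplace_numeric(s), laplace_formula(s))
```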

In taking Laplace transforms, functions need to be defined for $0 \le t < \infty$. However, for the purposes of this next theorem, we assume that $f(t)$ is defined for all $t$, but $f(t) = 0$ for $t < 0$. Consider the function given by

$$f_s(t) = f(t - t_0)\,H(t - t_0) = \begin{cases} f(t - t_0), & t > t_0 \\ 0, & t < t_0. \end{cases}$$

The function $f_s(t)$ is the function $f(t)$ shifted to the right by an amount $t_0$ as shown in Figure 5. The following theorem tells us how to find the transform of the shifted function $f_s(t)$.

Theorem 2. If $\hat{f}(s) = \mathcal{L}\{f(t)\}$ exists, then $\mathcal{L}\{f_s(t)\}$ exists and is given by

$$\mathcal{L}\{f_s(t)\} = \mathcal{L}\{f(t - t_0)H(t - t_0)\} = e^{-st_0}\,\hat{f}(s). \tag{6}$$

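Proof.

$$\mathcal{L}\{f(t - t_0)H(t - t_0)\} = \int_0^\infty e^{-st} f(t - t_0) H(t - t_0)\,dt = \int_{t_0}^\infty e^{-st} f(t - t_0)\,dt.$$

Make the substitution $t - t_0 = \tau$ to find

$$\mathcal{L}\{f(t - t_0)H(t - t_0)\} = \int_0^\infty e^{-s(\tau + t_0)} f(\tau)\,d\tau = e^{-st_0}\int_0^\infty e^{-s\tau} f(\tau)\,d\tau = e^{-st_0}\,\hat{f}(s).$$

Figure 5 [graphs of $f(t)$ and of the shifted function $f(t - t_0)H(t - t_0)$]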

The above theorem is particularly useful in finding inverse transforms of functions involving $e^{-as}$.

Example 3. Find the inverse transform of $\dfrac{e^{-2s}}{s^4}$. Let $\hat{f}(s) = 1/s^4$; then from Line 2 of the table, $f(t) = \dfrac{t^3}{6}$. Thus, according to Equation (6), we have

$$\mathcal{L}^{-1}\left\{\frac{e^{-2s}}{s^4}\right\} = \frac{(t - 2)^3}{6}\,H(t - 2).$$

We now consider a differential equation with a discontinuous forcing term

$$\begin{aligned} \text{DE:}\quad & ay'' + by' + cy = f(t) \\ \text{IC:}\quad & y(0) = \alpha,\ y'(0) = \beta. \end{aligned}$$

For simplicity assume that $f(t)$ is continuous except for a jump discontinuity at $t = t_0$. First of all we must consider what we mean by a solution, since it is clear from the DE that $y''(t)$ will not exist at $t_0$. However, a unique solution exists in the interval $0 \le t \le t_0$; therefore the left-hand limits $y(t_0-)$ and $y'(t_0-)$ exist. Using these as new initial values at $t = t_0$, a unique solution can be found for $t \ge t_0$. We define the function obtained by this piecing-together process to be the solution. Note that $y$ and $y'$ are continuous, even at $t_0$, whereas $y''$ has a jump discontinuity at $t_0$.

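Example 4.

$$\begin{aligned} \text{DE:}\quad & y'' + 5y' + 6y = f(t) \\ \text{IC:}\quad & y(0) = y'(0) = 0 \end{aligned}$$

where $f(t) = 1$ for $t < 2$ and $f(t) = -1$ for $t > 2$. Solving $y'' + 5y' + 6y = 1$, $y(0) = y'(0) = 0$ for $0 \le t \le 2$ we find

$$y = \frac{1}{6} - \frac{1}{2}e^{-2t} + \frac{1}{3}e^{-3t}, \quad 0 \le t \le 2. \tag{i}$$

We now solve $y'' + 5y' + 6y = -1$, $y(2) = \frac{1}{6} - \frac{1}{2}e^{-4} + \frac{1}{3}e^{-6}$, $y'(2) = e^{-4} - e^{-6}$ for $t \ge 2$. The general solution of the DE is $y = c_1 e^{-2t} + c_2 e^{-3t} - \frac{1}{6}$. Satisfying the initial conditions at $t = 2$ we find

$$y = -\frac{1}{6} + \frac{1}{2}e^{-2t}\left(2e^{4} - 1\right) + \frac{1}{3}e^{-3t}\left(1 - 2e^{6}\right), \quad t \ge 2. \tag{ii}$$

The function defined by (i) and (ii) is the solution. Note that $y$ and $y'$ are continuous at $t = 2$, whereas $y''(2+) - y''(2-) = -2$.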

The Laplace transform is an efficient tool for handling differential equations with a discontinuous forcing term $f(t)$, provided that $f(t)$ possesses a transform. It is not difficult to show that if $f(t)$ is piecewise continuous and of exponential order, then it has a Laplace transform. Also, the formulas for the transform of derivatives still hold if $y, y', \ldots, y^{(n-1)}$ are continuous and of exponential order but $y^{(n)}$ is only piecewise continuous. Using these facts, let us redo Example 4 using transforms.

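Example 5. We write the right hand side in terms of Heaviside functions: $f(t) = 1 + (-1 - 1)H(t - 2) = 1 - 2H(t - 2)$. Our problem can now be written as

$$y'' + 5y' + 6y = 1 - 2H(t - 2), \quad y(0) = y'(0) = 0.$$

Taking transforms, we obtain

$$(s^2 + 5s + 6)\,\hat{y} = \frac{1}{s} - \frac{2e^{-2s}}{s},$$

$$\hat{y} = \frac{1}{s(s^2 + 5s + 6)} - \frac{2e^{-2s}}{s(s^2 + 5s + 6)}.$$

We find

$$\mathcal{L}^{-1}\left\{\frac{1}{s(s^2 + 5s + 6)}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s(s + 3)(s + 2)}\right\} = \frac{1}{6} - \frac{1}{2}e^{-2t} + \frac{1}{3}e^{-3t} \equiv g(t).$$

Therefore, using Equation (6),

$$\mathcal{L}^{-1}\left\{\frac{e^{-2s}}{s(s^2 + 5s + 6)}\right\} = g(t - 2)\,H(t - 2),$$

and the solution is

$$y = \frac{1}{6} - \frac{1}{2}e^{-2t} + \frac{1}{3}e^{-3t} - 2\left(\frac{1}{6} - \frac{1}{2}e^{-2(t-2)} + \frac{1}{3}e^{-3(t-2)}\right)H(t - 2),$$

which agrees with the solution obtained in Example 4.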
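The agreement between the two approaches can be confirmed numerically. The sketch below is an illustration under stated assumptions: it takes the pieced-together solution to be $g(t)$ on $0 \le t \le 2$ and $-\frac16 + \frac12 e^{-2t}(2e^4 - 1) + \frac13 e^{-3t}(1 - 2e^6)$ for $t \ge 2$, and the transform solution to be $g(t) - 2g(t-2)H(t-2)$; the function names are hypothetical.

```python
import math

def g(t):
    """Response to a unit step from rest: 1/6 - e^{-2t}/2 + e^{-3t}/3."""
    return 1 / 6 - 0.5 * math.exp(-2 * t) + math.exp(-3 * t) / 3

def y_transform(t):
    """Solution written with the Heaviside function, from the transform method."""
    step = 1.0 if t >= 2 else 0.0
    return g(t) - 2 * g(t - 2) * step

def y_pieced(t):
    """Solution pieced together interval by interval."""
    if t <= 2:
        return g(t)
    return (-1 / 6
            + 0.5 * math.exp(-2 * t) * (2 * math.exp(4) - 1)
            + math.exp(-3 * t) * (1 - 2 * math.exp(6)) / 3)

max_diff = max(abs(y_transform(k / 4) - y_pieced(k / 4)) for k in range(0, 41))
print(max_diff)
```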

Exercises 2.9

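For problems 1-4, write in terms of Heaviside functions, sketch, and find the Laplace transform.

1. $f(t) = \begin{cases} t, & 0 \le t < 5 \\ 1, & t > 5 \end{cases}$  2. $f(t) = \begin{cases} 0, & 0 \le t < \pi \\ \cos t, & t > \pi \end{cases}$

3. $f(t) = \begin{cases} 0, & 0 \le t < 1 \\ 2, & 1 < t < 2 \\ 0, & t > 2 \end{cases}$  4. $f(t) = \begin{cases} 1, & 0 \le t < 3 \\ -1, & 3 < t < 5 \\ t - 2, & t > 5 \end{cases}$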


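5. Write in the form (3) and sketch.

a. $f(t) = t + (1 - t)H(t - 2) + t^2 H(t - 3)$.  b. $f(t) = H(t) + H(t - 1) - H(t - 2)$.

6. Find $\mathcal{L}\{t^2 H(t - 1)\}$.  7. Find $\mathcal{L}\{e^{t} H(t - 2)\}$.  8. Find $\mathcal{L}\{\sin t\, H(t - 2)\}$.

9. Find $\mathcal{L}^{-1}\left\{\dfrac{1 - e^{-3s}}{s^2}\right\}$.  10. Find $\mathcal{L}^{-1}\left\{\dfrac{s e^{-s}}{s^2 + 4}\right\}$.  11. Find $\mathcal{L}^{-1}\left\{\dfrac{e^{-as}}{s^2 - 1}\right\}$.

12. Find the transforms of the functions in the following diagrams.

[Diagrams (a) and (b)]

13. Solve $y'' - 3y' + 2y = f(t)$, $y(0) = y'(0) = 0$, where

$$f(t) = \begin{cases} 0, & 0 \le t < 1 \\ 2, & 1 < t < 2 \\ 0, & t > 2. \end{cases}$$

14. Solve $y'' + y = f(t)$, $y(0) = y'(0) = 0$, where

$$f(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & t > 1. \end{cases}$$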

2.10 The Weighting Function and Impulse Functions

Consider a physical device or system that is governed by an $n$th order linear differential equation with constant coefficients

$$\begin{aligned} \text{DE:}\quad & a_0 y^{(n)}(t) + a_1 y^{(n-1)}(t) + \cdots + a_n y(t) = f(t), \quad t \ge 0 \\ \text{IC:}\quad & y(0) = b_0,\ y'(0) = b_1,\ \ldots,\ y^{(n-1)}(0) = b_{n-1}, \end{aligned} \tag{1}$$

where the $a_i$ and $b_i$ are given constants, $a_0 \ne 0$, and $f(t)$ is a given continuous function. This differential equation could be the model of an electric circuit, a mechanical mass-spring system, or an economic system. The function $f(t)$ can be thought of as the input or excitation to the system and $y(t)$ as the output or response of the system. We indicate this by the simple block diagram shown in Figure 1. In a mechanical system the input could be an external force, and the output a displacement; in an electrical system the input could be a voltage, and the output the current in some branch of the circuit.

Figure 1

The output of the system for given initial conditions can conveniently be obtained by Laplace transforms. For simplicity we consider here the second order case

$$\begin{aligned} \text{DE:}\quad & ay'' + by' + cy = f(t) \\ \text{IC:}\quad & y(0) = \alpha,\ y'(0) = \beta \end{aligned} \tag{2}$$

where the system parameters $a \ne 0$, $b$, $c$ are constants, and the input $f(t)$ is a continuous function of exponential order. Taking transforms of the DE, we find

$$a\left\{s^2\hat{y}(s) - s\alpha - \beta\right\} + b\left\{s\hat{y}(s) - \alpha\right\} + c\,\hat{y}(s) = \hat{f}(s),$$

or,

$$(as^2 + bs + c)\,\hat{y}(s) = \hat{f}(s) + (as + b)\alpha + a\beta. \tag{3}$$

Letting

$$p(s) = as^2 + bs + c, \quad\text{and}\quad q(s) = (as + b)\alpha + a\beta, \tag{4}$$

we can rewrite (3) as

$$p(s)\,\hat{y}(s) = \hat{f}(s) + q(s). \tag{5}$$

We note that $p(s)$ is the characteristic polynomial of the differential equation and $q(s)$ is a polynomial of degree at most 1 which depends on $\alpha$ and $\beta$, that is, on the initial conditions. If the system is in the zero-initial state ($\alpha = \beta = 0$), then $q(s) \equiv 0$.

From (5) we find that the transform $\hat{y}(s)$ is

$$\hat{y}(s) = \frac{\hat{f}(s)}{p(s)} + \frac{q(s)}{p(s)}. \tag{6}$$

It is customary to define the transfer function $Y(s)$ of the system by

$$Y(s) = \frac{1}{p(s)} = \frac{1}{as^2 + bs + c}.$$

Thus (6) becomes

$$\hat{y}(s) = Y(s)\hat{f}(s) + Y(s)q(s).$$

If we define

$$y_0(t) = \mathcal{L}^{-1}\left\{Y(s)\hat{f}(s)\right\}, \tag{7}$$

$$y_1(t) = \mathcal{L}^{-1}\left\{Y(s)q(s)\right\}, \tag{8}$$

then the output $y(t)$ is given by

$$y(t) = y_0(t) + y_1(t).$$

It is easy to see that $y_0(t)$ is a particular solution of (2) and $y_1(t)$ is a solution of the corresponding homogeneous DE. The function $y_0(t)$ is the output due to the input $f(t)$ when the initial state is the zero state; we call $y_0(t)$ the zero-state response or the response due to the input. The function $y_1(t)$ is the response of the system if the input $f(t) \equiv 0$ and is called the zero-input response or the response due to the initial state.

The zero-state response and the weighting function

The zero-state response is given by

$$y_0(t) = \mathcal{L}^{-1}\left\{Y(s)\hat{f}(s)\right\}.$$

We know that $Y(s)$ is a rational function. If $\hat{f}(s)$ is also a rational function, we could find the zero-state response by partial fractions. Even if $\hat{f}(s)$ is not a rational function, we may proceed using convolutions. For this purpose we define the weighting function by

$$w(t) = \mathcal{L}^{-1}\{Y(s)\}.$$

Since $Y(s)$ is a proper rational function, the weighting function can be found by partial fractions. The zero-state response can now be given by any of the forms

$$y_0(t) = w(t) * f(t) \tag{9}$$

$$y_0(t) = \int_0^t w(\tau)\,f(t - \tau)\,d\tau \tag{10}$$

$$y_0(t) = \int_0^t w(t - \tau)\,f(\tau)\,d\tau \tag{11}$$

From (10) we see that the zero-state response at time $t$ is a weighted integral of the input: the input $\tau$ units in the past, namely $f(t - \tau)$, is weighted by $w(\tau)$. Therefore knowledge of the weighting function completely determines the zero-state response for a given input.

Figure 2

A graph of $w(\tau)$ gives useful information about the response of the system. For instance, if $w(\tau)$ is as in Figure 2(a), the weighting function is almost zero for $\tau \ge 2$; therefore the values of the input more than two time units in the past do not appreciably affect the output. We say the system remembers the input for about two time units, or that the system has a memory of about two time units. The weighting function of Figure 2(b) has a memory of about eight units; the function in Figure 2(c) has an infinite memory. In general, we do not expect reasonable practical systems to have weighting functions like that in Figure 2(c); rather, we expect that inputs far in the past do not appreciably affect the output.

Example 1. Consider the system governed by the differential equation

$$y''(t) + 5y'(t) + 6y(t) = f(t),$$

with zero initial conditions. The characteristic polynomial is

$$p(s) = s^2 + 5s + 6 = (s + 3)(s + 2).$$

The transfer function is

$$Y(s) = \frac{1}{p(s)} = \frac{1}{(s + 3)(s + 2)} = \frac{1}{s + 2} - \frac{1}{s + 3}.$$

The weighting function is

$$w(t) = \mathcal{L}^{-1}\{Y(s)\} = e^{-2t} - e^{-3t}.$$

Therefore the zero-state response is

$$y_0(t) = \int_0^t \left(e^{-2\tau} - e^{-3\tau}\right) f(t - \tau)\,d\tau.$$

The zero-state response for a given input can be obtained from the above; for instance, if $f(t) = H(t)$, the unit step function, then

$$y_0(t) = \int_0^t \left(e^{-2\tau} - e^{-3\tau}\right) d\tau = \frac{e^{-3t}}{3} - \frac{e^{-2t}}{2} + \frac{1}{6}.$$

The graph of $w(\tau)$ is close to that shown in Figure 2(a). Thus the system has a memory of about 2 units, and inputs far in the past have little effect on the output.
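The convolution form of the zero-state response lends itself to direct numerical evaluation. The following sketch is illustrative rather than part of the text: it assumes the weighting function $w(t) = e^{-2t} - e^{-3t}$ of this example, evaluates $\int_0^t w(\tau) f(t-\tau)\,d\tau$ by Simpson's rule, and compares the unit-step case with the closed form $e^{-3t}/3 - e^{-2t}/2 + 1/6$. The helper names are hypothetical.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def w(t):
    """Weighting function of y'' + 5y' + 6y = f."""
    return math.exp(-2 * t) - math.exp(-3 * t)

def zero_state(f, t):
    """Zero-state response y0(t) = integral_0^t w(tau) f(t - tau) d tau."""
    return simpson(lambda tau: w(tau) * f(t - tau), 0.0, t)

def step_response(t):
    """Closed-form zero-state response to the unit step."""
    return math.exp(-3 * t) / 3 - math.exp(-2 * t) / 2 + 1 / 6

unit_step = lambda t: 1.0
for t in (0.5, 1.0, 3.0):
    print(t, zero_state(unit_step, t), step_response(t))
```

Any other input of interest can be passed in place of `unit_step`, which is the practical content of the statement that the weighting function determines the zero-state response.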

The zero-input response

The zero-input response is given by Equations (8) and (4):

$$\begin{aligned} y_1(t) &= \mathcal{L}^{-1}\{Y(s)q(s)\} \\ &= \mathcal{L}^{-1}\{Y(s)\,((as + b)\alpha + a\beta)\} \\ &= \mathcal{L}^{-1}\{Y(s)\,(b\alpha + a\beta)\} + \mathcal{L}^{-1}\{a\alpha\, sY(s)\}. \end{aligned} \tag{12}$$

We know that $\mathcal{L}^{-1}\{Y(s)\} = w(t)$; therefore, for the first term in (12) we have

$$\mathcal{L}^{-1}\{Y(s)\,(b\alpha + a\beta)\} = (b\alpha + a\beta)\,w(t).$$

Recall from Section 2.5, Theorem 1, that $\mathcal{L}\{w'(t)\} = s\mathcal{L}\{w(t)\} - w(0) = sY(s) - w(0)$. In Problem 4 below we show that $w(0) = 0$, thus $\mathcal{L}^{-1}\{sY(s)\} = w'(t)$. The zero-input response is now easily obtained:

$$y_1(t) = (b\alpha + a\beta)\,w(t) + a\alpha\,w'(t). \tag{13}$$

Note that the zero-input response is given entirely in terms of the weighting function $w(t)$ and its derivative. If we pick the particular initial conditions $y(0) = \alpha = 0$ and $y'(0) = \beta = 1/a$, we find from Equation (13) that $y(t) = w(t)$. Thus the weighting function can be characterized as the zero-input response due to the initial conditions

$$w(0) = 0, \quad\text{and}\quad w'(0) = 1/a.$$

Example 2. Consider the system of Example 1, where the system parameters are $a = 1$, $b = 5$, $c = 6$. The weighting function is $w(t) = e^{-2t} - e^{-3t}$. We find that $w(0) = 0$ and $w'(0) = 1$. Since $a = 1$, this is in accordance with (13). Suppose the initial conditions are

$$y(0) = 4, \quad\text{and}\quad y'(0) = -2.$$

Putting $\alpha = 4$ and $\beta = -2$ into Equation (13) we obtain the zero-input response

$$y_1(t) = 18w(t) + 4w'(t) = 10e^{-2t} - 6e^{-3t}.$$
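Equation (13) is easy to evaluate mechanically. The sketch below is an illustration, assuming the initial values $\alpha = 4$, $\beta = -2$ (the values consistent with the coefficient 18 above) and the weighting function of Example 1; it checks the formula against the closed form $10e^{-2t} - 6e^{-3t}$ and against the initial conditions. The function names are hypothetical.

```python
import math

def w(t):
    """Weighting function for a = 1, b = 5, c = 6."""
    return math.exp(-2 * t) - math.exp(-3 * t)

def dw(t):
    """Derivative w'(t)."""
    return -2 * math.exp(-2 * t) + 3 * math.exp(-3 * t)

def zero_input(t, a, b, alpha, beta):
    """Equation (13): y1 = (b*alpha + a*beta) w + a*alpha w'."""
    return (b * alpha + a * beta) * w(t) + a * alpha * dw(t)

def closed_form(t):
    return 10 * math.exp(-2 * t) - 6 * math.exp(-3 * t)

def closed_form_derivative(t):
    return -20 * math.exp(-2 * t) + 18 * math.exp(-3 * t)

for t in (0.0, 0.5, 2.0):
    print(t, zero_input(t, 1, 5, 4, -2), closed_form(t))
```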

The weighting function and the unit impulse response

If we try to find an input $f(t)$ such that the zero-state response is the weighting function, we are led to the equation

$$w(t) = w(t) * f(t),$$

or, taking transforms,

$$Y(s) = Y(s)\hat{f}(s).$$

Therefore $\hat{f}(s) \equiv 1$; this is impossible (see Page 55). However, let us consider an input $f(t)$ whose Laplace transform is close to 1. Such an input is the function $\delta_\epsilon(t)$ defined by

$$\delta_\epsilon(t) = \begin{cases} \dfrac{1}{\epsilon}, & 0 \le t < \epsilon \\ 0, & t > \epsilon \end{cases}$$

and is shown in Figure 3.

Figure 3

If $\epsilon$ is small, $\delta_\epsilon(t)$ represents a function that is very large for a small time interval, but the area under $\delta_\epsilon(t)$ is 1, i.e., $\int_0^\infty \delta_\epsilon(t)\,dt = 1$. The limit of $\delta_\epsilon(t)$ as $\epsilon \to 0$ is often called a unit impulse function or delta function, and is denoted by $\delta(t)$. However, $\delta(t)$ is not a function in the usual sense since, according to the definition, $\delta(0)$ would have to be infinite. For the moment, let us work with $\delta_\epsilon(t)$. It is easy to calculate the Laplace transform of $\delta_\epsilon(t)$:

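$$\mathcal{L}\{\delta_\epsilon(t)\} = \int_0^\infty e^{-st}\,\delta_\epsilon(t)\,dt = \frac{1}{\epsilon}\int_0^\epsilon e^{-st}\,dt = \frac{1 - e^{-s\epsilon}}{s\epsilon}.$$

Furthermore, we find that as $\epsilon \to 0$ we have

$$\lim_{\epsilon \to 0}\,\mathcal{L}\{\delta_\epsilon(t)\} = \lim_{\epsilon \to 0} \frac{1 - e^{-s\epsilon}}{s\epsilon} = 1,$$

where, for example, the limit can be calculated by l'Hôpital's Rule. Therefore for small $\epsilon$, $\mathcal{L}\{\delta_\epsilon(t)\}$ is close to 1. Now let $w_\epsilon(t)$ be the zero-state response to the input $\delta_\epsilon(t)$. For $t \ge \epsilon$ we have

$$w_\epsilon(t) = w(t) * \delta_\epsilon(t) = \int_0^t w(t - \tau)\,\delta_\epsilon(\tau)\,d\tau = \frac{1}{\epsilon}\int_0^\epsilon w(t - \tau)\,d\tau.$$

It is easy to show that

$$\lim_{\epsilon \to 0} w_\epsilon(t) = w(t).$$

Therefore the weighting function is the limit of the zero-state response to the input $\delta_\epsilon(t)$ as $\epsilon \to 0$. Briefly, but less precisely, we say that the weighting function is the zero-state response to a unit impulse.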
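The limit can be observed numerically. As a hedged illustration (not from the text), the sketch below uses the weighting function $w(t) = e^{-2t} - e^{-3t}$ of Example 1, evaluates $w_\epsilon(t) = \frac{1}{\epsilon}\int_0^\epsilon w(t-\tau)\,d\tau$ at the arbitrary point $t = 1$ with Simpson's rule, and lists the error for decreasing $\epsilon$.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def w(t):
    """Weighting function of Example 1."""
    return math.exp(-2 * t) - math.exp(-3 * t)

def w_eps(t, eps):
    """Zero-state response to delta_eps at a time t >= eps."""
    return simpson(lambda tau: w(t - tau), 0.0, eps) / eps

errors = [abs(w_eps(1.0, eps) - w(1.0)) for eps in (0.1, 0.01, 0.001)]
print(errors)
```

The error shrinks roughly in proportion to $\epsilon$, as expected from averaging $w$ over an interval of length $\epsilon$.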

Think for a moment of a mass-spring system which is sitting at rest. If we give the mass a sudden blow with a hammer, we will impart a force to it which looks roughly like that in Figure 3, that is, the force will be large for a short period of time and zero thereafter. If the area under the force-time curve is equal to $A$, then the force is approximately $A\delta_\epsilon(t)$. The displacement of the system for $t \ge 0$ will be approximately $Aw(t)$. This is an experimental way of finding the weighting function for a mechanical system. Once the weighting function is found, the response to any input is determined, as we have seen above.

Unit impulse functions or delta functions

To treat impulse functions in a careful manner requires more advanced mathematical concepts than we suppose here. However, it is rather easy to use Laplace transforms to solve systems such as (2) where the forcing function is an impulse function. We think of the unit impulse, $\delta(t)$, as a generalized function. We define operations on $\delta(t)$ by means of appropriate limit operations on $\delta_\epsilon(t)$. The following are easy to establish.

$$\mathcal{L}\{\delta(t)\} \equiv \lim_{\epsilon \to 0} \mathcal{L}\{\delta_\epsilon(t)\} = 1.$$

$$\mathcal{L}\{\delta(t - t_0)\} \equiv \lim_{\epsilon \to 0} \mathcal{L}\{\delta_\epsilon(t - t_0)\} = e^{-st_0}.$$

$$\int_0^\infty \delta(t - t_0)\,f(t)\,dt \equiv \lim_{\epsilon \to 0} \int_0^\infty \delta_\epsilon(t - t_0)\,f(t)\,dt = f(t_0).$$

The last property is known as the sifting property of $\delta(t)$.

Example 3. Solve $y'' + 4y = \delta(t)$, $y(0) = y'(0) = 0$. Taking the transform of both sides we find

$$s^2\hat{y} + 4\hat{y} = 1, \quad\text{or}\quad \hat{y} = \frac{1}{s^2 + 4},$$

$$y(t) = \mathcal{L}^{-1}\left\{\frac{1}{s^2 + 4}\right\} = \frac{1}{2}\sin 2t.$$

Note that $y(t)$ is just the weighting function.

Example 4. Solve $y'' + 4y = \delta(t - 2)$, $y(0) = 0$, $y'(0) = 1$. Taking transforms we find

$$s^2\hat{y} - 1 + 4\hat{y} = e^{-2s}, \quad \hat{y} = \frac{1}{s^2 + 4} + \frac{e^{-2s}}{s^2 + 4}.$$

Therefore

$$y(t) = \frac{1}{2}\sin 2t + \frac{1}{2}H(t - 2)\sin 2(t - 2).$$

Exercises 2.10

In problems 1-3 find the weighting function and use it to find the zero-state response and the zero-input response.

1. $y'' + y = f(t)$, $y(0) = y'(0) = 1$.  2. $y'' - y = f(t)$, $y(0) = 0$, $y'(0) = 1$.

3. $y'' + y' - 2y = f(t)$, $y(0) = 1$, $y'(0) = 0$.

4. Show that the weighting function $w(t)$ has the property that $w(0) = 0$.

Hint: Write $Y(s) = \dfrac{1}{as^2 + bs + c} = \dfrac{1}{s}\cdot\dfrac{s}{as^2 + bs + c}$ and note that $\mathcal{L}^{-1}\left\{\dfrac{s}{as^2 + bs + c}\right\} \equiv h(t)$ is some function that can be obtained by partial fractions.

Solve the following using properties of impulse functions.

5. $y'' - 4y = \delta(t - 1) + \delta(t - 2)$, $y(0) = y'(0) = 0$.  6. $y'' + 4y' + 4y = \delta(t - 2)$, $y(0) = 0$, $y'(0) = 1$.

7. $y' - y = \delta(t - 2)$, $y(0) = 5$.  8. $y'' - y = \delta(t) + H(t - 2)$, $y(0) = y'(0) = 0$.

9. $y^{(iv)} - y = \delta(t - 2)$, $y(0) = 1$, $y'(0) = y''(0) = y'''(0) = 0$.

CHAPTER III

MATRICES AND SYSTEMS OF EQUATIONS

3.0 Introduction

The first objective of this chapter is to develop a systematic method for solving systems of linear algebraic equations.

Consider the system of two linear equations in the three unknowns $x$, $y$, and $z$

$$\begin{aligned} x - 2y + 3z &= 5 \\ 2x - 4y + 7z &= 12. \end{aligned} \tag{1}$$

By a solution of the system (1) we mean an ordered triple of numbers, written as the array $[x, y, z]$, which satisfies both equations in (1). Thus $[-1, 0, 2]$ is a solution while $[5, 0, 0]$ is not; the latter triple satisfies only the first equation. We consider the triple of numbers as a single entity, called a vector, which we denote by a single letter $w$:

$$w = [x, y, z]. \tag{2}$$

The numbers $x, y, z$ are called the first, second and third components of $w$, respectively. When the components are written in a horizontal row as in (2), the vector is called a row vector. We could equally well have written the components in a vertical column

$$v = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \tag{3}$$

which we call a column vector. We shall adopt the usual convention in matrix theory and write solution vectors as column vectors as in (3). If we have a system of equations with $n$ unknowns, the solution vector has $n$ components. We shall study the algebra of $n$-vectors in the next section.

Let us start by considering three examples of systems of two equations in two unknowns. Although the examples are extremely simple, they illustrate the various situations that can arise.

Example 1. Consider the system

$$\begin{aligned} x + y &= 5 \\ 2x - y &= 4. \end{aligned} \tag{4}$$

We shall use the method of elimination to find all possible solution vectors $v = \begin{bmatrix} x \\ y \end{bmatrix}$. Suppose the components, $x$ and $y$, satisfy the system (4). By multiplying the first equation by $-2$ and adding to the second equation we get a new second equation, $-3y = -6$, where $x$ has been eliminated. Thus any solution of the system (4) is also a solution of

$$\begin{aligned} x + y &= 5 \\ -3y &= -6. \end{aligned} \tag{5}$$

Conversely, if $x$ and $y$ satisfy (5), we may multiply the first equation by 2 and add it to the second equation to retrieve the original second equation, $2x - y = 4$. Therefore the systems (4) and (5) have the same solution vectors. However, the system (5) is easy to solve. The second equation yields $y = 2$, which may be back-substituted into the first equation to get $x = 3$. The system (5), and thus the system (4), has a unique solution vector, namely

$$v = \begin{bmatrix} 3 \\ 2 \end{bmatrix}.$$

Figure 1 [panel (a): the lines $x + y = 5$ and $2x - y = 4$; panel (b): the line $x + 2y = 2$; panel (c): the lines $x + y = 5$ and $x + y = 4$]

Geometrically, each equation in the system (4) represents a straight line as shown in Figure 1(a). Since the lines are not parallel, they intersect in a unique point, the point $(3, 2)$.

Example 2.

$$\begin{aligned} x + 2y &= 2 \\ 2x + 4y &= 4. \end{aligned} \tag{6}$$

Multiplying the first equation by $-2$ and adding to the second equation produces

$$\begin{aligned} x + 2y &= 2 \\ 0\cdot x + 0\cdot y &= 0. \end{aligned} \tag{7}$$

The second equation in (7) puts no restriction on $x$ and $y$, thus it is only necessary to satisfy the first equation to produce a solution. This fact is already obvious in system (6), since the second equation is simply twice the first. We may assign any value to $y$, say $y = t$, and then $x = 2 - 2t$. Thus, there are infinitely many solutions of (6), given by

$$v = \begin{bmatrix} 2 - 2t \\ t \end{bmatrix}, \tag{8}$$

where $t$ is arbitrary. Geometrically, both equations in (6) represent the same straight line as shown in Figure 1(b). Any point on this straight line is a solution. The vectors given in (8), in component form, are $x = 2 - 2t$ and $y = t$; these are just the parametric equations of the line $x + 2y = 2$.

Example 3.

$$\begin{aligned} x + y &= 5 \\ x + y &= 4 \end{aligned} \tag{9}$$

Clearly it is impossible for two numbers $x$ and $y$ to add up to 4 and 5 at the same time. The system (9) has no solutions; we call such a system inconsistent. If we try to eliminate $x$ from the second equation by subtracting the first equation from it, we obtain

$$\begin{aligned} x + y &= 5 \\ 0\cdot x + 0\cdot y &= -1 \end{aligned} \tag{10}$$

The inconsistency clearly shows up in the second equation in (10). Geometrically, the two equations in (9) represent two parallel lines as seen in Figure 1(c). These lines never intersect, and the equations have no solution.

We see that a system of two equations in two unknowns may possess no solutions, exactly one solution, or infinitely many solutions. We shall see later that the same three situations may occur for $n$ equations in $n$ unknowns.
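The three outcomes can be reproduced with a few lines of code. This is a minimal sketch, not a general solver: it assumes the coefficient of $x$ in the first equation is nonzero, uses exact rational arithmetic, and the function name `solve_2x2` is hypothetical.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Eliminate x from the second equation, then classify the system

        a11*x + a12*y = b1
        a21*x + a22*y = b2

    Assumes a11 != 0. Returns (kind, solution-or-None).
    """
    a11, a12, b1, a21, a22, b2 = map(Fraction, (a11, a12, b1, a21, a22, b2))
    m = a21 / a11                 # multiple of row 1 subtracted from row 2
    a22_new = a22 - m * a12
    b2_new = b2 - m * b1
    if a22_new != 0:              # new second equation determines y
        y = b2_new / a22_new
        x = (b1 - a12 * y) / a11  # back-substitution
        return ("unique", (x, y))
    if b2_new == 0:               # second equation reads 0 = 0
        return ("infinitely many", None)
    return ("inconsistent", None) # second equation reads 0 = nonzero

print(solve_2x2(1, 1, 5, 2, -1, 4))   # system (4)
print(solve_2x2(1, 2, 2, 2, 4, 4))    # system (6)
print(solve_2x2(1, 1, 5, 1, 1, 4))    # system (9)
```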

A general system of $m$ equations in $n$ unknowns may be written

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m. \end{aligned} \tag{11}$$

The coefficients $a_{ij}$ and the right hand sides $b_i$ are assumed to be given real or complex numbers, and the $x_i$ are the unknowns. In this double subscript notation for the coefficients, $a_{ij}$ stands for the coefficient, in the $i$th equation, of the $j$th unknown, $x_j$. Using summation notation the system (11) may be written

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \quad i = 1, 2, \ldots, m. \tag{12}$$

By a solution of the system we mean the $n$-vector

$$v = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

whose components $x_i$ satisfy all of the equations of the system (12).

If the system (12) has at least one solution vector it is said to be consistent; if no solution vector exists it is said to be inconsistent. By systematically exploiting the method of elimination we shall shortly develop a procedure for determining whether or not a system is consistent and, if consistent, for finding all solution vectors.

If in the system (12) all of the right hand sides $b_i = 0$, the system is called homogeneous; otherwise it is called nonhomogeneous. A homogeneous system

$$\sum_{j=1}^{n} a_{ij}x_j = 0, \quad i = 1, 2, \ldots, m,$$

always has a solution, namely, $x_1 = 0, x_2 = 0, \ldots, x_n = 0$; this is called the trivial solution. Thus homogeneous equations are always consistent. The fundamental questions concerning homogeneous equations are whether or not nontrivial solutions exist, and how to find these solutions.

Exercises 3.0

For the systems in problems 1 and 2, find all solutions and write the solutions in vector form.

1. a. $2x - 3y = 2$, $x - 2y = 7$  b. $2x + 2y = 5$, $4x + 4y = 7$  c. $x + y = 0$, $x - y = 0$  d. $2x - 2y = 4$, $x + y = 2$

2. a. $x - 2y + 3z = 5$, $2x - 4y + 7z = 1$  b. $x - 2y + 3z = 0$, $2x - 4y + 7z = 0$

3. What are the restrictions on the values of $b_1$ and $b_2$, if any, for the following systems to be consistent:

a. $2x + 3y = b_1$, $x - 3y = b_2$  b. $2x - 3y = b_1$, $2x + 3y = b_2$

4. Consider one equation in one unknown, $ax = b$. Discuss the existence and uniqueness of solutions. Consider three cases: (i) $a \ne 0$, (ii) $a = 0$, $b = 0$, and (iii) $a = 0$, $b \ne 0$.

3.1 The Algebra of n-vectors

The solution of a system of equations with $n$ unknowns is an ordered $n$-tuple of real or complex numbers. The set of all ordered $n$-tuples of complex numbers we denote by $C^n$. If $x \in C^n$ we call $x$ an $n$-vector or simply a vector and write

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{1}$$

where $x_i$ is called the $i$th component of $x$. The algebra of $n$-vectors is a generalization of the algebra of vectors with two or three components, with which the reader is undoubtedly familiar.

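Definition 1. Let $x$ and $y$ be vectors in $C^n$ and let $\alpha$ be a scalar (a complex number). We define multiplication by a scalar, $\alpha x$, and the sum of two vectors, $x + y$, by

$$\alpha x = \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}, \qquad x + y = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}. \tag{2}$$

The difference of two vectors is defined by $x - y = x + (-1)y$.

Example 1.

$$\begin{bmatrix} 1 \\ -2 \\ 3 \\ 5 \end{bmatrix} - 3\begin{bmatrix} -1 \\ 0 \\ -2 \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ -2 \\ 9 \\ 5 \end{bmatrix}, \qquad \begin{bmatrix} 1 + i \\ i \\ 3 \\ 2 + 5i \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 3 \\ 2 \end{bmatrix} + i\begin{bmatrix} 1 \\ 1 \\ 0 \\ 5 \end{bmatrix}.$$

The zero vector is the vector with all of its components equal to zero; we denote it by $0$. Note that $0x = 0$ for all $x \in C^n$. It is also worthwhile noting that if $x, y \in C^n$ and $x = y$ then $x_i = y_i$, $i = 1, 2, \ldots, n$. Thus one vector equation is equivalent to $n$ scalar equations.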

The properties of addition and multiplication by a scalar are given by

Theorem 1. Let $x, y, z \in C^n$ and let $\alpha, \beta$ be scalars; then the following hold:
1. $x + (y + z) = (x + y) + z$ (associative law of addition)
2. $x + y = y + x$ (commutative law of addition)
3. $x + 0 = x$
4. $x - x = 0$
5. $(\alpha\beta)x = \alpha(\beta x)$
6. $\alpha(x + y) = \alpha x + \alpha y$
7. $(\alpha + \beta)x = \alpha x + \beta x$

The proof of this theorem follows immediately from Definition 1 and the properties of complex numbers.

Sometimes we may wish to restrict the components of vectors and the scalar multipliers to be real numbers. In this case we denote the set of $n$-tuples of real numbers by $R^n$. Definition 1 and Theorem 1 still apply in this case.

Example 2. Find all real solutions of the single linear equation in two unknowns

$$x + 2y = 3. \tag{i}$$

It is clear that we may take $y$ to be an arbitrary real number, say $y = t$, and then $x = 3 - 2t$, so that the general solution is given in vector form by

$$v = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 - 2t \\ t \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix} + t\begin{bmatrix} -2 \\ 1 \end{bmatrix}$$

where $t$ is arbitrary. The solution is the sum of a fixed vector, $\begin{bmatrix} 3 \\ 0 \end{bmatrix}$, and a multiple of $\begin{bmatrix} -2 \\ 1 \end{bmatrix}$. Geometrically, the tips of the solution vectors lie on the line $x + 2y = 3$ as shown in Figure 1.

Figure 1

Note that $\begin{bmatrix} 3 \\ 0 \end{bmatrix}$ is a particular solution of (i), and $t\begin{bmatrix} -2 \\ 1 \end{bmatrix}$ is the general solution of the associated homogeneous equation $x + 2y = 0$. This is an instance of a general theorem we will consider later.

Sums of multiples of vectors will occur often in our work. It is convenient to give such expressions a name.

Definition 2. Let $\{v_1, v_2, \ldots, v_k\}$ be a set of vectors in $C^n$ (or $R^n$), and let $\alpha_1, \alpha_2, \ldots, \alpha_k$ be scalars. The expression

$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k$$

is called a linear combination of $v_1, v_2, \ldots, v_k$ with weights $\alpha_1, \alpha_2, \ldots, \alpha_k$.

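Example 3. Let $x = \begin{bmatrix} 2 \\ -5 \end{bmatrix}$, $u = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 6 \end{bmatrix}$. Express $x$ as a linear combination of $u$ and $v$. We must find weights $\alpha, \beta$ so that $x = \alpha u + \beta v$, or

$$\begin{bmatrix} 2 \\ -5 \end{bmatrix} = \alpha\begin{bmatrix} 3 \\ 1 \end{bmatrix} + \beta\begin{bmatrix} 1 \\ 6 \end{bmatrix} = \begin{bmatrix} 3\alpha + \beta \\ \alpha + 6\beta \end{bmatrix}.$$

Therefore

$$\begin{aligned} 3\alpha + \beta &= 2 \\ \alpha + 6\beta &= -5. \end{aligned}$$

These equations can be solved to yield the weights $\alpha = 1$, $\beta = -1$. Thus $x = u - v$, as can be readily verified.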
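For two vectors in the plane the weights can be computed directly. The following sketch is illustrative only: it assumes the data $x = [2, -5]$, $u = [3, 1]$, $v = [1, 6]$ (the signs consistent with $x = u - v$), solves the $2 \times 2$ system by Cramer's rule with exact fractions, and uses the hypothetical function name `weights`.

```python
from fractions import Fraction

def weights(x, u, v):
    """Return (alpha, beta) with alpha*u + beta*v == x for 2-vectors,
    or None when u and v are parallel (zero determinant)."""
    det = u[0] * v[1] - v[0] * u[1]
    if det == 0:
        return None
    alpha = Fraction(x[0] * v[1] - v[0] * x[1], det)
    beta = Fraction(u[0] * x[1] - x[0] * u[1], det)
    return alpha, beta

alpha, beta = weights([2, -5], [3, 1], [1, 6])
print(alpha, beta)
```

When the determinant vanishes the two vectors are multiples of each other, and the function makes no attempt to decide whether $x$ happens to lie on their common line.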

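Definition 3. The standard unit vectors in $C^n$ (or $R^n$) are defined by

$$e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \tag{3}$$

The vector $e_i$ has its $i$th component equal to 1 and all other components equal to 0. If $x$ is any vector in $C^n$ (or $R^n$), we have

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix},$$

or

$$x = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n.$$

Thus every vector $x$ can be expressed as a linear combination of the unit vectors $e_i$ weighted by the components of $x$.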

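Let us now analyze the solutions of a single linear equation in $n$ unknowns

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b. \tag{4}$$

We consider three cases.

Case i. Not all $a_i = 0$. Suppose for simplicity that $a_1 \ne 0$; then Equation (4) can be written

$$x_1 = \frac{b}{a_1} - \left(\frac{a_2}{a_1}x_2 + \cdots + \frac{a_n}{a_1}x_n\right). \tag{5}$$

It is clear that $x_2, x_3, \ldots, x_n$ can be given arbitrary values, and then Equation (5) determines $x_1$. The variables $x_2, x_3, \ldots, x_n$ whose values are arbitrary are called free variables, while the variable $x_1$ is called a basic variable. In vector form the general solution is

$$x = \begin{bmatrix} \dfrac{b}{a_1} - \dfrac{a_2}{a_1}x_2 - \cdots - \dfrac{a_n}{a_1}x_n \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \dfrac{b}{a_1} \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2\begin{bmatrix} -\dfrac{a_2}{a_1} \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n\begin{bmatrix} -\dfrac{a_n}{a_1} \\ 0 \\ \vdots \\ 1 \end{bmatrix},$$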

where $x_2, x_3, \ldots, x_n$ are arbitrary.

Case ii. All $a_i = 0$, but $b \ne 0$. In this case Equation (4) becomes

$$0x_1 + 0x_2 + \cdots + 0x_n = b \ne 0.$$

This equation can never be satisfied for any choice of the $x_i$. The equation has no solution, that is, it is inconsistent.

Case iii. All $a_i = 0$, and $b = 0$. In this case we have

$$0x_1 + 0x_2 + \cdots + 0x_n = 0.$$

Each $x_i$ can be arbitrarily assigned; thus, every variable is a free variable. Every vector $x$ is a solution. In vector form we can write the general solution as

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1e_1 + \cdots + x_ne_n,$$

where the $e_i$ are the standard unit vectors given in (3).

Finally we mention that we could have written vectors as rows rather than as columns. In fact, when we discuss matrices, we will often treat the rows of a matrix as vectors.

Exercises 3.1

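1. Given $x = \begin{bmatrix} 1 - 2i \\ i \\ 3 \\ 4 + 5i \end{bmatrix}$, write $x$ in the form $u + iv$ where $u$ and $v$ are real.

2. In each of the following cases express $x$ as a linear combination of $u$ and $v$, if possible.

a. $x = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$  b. $x = \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$

c. $x = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$, $u = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$  d. $x = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $u = \begin{bmatrix} 2 \\ 4 \end{bmatrix}$, $v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$

3. Find the general solution in vector form.

a. $x_1 + 3x_2 - x_3 = 7$.  b. $x_1 + 2x_2 + 0x_3 = 2$.  c. $0x_1 + 0x_2 = 0$.

4. Given the row vectors

$$r = [1, 1, 0, 2], \quad s = [1, 1, 0, 0], \quad t = [0, 0, 0, 1],$$

express $r$ as a linear combination of $s$ and $t$, if possible.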

3.2 Matrix Notation for Linear Systems

A single linear equation in one unknown, $x$, is written as

$$ax = b. \tag{1}$$

We shall develop a notation so that a general linear system can be written in a compact form similar to (1). For this purpose we need the concept of a matrix.

Definition 1. A matrix is a rectangular array of numbers. If a matrix has $m$ rows and $n$ columns, it is said to have order (or size) $m \times n$ (read "$m$ by $n$").

Example 1.

$$A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 6.4 & 4 \end{bmatrix}, \quad c = \begin{bmatrix} 1 \\ 0 \\ 4 \end{bmatrix}, \quad r = [2, 0, 8].$$

$A$ is a $2 \times 3$ matrix, $c$ is a $3 \times 1$ column matrix, and $r$ is a $1 \times 3$ row matrix.

The number located in the $i$th row and $j$th column of a matrix is called the $(i, j)$th element of the matrix. For example, the $(2, 3)$th element in the matrix $A$ above is 4. A matrix consisting of a single row is called a row matrix and is identified with the row vector having the same elements; similarly, a matrix with a single column is called a column matrix and is identified with the column vector having the same elements. A $1 \times 1$ matrix whose single element is $a$ is identified with the scalar $a$.

A matrix is usually denoted by a single letter such as $A$, or $A_{m \times n}$ if the order is to be indicated. To describe the elements of a general $m \times n$ matrix we need to use a double subscript notation:

$$A_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \tag{2}$$

Note that $a_{ij}$ is the element in the $i$th row and $j$th column. We shall also write in the abbreviated form

$$A = [\,a_{ij}\,]. \tag{3}$$

Before considering the general case let us consider a single linear equation in $n$ unknowns

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b. \tag{4}$$

We define the matrices

$$A = [a_1, a_2, \ldots, a_n], \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. \tag{5}$$

Our object is to define the product $Ax$ so that we may write Equation (4) in the compact form $Ax = b$; this leads us to the following definition.

Definition 2. (Row-column product rule). If $A$ is a $1 \times n$ row matrix and $x$ is an $n \times 1$ column matrix, the product $Ax$ is the $1 \times 1$ matrix given by

$$Ax = [a_1, a_2, \ldots, a_n]\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = [a_1x_1 + a_2x_2 + \cdots + a_nx_n]. \tag{6}$$

With this definition Equation (4) can be written simply as

$$Ax = b, \tag{7}$$

where we have identified the scalar $b$ with the $1 \times 1$ matrix $[\,b\,]$.

Now let us consider $m$ equations in $n$ unknowns.

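$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \tag{8}$$

Define the following matrices

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$

The matrix $A$ is called the coefficient matrix. Again our object is to define the matrix product $Ax$ so that (8) can be written in the compact form $Ax = b$. Notice that the left hand side of the $i$th equation in (8) is the matrix product of the $i$th row of $A$ and the column matrix $x$. Let us write $r_i(A)$ for the $i$th row of $A$. The matrix $A$ can be considered as an array of $m$ row vectors, each with $n$ components:

$$A = \begin{bmatrix} r_1(A) \\ r_2(A) \\ \vdots \\ r_m(A) \end{bmatrix}, \qquad \begin{aligned} r_1(A) &= [a_{11}, a_{12}, \ldots, a_{1n}] \\ r_2(A) &= [a_{21}, a_{22}, \ldots, a_{2n}] \\ &\ \,\vdots \\ r_m(A) &= [a_{m1}, a_{m2}, \ldots, a_{mn}] \end{aligned}$$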

Definition 3. The product of an m n matrix A by the n 1 column matrix x is the m 1column matrix Ax defined by

Ax =

r1(A)r2(A)

...rm(A)

x =

r1(A)xr2(A)x

...rm(A)x

=

a11x1 + a12x2 + + a1nxna21x1 + a22x2 + + a2nxn

...am1x1 + am2x2 + + amnxn

With this definition the linear system (8) can be written as

Ax = b.
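The row-column rule of Definition 3 translates almost word for word into code. The following is a minimal sketch in plain Python; the function name `mat_vec` and the sample data are illustrative choices, not part of the text.

```python
def mat_vec(A, x):
    """Return Ax by the row-column rule: entry i is (row i of A) times x."""
    n = len(x)
    # The product is defined only if every row of A has as many entries as x.
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of x")
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Illustrative data (assumed for this sketch): a 3 x 2 matrix times a 2 x 1 column.
A = [[1, 2],
     [3, 4],
     [5, 6]]
x = [1, -1]
print(mat_vec(A, x))  # [-1, -1, -1]
```

Each entry of the result is one application of the row-column product (6), so the output has as many entries as A has rows.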

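Example 2.

\[
\begin{bmatrix} 2 & -2 & 3 \\ 1 & 0 & 2 \end{bmatrix}
\begin{bmatrix} -1 \\ 3 \\ 2 \end{bmatrix}
= \begin{bmatrix} 2(-1) + (-2)3 + 3 \cdot 2 \\ 1(-1) + 0 \cdot 3 + 2 \cdot 2 \end{bmatrix}
= \begin{bmatrix} -2 \\ 3 \end{bmatrix}
\]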

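Example 3. Consider the linear equations

\[
\begin{aligned}
x_1 - 2x_2 &= 1 \\
-x_1 + 7x_2 + x_3 &= 0.
\end{aligned}
\]

a. Write in matrix form.
b. Using matrix multiplication determine if $x_1 = 1$, $x_2 = 1$, $x_3 = 1$ is a solution.
c. Determine if $x_1 = 1$, $x_2 = 0$, $x_3 = 1$ is a solution.

Solution

a. \[ \begin{bmatrix} 1 & -2 & 0 \\ -1 & 7 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]

b. \[ \begin{bmatrix} 1 & -2 & 0 \\ -1 & 7 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 7 \end{bmatrix} \neq \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]

Therefore $x_1 = 1$, $x_2 = 1$, $x_3 = 1$ is not a solution.

c. \[ \begin{bmatrix} 1 & -2 & 0 \\ -1 & 7 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]

Therefore $x_1 = 1$, $x_2 = 0$, $x_3 = 1$ is a solution.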


Matrix notation and matrix multiplication are often useful in organizing data and performing certain types of calculations. The following is a simple example.

Example 4. A toy manufacturer makes two types of toys, a toy truck and a toy plane. The toy truck requires 3 units of steel, 5 units of plastic and 6 units of labor. The toy plane requires 2 units of steel, 6 units of plastic and 7 units of labor. Suppose the price of steel is \$2 per unit, plastic is \$1 per unit and labor is \$3 per unit. We put the resources needed for each toy in a matrix $R$ and the prices in a column matrix $p$. The columns of $R$ correspond to steel, plastic and labor; the rows correspond to the truck and the plane.

\[
R = \begin{bmatrix} 3 & 5 & 6 \\ 2 & 6 & 7 \end{bmatrix} \begin{matrix} \text{truck} \\ \text{plane} \end{matrix}
\qquad
p = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} \begin{matrix} \text{unit price of steel} \\ \text{unit price of plastic} \\ \text{unit price of labor} \end{matrix}
\]

The components of the product $Rp$ give the cost of making the truck and plane:

\[
Rp = \begin{bmatrix} 3 & 5 & 6 \\ 2 & 6 & 7 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 29 \\ 31 \end{bmatrix}.
\]

We see that the toy truck costs 29 dollars to produce, and the toy plane 31 dollars.

It is important to realize that one cannot multiply any matrix $A$ by any column vector $x$; the number of columns in the left hand factor $A$ must equal the number of rows in the right hand factor $x$: $A_{m \times n}\, x_{n \times 1} = b_{m \times 1}$. Later we will define multiplication of a matrix $A$ by a matrix $B$, where $B$ has more than one column. However, in order for $AB$ to be defined, the number of columns in $A$ must equal the number of rows in $B$.

Finally we define two special matrices. First, the zero matrix of order $m \times n$ is a matrix all of whose entries are zero. We denote a zero matrix by $O_{m \times n}$ or simply by $O$, if the order is understood. It is easy to show that

\[ O_{m \times n}\, x_{n \times 1} = 0_{m \times 1}, \quad \text{for all } x. \]

Second, the identity matrix of order $n \times n$ is the matrix

\[
\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.
\]

The identity matrix has all of its diagonal elements equal to 1 and its off diagonal elements equal to 0. If the order is understood, we simply use $I$ to denote the identity matrix. The following property is easy to prove:

\[ Ix = x, \quad \text{for all } x, \tag{9} \]

where of course $I$ and $x$ must have compatible orders in order that the multiplication is defined.

Exercises 3.2

1. Compute $\begin{bmatrix} 1 & 3 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}$.

2. Consider

\[
\begin{aligned}
x_1 + x_2 &= 2 \\
2x_1 - x_2 &= 1 \\
2x_1 + x_2 &= 3
\end{aligned}
\]

a. Write the equations in matrix form.

b. Determine, using matrix multiplication if x =[

11

]is a solution.

c. Determine, using matrix multiplication if x =[

11

]is a solution.

3. If $Ax = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$, and $x$ is a 4-dimensional vector, what is the order of $A$?

4. In Example 4, change the unit prices of steel, plastic and labor to 2, 2, 1, respectively. What is the cost of making a toy truck? A toy plane?

5. Prove equation (9).

6. Prove or give a counterexample: if $A$ and $B$ are $2 \times 2$, and $A \begin{bmatrix} 2 \\ 5 \end{bmatrix} = B \begin{bmatrix} 2 \\ 5 \end{bmatrix}$, then $A = B$.

3.3 Properties of Solutions of Linear Systems

Consider a system of $m$ linear equations in $n$ unknowns:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{1}
\]

We may write system (1) in the matrix form

\[ Ax = b, \tag{2} \]

where

\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}. \tag{3}
\]

The $m \times n$ matrix $A$ is called the coefficient matrix, the $n \times 1$ column matrix or vector $x$ is called the unknown vector or solution vector, and the $m \times 1$ vector $b$ is called the right hand side vector. If there exists a vector $x$ such that $Ax = b$ for given $A$ and $b$, the system is called consistent; otherwise the system is called inconsistent. If $b = 0$ the system $Ax = 0$ is called homogeneous; otherwise $Ax = b$ with $b \neq 0$ is called nonhomogeneous.

Before considering properties of solutions, we need the following simple, but important, property of the matrix product $Ax$: If $A$ is an $m \times n$ matrix, $x$ and $y$ are $n \times 1$ column vectors and $\alpha$ and $\beta$ are scalars, then

\[ A(\alpha x + \beta y) = \alpha Ax + \beta Ay. \tag{4} \]

This is called the linearity property. Equation (4) may be proved without difficulty by writing out both sides.

First we consider properties of solutions of the homogeneous system $Ax = 0$.

Property 1. The homogeneous equation always has the trivial solution, $x = 0$ (i.e. $x_1 = x_2 = \cdots = x_n = 0$).

This is obvious since $A0 = 0$; therefore homogeneous systems are always consistent. The fundamental question for a homogeneous system $Ax = 0$ is "When does a nontrivial solution exist?" We shall answer this question shortly. However, if a nontrivial solution exists, the following property shows us how to find infinitely many such solutions.

Property 2. If $u$ and $v$ are solutions of $Ax = 0$ then $\alpha u + \beta v$ is also a solution for arbitrary scalars $\alpha$ and $\beta$.

Proof $A(\alpha u + \beta v) = \alpha Au + \beta Av = \alpha 0 + \beta 0 = 0$.

In other words, arbitrary linear combinations of solutions of homogeneous equations are also solutions.

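Example 1. Consider $Ax = 0$ where $A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix}$. Show that $u = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}$ are solutions. Also verify that $\alpha u + \beta v$ are solutions.

Since

\[
Au = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\quad \text{and} \quad
Av = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]

we see that $u$ and $v$ are solutions. Also

\[
\alpha u + \beta v = \begin{bmatrix} \beta \\ \alpha \\ \alpha + \beta \\ 0 \end{bmatrix}
\quad \text{and} \quad
A(\alpha u + \beta v) = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix} \begin{bmatrix} \beta \\ \alpha \\ \alpha + \beta \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]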
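Property 2 can also be checked numerically. The sketch below assumes the matrix of Example 1 has third-column entries equal to -1 (the requirement Au = Av = 0 forces the third column to have the opposite sign from the first two); the helper names `mat_vec` and `combo` are hypothetical.

```python
from fractions import Fraction


def mat_vec(A, x):
    """Row-column product Ax for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Data of Example 1 (signs assumed as described in the lead-in).
A = [[1, 1, -1, 2],
     [1, 1, -1, 3]]
u = [0, 1, 1, 0]
v = [1, 0, 1, 0]


def combo(alpha, beta):
    """Return the linear combination alpha*u + beta*v."""
    return [alpha * ui + beta * vi for ui, vi in zip(u, v)]


# Every linear combination of the two solutions is again a solution of Ax = 0.
for alpha, beta in [(1, 1), (2, -3), (Fraction(1, 2), 5)]:
    w = combo(alpha, beta)
    print(w, "->", mat_vec(A, w))
```

A finite set of test values does not prove the property, of course; the proof is the one-line computation given under Property 2.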

One should always bear in mind that a nonhomogeneous system need not be consistent. A simple example of an inconsistent system is:

\[
\begin{aligned}
x_1 - x_2 + x_3 &= 1 \\
x_1 - x_2 + x_3 &= 2.
\end{aligned}
\]

We now consider the properties of solutions for a consistent nonhomogeneous system $Ax = b$.

Property 3. If $u$ and $v$ are solutions of the nonhomogeneous system $Ax = b$, then $u - v$ is a solution of the corresponding homogeneous system $Ax = 0$.

Proof $A(u - v) = Au - Av = b - b = 0$.

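Example 2. Consider $Ax = b$ where $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$ and $b = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$.

The vector $u = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ is a solution since

\[ Au = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \]

and $v = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}$ is also a solution since

\[ Av = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}. \]

However, $u - v = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$ is a solution of $Ax = 0$ since

\[ A(u - v) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]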

Property 4. If $x_p$ is a particular solution of $Ax = b$, and $x_h$ is a solution of $Ax = 0$, then $x_p + x_h$ is also a solution of $Ax = b$. Furthermore, if $x_h$ is the general solution of $Ax = 0$ then $x_p + x_h$ is the general solution of $Ax = b$.

Proof Since $Ax_p = b$ and $Ax_h = 0$, it follows that $A(x_p + x_h) = b + 0 = b$; thus $x_p + x_h$ is a solution. Now to complete the second part of the proof we must show that if $x$ is any solution of $Ax = b$, it can be written in the form $x = x_p + x_h$. We know that $x_p$ is a solution of $Ax = b$, and by Property 3 we have that $x - x_p$ is a solution of $Ax = 0$. Therefore we must have $x - x_p = x_h$, since $x_h$ is the general solution of $Ax = 0$, and thus $x = x_p + x_h$ as desired.

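Example 3. Consider the single linear equation $x_1 + x_2 - 3x_3 = 2$. It is clear that $x_2$ and $x_3$ may be taken as arbitrary or free variables and then $x_1$, called a basic variable, is determined in terms of $x_2$ and $x_3$ by $x_1 = 2 - x_2 + 3x_3$. The general solution in vector form is therefore

\[
x = \begin{bmatrix} 2 - x_2 + 3x_3 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}.
\]

Letting $x_p = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}$ and $x_h = x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}$, we see that $x_p$ is a (particular) solution of the equation (obtained by setting $x_2 = x_3 = 0$), $x_h$ is the general solution of the corresponding homogeneous equation $x_1 + x_2 - 3x_3 = 0$, and $x = x_p + x_h$ is the general solution of the nonhomogeneous equation.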

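Property 5. If the vector $x_1$ is a solution of $Ax = b_1$ and the vector $x_2$ is a solution of $Ax = b_2$, then $\alpha_1 x_1 + \alpha_2 x_2$ is a solution of $Ax = \alpha_1 b_1 + \alpha_2 b_2$.

Proof $A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 Ax_1 + \alpha_2 Ax_2 = \alpha_1 b_1 + \alpha_2 b_2$.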

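Example 4. Let $x_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}$, $x_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$ and $A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix}$.

a. Verify that $Ax_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $Ax_2 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$.

b. Find a solution of $Ax = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$.

Solution

a. \[
Ax_1 = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \qquad
Ax_2 = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 1 & 1 & -1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}.
\]

b. Note that $\begin{bmatrix} 1 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 3 \\ 4 \end{bmatrix}$; thus, by Property 5, $x = 2x_1 - x_2 = \begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$ must be a solution of $Ax = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$.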

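Exercises 3.3

1. Let $x_1 = \begin{bmatrix} 1 \\ 0 \\ 2 \\ 4 \end{bmatrix}$ and $x_2 = \begin{bmatrix} 2 \\ 1 \\ 3 \\ 4 \end{bmatrix}$ be solutions of $Ax = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$. Without attempting to find the matrix $A$:

a. Find infinitely many solutions of $Ax = 0$.

b. Find one solution of $Ax = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix}$.

c. Find infinitely many solutions of the equation in b.

2. Let $x_1 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ be a solution of $Ax = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $x_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ a solution of $Ax = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

a. Find a solution of $Ax = \begin{bmatrix} 3 \\ 6 \end{bmatrix}$.

b. Find a solution of $Ax = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ for arbitrary values of $b_1$ and $b_2$.

3. If $x_1$ and $x_2$ are solutions of $Ax = b$, what equation does $x_1 + x_2$ satisfy?

4. Explain why a system of equations $Ax = b$ cannot have exactly two distinct solutions.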

3.4 Elementary Operations, Equivalent Systems

Two systems of $m$ equations in $n$ unknowns, $Ax = b$ and $\tilde{A}x = \tilde{b}$, are called equivalent if they have the same set of solutions, or equivalently, if every solution of $Ax = b$ is a solution of $\tilde{A}x = \tilde{b}$ and vice versa. There are three elementary operations on a system of equations which produce an equivalent system. They are type I: interchange of two equations; type II: multiplication of an equation by a nonzero constant; and type III: addition of a multiple of one equation to another. These are used in the method of Gaussian elimination to produce an equivalent system which is easy to solve.

Instead of performing elementary operations on the equations of a system $Ax = b$, we may perform elementary row operations on the augmented matrix $[A \mid b]$. These elementary row operations are:

Type I. Interchange two rows ($r_i \leftrightarrow r_j$).
Type II. Multiply a row by a nonzero constant ($c\,r_i$, $c \neq 0$).
Type III. Add a multiple of one row to another ($c\,r_i + r_j$).

Definition 1. If a matrix $C$ is obtained from a matrix $B$ by a sequence of elementary row operations, then $C$ is said to be row equivalent to $B$.

The following theorem shows the connection between row equivalence and solving systems of equations.

Theorem 1. If the augmented matrix $[A \mid b]$ is row equivalent to the augmented matrix $[\tilde{A} \mid \tilde{b}]$, the corresponding systems $Ax = b$ and $\tilde{A}x = \tilde{b}$ are equivalent (i.e. have the same solutions).

Proof It is obvious that elementary row operations of types I and II do not change the solutions of the corresponding systems. If the row operation of type III is applied to the matrix $[A \mid b]$ to produce $[\tilde{A} \mid \tilde{b}]$, only the $j$th rows differ; say the new $j$th row is $\tilde{r}_j = c\,r_i + r_j$. By performing the operation $-c\,r_i + \tilde{r}_j = r_j$, the original $j$th row is obtained. This implies that any solution of $Ax = b$ is a solution of $\tilde{A}x = \tilde{b}$ and conversely.

The following example illustrates how we may use elementary operations on a system to obtain another system that is easy to solve. Consider the system

\[
\begin{aligned}
x_1 + x_2 + x_3 + x_4 &= 4 \\
x_1 + x_2 - x_3 - 3x_4 &= -2 \\
3x_1 + 3x_2 + x_3 - x_4 &= 6
\end{aligned}
\]

The corresponding augmented matrix is

\[
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 1 & 1 & -1 & -3 & -2 \\ 3 & 3 & 1 & -1 & 6 \end{array}\right].
\]

We use the 1 in the first row and first column as a pivot and zero out the elements below it by doing the operations $-r_1 + r_2$ and $-3r_1 + r_3$ to obtain:

\[
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 0 & 0 & -2 & -4 & -6 \\ 0 & 0 & -2 & -4 & -6 \end{array}\right]. \tag{1}
\]

Next use the $-2$ in the second row as a pivot and perform $-r_2 + r_3$:

\[
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 0 & 0 & -2 & -4 & -6 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]. \tag{2}
\]

As we shall see in the next section this matrix is in a row echelon form. If we write down the equations corresponding to this matrix, it will be clear how to obtain the solutions. The equations are

\[
\begin{aligned}
x_1 + x_2 + x_3 + x_4 &= 4 \\
-2x_3 - 4x_4 &= -6 \\
0 &= 0.
\end{aligned}
\]

We see that we may take $x_4$ and $x_2$ as arbitrary, or free variables, and then we may obtain $x_3$ and $x_1$, called basic variables, in terms of the free variables. The process of solving the second equation for $x_3$, substituting it into the first equation, and solving for $x_1$ is called back substitution. Rather than doing this we shall perform some further row operations on the matrix in (2). First we make the second pivot equal to one by performing $-\tfrac{1}{2} r_2$:

\[
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]

Finally we perform $-r_2 + r_1$ to get

\[
\left[\begin{array}{cccc|c} 1 & 1 & 0 & -1 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]

This matrix is said to be in reduced row echelon form. If we write the equations corresponding to this augmented matrix we get

\[
\begin{aligned}
x_1 + x_2 - x_4 &= 1 \\
x_3 + 2x_4 &= 3 \\
0 &= 0.
\end{aligned}
\]

It is now clear that the basic variables $x_1$, $x_3$, corresponding to the pivot columns, may be solved in terms of the free variables $x_2$ and $x_4$, corresponding to the columns without pivots (and to the left of the vertical line). We have the so called terminal equations

\[
\begin{aligned}
x_1 &= 1 - x_2 + x_4 \\
x_3 &= 3 - 2x_4
\end{aligned}
\]

In vector form the solutions are

\[
x = \begin{bmatrix} 1 - x_2 + x_4 \\ x_2 \\ 3 - 2x_4 \\ x_4 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 3 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \end{bmatrix}.
\]

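It is a good idea to get into the habit of checking solutions:

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -3 \\ 3 & 3 & 1 & -1 \end{bmatrix}
\left( \begin{bmatrix} 1 \\ 0 \\ 3 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \end{bmatrix} \right)
= \begin{bmatrix} 4 \\ -2 \\ 6 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} 4 \\ -2 \\ 6 \end{bmatrix}.
\]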
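The check above is easy to automate. This is a minimal sketch, assuming the system of this section has coefficient rows (1, 1, 1, 1), (1, 1, -1, -3), (3, 3, 1, -1) and right hand side (4, -2, 6); the names `mat_vec` and `general_solution` are illustrative.

```python
def mat_vec(A, x):
    """Row-column product Ax for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = [[1, 1, 1, 1],
     [1, 1, -1, -3],
     [3, 3, 1, -1]]
b = [4, -2, 6]

x_p = [1, 0, 3, 0]    # particular solution (free variables set to 0)
v1 = [-1, 1, 0, 0]    # multiplies the free variable x2
v2 = [1, 0, -2, 1]    # multiplies the free variable x4


def general_solution(x2, x4):
    """Return x_p + x2*v1 + x4*v2 for given values of the free variables."""
    return [p + x2 * a + x4 * c for p, a, c in zip(x_p, v1, v2)]


# Whatever values the free variables take, A x should reproduce b.
for x2, x4 in [(0, 0), (1, 0), (0, 1), (-2, 5)]:
    x = general_solution(x2, x4)
    print(x, mat_vec(A, x) == b)
```

Each line printed should end in True; a False would point to an arithmetic slip in the reduction.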

Exercises 3.4

1. If the matrix $B$ is obtained from $A$ by performing an elementary operation of type I, what operation must be performed on $B$ to get back $A$? How about an operation of type II?

2. a. Show $A$ is row equivalent to itself.

b. Show that if $A$ is row equivalent to $B$ then $B$ is row equivalent to $A$.

c. Show that if $A$ is row equivalent to $B$ and $B$ is row equivalent to $C$ then $A$ is row equivalent to $C$.

3.5 Row Echelon Form (REF) and Reduced Row Echelon Form (RREF)

Definition 1. A matrix is said to be in row echelon form (REF) if the following two conditions hold:

1. The first nonzero entries (called pivots) in each row move to the right as you move down the rows.

2. Any zero rows are at the bottom.

Example 1.

\[
A = \begin{bmatrix} 0 & 2 & 2 & 4 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 2 & 6 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 & 2 & 2 & 4 & 2 \\ 0 & 0 & 0 & 2 & 6 \\ 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]

The matrix $A$ is not in row echelon form; it violates both conditions. The matrix $B$ is in REF.

Definition 2. A matrix is said to be in reduced row echelon form (RREF) if it is in row echelon form and in addition:

3. All the pivots = 1.

4. The elements above (and therefore below) each pivot = 0.

Example 2.

\[
C = \begin{bmatrix} 0 & 1 & 3 & 2 & 1 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
D = \begin{bmatrix} 0 & 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]

The matrix $C$ is not in RREF, although it is in REF, while the matrix $D$ is in RREF.

Theorem 1. Every matrix can be reduced to REF or RREF by a sequence of elementary row operations.

Proof The algorithm for the reduction is:

Part I. Reduction to REF (The Forward Pass):

Step 1. (a) Find the leftmost nonzero column and pick a nonzero entry. If there is none you are through.
(b) Bring this nonzero entry or pivot to the first row by interchanging rows.
(c) Zero out all entries below this pivot by adding a multiple of this row to all rows below it.

Step 2. Ignore the first row of the matrix obtained in Step 1 and repeat Step 1 on the remaining matrix.

Step 3. Ignore the first two rows and repeat Step 1. Continue in this manner until the remaining matrix consists entirely of zero rows or you run out of rows.

Clearly this produces a matrix in REF. To get the matrix into RREF proceed to the next part of the algorithm:

Part II. Reduction to RREF (The Backward Pass): Starting with the pivot in the last nonzero row and working upwards do

Step 4. Zero out all elements above the pivot.

Step 5. Make the pivot equal to 1 by dividing each element in the row by the pivot.

This produces a matrix in RREF.

One example of the reduction was given in the last section. Here is another example.

A =

0 2 3 13 1 1 12 2 1 1

There are many different ways to use elementary operations to get to a REF. For this simple integer matrix, using hand calculations, we have chosen a way that avoids fractions until the last few steps.

r1 r2

3 1 1 10 2 3 12 2 1 1

,r3 + r1

1 3 0 20 2 3 12 2 1 1

,2r1 + r3

1 3 0 20 2 3 10 8 1 5

,

4r2 + r3

1 3 0 20 2 3 10 0 11 1

This is now in a REF. Now we proceed to the RREF:

111r3

1 3 0 20 2 3 10 0 1 1/11

, 3r3 + r2

1 3 0 20 2 0 14/110 0 1 1/11

, 12r2

1 3 0 20 1 0 7/110 0 1 1/11

,

One last step and we have the RREF.

3r2 + r1

1 0 0 1/110 1 0 7/110 0 1 1/11

.
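The two-pass reduction described in the proof of Theorem 1 can be sketched in code. The version below is one possible reading of Steps 1 to 5, written in plain Python with exact rational arithmetic; the function names `ref` and `rref` are illustrative, and the test matrix is the augmented matrix of Section 3.4 with the signs assumed there (second row 1, 1, -1, -3, -2). It assumes a nonempty matrix given as a list of equal-length rows.

```python
from fractions import Fraction


def ref(M):
    """Forward pass: return (a row echelon form of M, list of pivot positions)."""
    A = [[Fraction(v) for v in row] for row in M]
    m, n = len(A), len(A[0])
    pivots = []
    row = 0
    for col in range(n):
        if row == m:
            break
        # Step 1(a): find a nonzero entry in this column, at or below `row`.
        pr = next((r for r in range(row, m) if A[r][col] != 0), None)
        if pr is None:
            continue  # column is zero in the remaining rows; move right
        # Step 1(b): bring the pivot up by a row interchange (type I).
        A[row], A[pr] = A[pr], A[row]
        # Step 1(c): zero out the entries below the pivot (type III).
        for r in range(row + 1, m):
            factor = A[r][col] / A[row][col]
            A[r] = [a - factor * p for a, p in zip(A[r], A[row])]
        pivots.append((row, col))
        row += 1  # Steps 2 and 3: ignore the rows already processed
    return A, pivots


def rref(M):
    """Backward pass applied to ref(M): return (RREF of M, pivot positions)."""
    A, pivots = ref(M)
    for row, col in reversed(pivots):
        # Step 4: zero out all elements above the pivot.
        for r in range(row):
            factor = A[r][col] / A[row][col]
            A[r] = [a - factor * p for a, p in zip(A[r], A[row])]
        # Step 5: make the pivot equal to 1 (type II).
        pivot = A[row][col]
        A[row] = [a / pivot for a in A[row]]
    return A, pivots


augmented = [[1, 1, 1, 1, 4],
             [1, 1, -1, -3, -2],
             [3, 3, 1, -1, 6]]
R, pivots = rref(augmented)
for r in R:
    print([str(v) for v in r])
# rows: [1, 1, 0, -1, 1], [0, 0, 1, 2, 3], [0, 0, 0, 0, 0]
print("pivot positions:", pivots)  # [(0, 0), (1, 2)]
```

The sketch always picks the first nonzero entry it finds as the pivot, whereas a hand calculation is free to pick any nonzero entry; the REF obtained may differ, but the RREF does not.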

Although there are many different row echelon forms for a given matrix, the RREF is unique. We present the following theorem without proof.

Theorem 2. If $A$ is a matrix, the RREF of $A$ is unique, that is, it is independent of the sequence of elementary row operations used to obtain it. In particular the number of nonzero rows in the RREF is also unique.

It follows that the positions of the pivots, and the number of nonzero rows in any REF are unique.

Definition 3. If $A$ is a matrix, the number of nonzero rows in any REF for $A$ is called the rank of $A$, denoted by rank $(A)$. The columns of a REF that contain the pivots are called basic columns of $A$.

The following properties of the rank of a matrix follow easily from the definitions and Theorem 1.

Corollary. If $A$ is an $m \times n$ matrix, then

1. rank $(A)$ = the number of nonzero rows in any REF for $A$.
2. rank $(A)$ = the number of basic columns in any REF for $A$.
3. rank $(A)$ = the number of pivots in any REF for $A$.
4. rank $(A) \le m$ (the number of rows in $A$).
5. rank $(A) \le n$ (the number of columns in $A$).
6. If $A$ is row equivalent to $B$, then rank $(A)$ = rank $(B)$.

Example 3. The zero matrix has rank 0. There are no basic columns.

Example 4. The $n \times n$ identity matrix $I$ has rank $n$. Every column is basic.

Example 5. Consider $A = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 7 & 6 \end{bmatrix}$.

To find the rank we reduce it to REF:

\[
-2r_1 + r_2: \quad \begin{bmatrix} 1 & 3 & 4 \\ 0 & 1 & -2 \end{bmatrix}
\]

Since there are two nonzero rows in a REF $(A)$, we have rank $(A) = 2$. The first two columns of $A$ are basic.

Although the rank of a matrix is defined in terms of a row echelon form, the rank tells us something about the matrix itself. As we shall see later, the rank of a matrix is the number of independent rows in the matrix, which is also equal to the number of independent columns; in fact the basic columns of the original matrix form an independent set.

Exercises 3.5

1. What is the rank of an $m \times n$ matrix all of whose rows are identical and nonzero?

2. What is the rank of an $m \times n$ matrix all of whose columns are identical and nonzero?

3. Describe the RREF for the matrices in the preceding two exercises.

4. By inspection, find the rank of the following:

a. $\begin{bmatrix} 2 & 3 & 4 & 5 \\ 0 & 0 & 2 & 4 \\ 0 & 0 & 4 & 8 \end{bmatrix}$ \qquad b. $\begin{bmatrix} 2 & 3 & 4 & 5 \\ 0 & 0 & 2 & 4 \\ 0 & 0 & 4 & 9 \end{bmatrix}$

5. What is the RREF of an $n \times n$ matrix of rank $n$?

3.6 Solution of Systems of Equations

It is now easy to describe how to solve a system of equations. Given a system $Ax = b$, form the augmented matrix $[A \mid b]$ and find a REF of $[A \mid b]$. From REF $([A \mid b])$ one can see whether or not the equations are consistent.

Theorem 1. The system $Ax = b$ is consistent if and only if any one of the following equivalent conditions holds:

(a) The matrix REF $([A \mid b])$ contains no "bad" rows of the form $[0, 0, \ldots, 0 \mid c]$ with $c \neq 0$.
(b) The last column in REF $([A \mid b])$ is not a basic column.
(c) rank $(A)$ = rank $([A \mid b])$.

Proof (a) A bad row $[0, 0, \ldots, 0 \mid c]$ with $c \neq 0$ corresponds to an equation

\[ 0 \cdot x_1 + 0 \cdot x_2 + \cdots + 0 \cdot x_n = c, \quad c \neq 0. \]

Obviously no choice of the variables will satisfy this equation; thus the system is inconsistent. (b) and (c) are simply restatements of (a).
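Condition (c) gives a test that is simple to program. The following is a minimal sketch, not a robust numerical routine: it counts pivots with exact rational arithmetic and compares rank (A) with rank ([A | b]). The function names `rank` and `is_consistent` and the small 3 x 2 system are illustrative assumptions.

```python
from fractions import Fraction


def rank(M):
    """Number of pivots found when M is reduced to a row echelon form."""
    A = [[Fraction(v) for v in row] for row in M]
    m, n = len(A), len(A[0])
    r = 0  # number of pivots found so far
    for col in range(n):
        if r == m:
            break
        pivot_row = next((i for i in range(r, m) if A[i][col] != 0), None)
        if pivot_row is None:
            continue
        A[r], A[pivot_row] = A[pivot_row], A[r]
        for i in range(r + 1, m):
            factor = A[i][col] / A[r][col]
            A[i] = [a - factor * p for a, p in zip(A[i], A[r])]
        r += 1
    return r


def is_consistent(A, b):
    """Ax = b is consistent exactly when rank(A) == rank([A | b])."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)


A = [[1, 1],
     [1, 3],
     [1, 4]]
print(is_consistent(A, [1, 3, 0]))  # False: the reduction produces a bad row
print(is_consistent(A, [1, 3, 4]))  # True: x1 = 0, x2 = 1 is a solution
```

When the ranks differ, the extra pivot of the augmented matrix sits in the last column, which is exactly the bad row of condition (a).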

If the equations are consistent and $r$ = rank $(A)$, then the variables corresponding to the $r$ basic columns are basic variables; the remaining $n - r$ variables are free variables or arbitrary variables. We may solve for the basic variables in terms of the free variables. This can be done from any REF by back substitution (this is called solution by Gaussian elimination) or from the RREF using the terminal equations (this is called Gauss-Jordan reduction). The general solution can then be written down in vector form. Here are several examples.

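Example 1. Consider the system

\[
\begin{aligned}
x_1 + x_2 &= 1 \\
x_1 + 3x_2 &= 3 \\
x_1 + 4x_2 &= 0.
\end{aligned}
\]

We proceed to reduce the augmented matrix to REF:

\[
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 1 & 3 & 3 \\ 1 & 4 & 0 \end{array}\right],
\quad
\begin{matrix} -r_1 + r_2 \\ -r_1 + r_3 \end{matrix}
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 3 & -1 \end{array}\right],
\quad
-\tfrac{3}{2} r_2 + r_3
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & -4 \end{array}\right]
\]

The last row of the last augmented matrix is a bad row; thus the system is inconsistent.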

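Example 2. Consider the system

\[
\begin{aligned}
x_1 + x_2 + x_3 + x_4 &= 4 \\
x_1 + x_2 - x_3 - 2x_4 &= -1.
\end{aligned}
\]

We proceed to reduce the augmented matrix to REF:

\[
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 1 & 1 & -1 & -2 & -1 \end{array}\right],
\quad
-r_1 + r_2
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 0 & 0 & -2 & -3 & -5 \end{array}\right].
\]

The last matrix is in REF; it is clear here that the basic variables are $x_1$ and $x_3$, and the other variables are free. Instead of solving for $x_3$ and back substituting into the first equation, we shall reduce the matrix to RREF:

\[
-\tfrac{1}{2} r_2
\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 4 \\ 0 & 0 & 1 & 3/2 & 5/2 \end{array}\right],
\quad
-r_2 + r_1
\left[\begin{array}{cccc|c} 1 & 1 & 0 & -1/2 & 3/2 \\ 0 & 0 & 1 & 3/2 & 5/2 \end{array}\right].
\]

The terminal equations are

\[
\begin{aligned}
x_1 &= 3/2 - x_2 + x_4/2 \\
x_3 &= 5/2 - 3x_4/2,
\end{aligned}
\]

and the general solution in vector form is

\[
x = \begin{bmatrix} 3/2 - x_2 + x_4/2 \\ x_2 \\ 5/2 - 3x_4/2 \\ x_4 \end{bmatrix}
= \begin{bmatrix} 3/2 \\ 0 \\ 5/2 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1/2 \\ 0 \\ -3/2 \\ 1 \end{bmatrix}.
\]

Letting

\[
x_p = \begin{bmatrix} 3/2 \\ 0 \\ 5/2 \\ 0 \end{bmatrix}, \qquad
v_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \qquad
v_2 = \begin{bmatrix} 1/2 \\ 0 \\ -3/2 \\ 1 \end{bmatrix},
\]

we note that $x_p$ is a particular solution of the nonhomogeneous system, $v_1$ and $v_2$ are solutions of the corresponding homogeneous system, $x_h = x_2 v_1 + x_4 v_2$ is the general solution of the homogeneous system, and $x_p + x_h$ is the general solution of the nonhomogeneous system.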

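Example 3. Consider the system

\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 3 \\
3x_1 - x_2 - 3x_3 &= -1 \\
2x_1 + 3x_2 + x_3 &= 4.
\end{aligned}
\]

The augmented matrix and its RREF are (leaving out the details)

\[
[A \mid b] = \left[\begin{array}{ccc|c} 1 & 2 & 1 & 3 \\ 3 & -1 & -3 & -1 \\ 2 & 3 & 1 & 4 \end{array}\right], \qquad
\text{RREF}\,([A \mid b]) = \left[\begin{array}{ccc|c} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 4 \end{array}\right].
\]

The terminal equations are simply $x_1 = 3$, $x_2 = -2$ and $x_3 = 4$; thus the unique solution vector is

\[ x = \begin{bmatrix} 3 \\ -2 \\ 4 \end{bmatrix}. \]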

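If we are dealing with a homogeneous system $Ax = 0$, it is not necessary to write the augmented matrix $[A \mid 0]$ since the last column will always remain a zero column when reducing to REF; one simply reduces $A$ to REF.

Example 4.

\[
\begin{aligned}
x_1 + x_2 &= 0 \\
x_1 - x_2 &= 0 \\
2x_1 + 2x_2 &= 0.
\end{aligned}
\]

We have (again leaving out the details)

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 2 & 2 \end{bmatrix}, \qquad
\text{RREF}\,(A) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}.
\]

The RREF corresponds to the equations $x_1 = 0$, $x_2 = 0$ and $0 = 0$; thus the only solution is the trivial solution $x = 0$.

Example 5.

\[
\begin{aligned}
x_1 + x_2 + x_3 &= 0 \\
x_1 - x_2 + x_3 &= 0 \\
2x_1 + 2x_3 &= 0.
\end{aligned}
\]

We have

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \\ 2 & 0 & 2 \end{bmatrix}, \qquad
\text{RREF}\,(A) = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]

The terminal equations are

\[
\begin{aligned}
x_1 &= -x_3 \\
x_2 &= 0,
\end{aligned}
\]

and the general solution is

\[
x = \begin{bmatrix} -x_3 \\ 0 \\ x_3 \end{bmatrix} = x_3 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}.
\]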

We now collect some facts about solutions in the following theorem.

Theorem 2. If the $m \times n$ system $Ax = b$ is consistent and $r$ = rank $(A)$ (= rank $([A \mid b])$), then

a. if $r = n$ there is a unique solution.
b. if $r < n$ there are infinitely many solutions with $n - r$ free variables. The general solution has the form

\[ x = x_p + x_{t_1} v_1 + x_{t_2} v_2 + \cdots + x_{t_{n-r}} v_{n-r} \]

where $x_{t_i}$ is the $i$th free variable, $x_p$ is a particular solution of $Ax = b$, the $v_i$ are solutions of the homogeneous equation $Ax = 0$, and the general solution of $Ax = 0$ is

\[ x_h = x_{t_1} v_1 + x_{t_2} v_2 + \cdots + x_{t_{n-r}} v_{n-r}. \]

Although the above theorem holds for any system, it is worthwhile to state the conclusions of the theorem for homogeneous systems.

Theorem 3. Consider the $m \times n$ homogeneous system $Ax = 0$ where $r$ = rank $(A)$. Then

(a) if $r = n$ the system has only the trivial solution $x = 0$.
(b) if $r < n$ there are infinitely many solutions given in terms of the $n - r$ free variables. The general solution can be written

\[ x_h = x_{t_1} v_1 + x_{t_2} v_2 + \cdots + x_{t_{n-r}} v_{n-r} \]

where the $x_{t_i}$ are free variables and the $v_i$ are solutions of $Ax = 0$.

Thus we see that a nontrivial solution of $Ax = 0$ exists if and only if the rank of $A$ is less than the number of unknowns. There is one very simple but important case where this occurs, given in the following theorem.

Theorem 4. The $m \times n$ system $Ax = 0$ with $m < n$ (more unknowns than equations) always has a nontrivial solution.

Proof rank $(A) \le m$ but $m < n$; therefore rank $(A) < n$.

Exercises 3.6

In problems 1-5 determine whether or not the systems are consistent. If consistent, find the general solution in vector form and check.

1. 3x + 4y = 49x + 12y = 6

2. x y z + w = 0x + 2y z w = 1x 2y + z/2 + w/2 = 0

3. 3x1 6x2 + 7x3 = 02x1 x2 + x3 = 17x1 + x2 6x3 = 22x1 + 2x2 4x3 = 1

4. x1 + x2 + 2x3 + 2x5 = 1x2 + x3 + x5 = 2

2x2 + 3x3 + x4 + 4x5 = 3

5. x1 + x2 + 2x3 + 2x5 = 1x2 + x3 + x5 = 2x2 + x3 + x4 + 4x5 = 3

x1 + x2 + 2x3 + 2x5 = 1

In problems 6-8 do only as much work as necessary to determine whether or not the equations are consistent.

6.[

2 4 25 10 1

]7.

[1 5 9 12 6 7 6

]8.

1 0 1 10 1 1 20 1 1 0


9. Solve Ax = 0 for the following matrices A, write the solution in vector form and check.

a.[

0 2 00 1 0

]b.

1 2 32 3 43 4 5

c.

1 3 0 5 6

2 6 2 14 81 3 2 3 10

10. For each of the following matrices A do only as much work as necessary to determine whether or not Ax = 0 has a nontrivial solution.

a.[

1 2

22 1 4

]b.

2 1 01 0 1

0 1 2

c.

1 22 35 69 27

11. For each coefficient matrix below find the values of the parameter a, if any, for which the corresponding homogeneous systems have a nontrivial solution.

a.[

1 22 a

]b.

1 2 32 3 44 5 a

12. Can a system of 3 equations in 5 unknowns have a unique solution? Explain.

3.7 More on Consistency of Systems

For a given matrix $A$ it is possible for the system $Ax = b$ to have a solution for some right hand side vectors $b$ and not have a solution for other $b$. We wish to characterize those vectors $b$ for which $Ax = b$ is consistent.

Example 1. Find the conditions on $b_1$, $b_2$, $b_3$, if any, such that the following system is consistent.

\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 &= b_1 \\
2x_1 + 3x_2 + 4x_3 &= b_2 \\
3x_1 + 4x_2 + 5x_3 &= b_3.
\end{aligned}
\]

The augmented matrix and a REF are

\[
[A \mid b] = \left[\begin{array}{ccc|c} 1 & 2 & 3 & b_1 \\ 2 & 3 & 4 & b_2 \\ 3 & 4 & 5 & b_3 \end{array}\right], \qquad
\text{REF}\,([A \mid b]) = \left[\begin{array}{ccc|c} 1 & 2 & 3 & b_1 \\ 0 & 1 & 2 & 2b_1 - b_2 \\ 0 & 0 & 0 & b_3 - 2b_2 + b_1 \end{array}\right].
\]

We see that the system has a solution if and only if $b_3 - 2b_2 + b_1 = 0$.

The following theorem shows one situation when we can guarantee that $Ax = b$ is consistent for every $b$.

Theorem 1. If $A$ is an $m \times n$ matrix then the system $Ax = b$ is consistent for every $b$ if and only if rank $(A) = m$.

Proof If rank $(A) = m$ then every row of any REF $(A)$ contains a pivot, and thus there cannot be any bad rows in REF $([A \mid b])$ and the system must be consistent. Conversely, if the system is consistent for every $b$, there cannot be a zero row in a REF $(A)$ and thus rank $(A) = m$.

The existence of solutions in the important special case when the number of equations is the same as the number of unknowns deserves separate mention.

Theorem 2. If $A$ is an $n \times n$ matrix, the system $Ax = b$ has a unique solution for every $b$ if and only if any one of the following holds:

1. rank $(A) = n$.
2. RREF $(A) = I$, or $A$ is row equivalent to $I$.
3. $Ax = 0$ has only the trivial solution.
4. Every column of $A$ is a basic column.

Proof From Theorem 1 we know that $Ax = b$ has a solution for every $b$ if and only if rank $(A) = n$ = the number of rows. From Theorem 2 of the last section we have that the solution is unique if and only if rank $(A) = n$ = the number of columns. This proves statement 1. However, we already know that statement 1 is equivalent to statements 2, 3 and 4.

The following theorem for the corresponding homogeneous system follows immediately from Theorem 2.

Theorem 3. If $A$ is an $n \times n$ matrix, the system $Ax = 0$ has a nontrivial solution if and only if any one of the following holds:

1. rank $(A) < n$.
2. RREF $(A) \neq I$, or $A$ is not row equivalent to $I$.
3. $Ax = b$ does not have a solution for every $b$, and if a solution exists for a particular $b$, it is not unique.
4. At least one column of $A$ is not a basic column.

There is another way to look at the matrix product $Ax$ which we will need to get our last characterization of consistency. If $A$ is an $m \times n$ matrix and $x$ is an $n \times 1$ vector we have

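\[
Ax = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix}.
\]

We may factor out the $x_i$'s to obtain

\[
Ax = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} x_1
+ \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} x_2
+ \cdots
+ \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} x_n. \tag{1}
\]

Thus $Ax$ is a linear combination of the columns of $A$ weighted by the components of $x$. If we use the notation $c_j(A)$ for the $j$th column of $A$ we have

\[ Ax = c_1(A)x_1 + c_2(A)x_2 + \cdots + c_n(A)x_n. \tag{2} \]

From Equation (2) we can deduce the following simple criterion for existence of a solution of $Ax = b$.

Theorem 4. If $A$ is an $m \times n$ matrix then $Ax = b$ has a solution if and only if $b$ is a linear combination of the columns of $A$.

Proof From Equation (2), $Ax = b$ can be written as

\[ c_1(A)x_1 + c_2(A)x_2 + \cdots + c_n(A)x_n = b. \tag{3} \]

Clearly a solution exists if and only if $b$ is a linear combination of the columns of $A$.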
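The two ways of looking at the product, by rows (Definition 3 of Section 3.2) and by columns (Equation (2)), can be compared directly. The sketch below is illustrative only; the function names and the sample vector x are assumptions made for this example.

```python
def mat_vec_by_rows(A, x):
    """Entry i of Ax is the product of row i of A with x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def mat_vec_by_columns(A, x):
    """Ax built up as x1*c1(A) + x2*c2(A) + ... + xn*cn(A)."""
    m = len(A)
    result = [0] * m
    for j, xj in enumerate(x):
        column_j = [A[i][j] for i in range(m)]
        result = [r + xj * c for r, c in zip(result, column_j)]
    return result


A = [[1, 2, 3],
     [2, 3, 4],
     [3, 4, 5]]
x = [2, -1, 1]

print(mat_vec_by_rows(A, x))     # [3, 5, 7]
print(mat_vec_by_columns(A, x))  # [3, 5, 7]
```

Since b = (3, 5, 7) was produced as a combination of the columns of A, Theorem 4 says Ax = b is consistent, and x = (2, -1, 1) is one solution.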

Example 2. Suppose $A$ is a $3 \times 4$ matrix and $b = 3c_1(A) + 5c_3(A)$. What is a solution to $Ax = b$?

From Equation (3), we see that $Ax = b$ can be written

\[ c_1(A)x_1 + c_2(A)x_2 + c_3(A)x_3 + c_4(A)x_4 = b. \]

Clearly this equation is satisfied by $x_1 = 3$, $x_3 = 5$ and the other $x_i$'s $= 0$, so a solution vector is

\[ x = \begin{bmatrix} 3 \\ 0 \\ 5 \\ 0 \end{bmatrix}. \]

For the homogeneous system $Ax = 0$, Equation (3) still holds with $b = 0$:

\[ c_1(A)x_1 + c_2(A)x_2 + \cdots + c_n(A)x_n = 0. \tag{4} \]

The following result follows immediately.

Theorem 5. If $A$ is an $m \times n$ matrix, $Ax = 0$ has a nontrivial solution if and only if at least one column of $A$ is a linear combination of the other columns.

Example 3. Suppose $A$ is a $3 \times 4$ matrix and $c_1(A) = 5c_4(A) - 3c_2(A)$. What is a nontrivial solution to $Ax = 0$?

From Equation (4), we see that $Ax = 0$ can be written

\[ c_1(A)x_1 + c_2(A)x_2 + c_3(A)x_3 + c_4(A)x_4 = 0. \]

Rewriting the given linear relation among the columns we have

\[ c_1(A) + 3c_2(A) - 5c_4(A) = 0. \]

Comparing the two we find a nontrivial solution is $x = \begin{bmatrix} 1 \\ 3 \\ 0 \\ -5 \end{bmatrix}$.

In order to improve Theorem 5, we must take a closer look at the information about the columns of a matrix $B$ that can be obtained from the columns of RREF $(B)$. An example will make it clear. Let

\[
B = \begin{bmatrix} 1 & 1 & 1 & 1 & 4 \\ 1 & 1 & -1 & -3 & -2 \\ 3 & 3 & 1 & -1 & 6 \end{bmatrix}.
\]

Then $\tilde{B}$ = RREF $(B)$ is

\[
\tilde{B} = \begin{bmatrix} 1 & 1 & 0 & -1 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]

The homogeneous system $Bx = 0$ can be written

\[ c_1(B)x_1 + c_2(B)x_2 + \cdots + c_n(B)x_n = 0 \tag{5} \]

and the system $\tilde{B}x = 0$ can be written

\[ c_1(\tilde{B})x_1 + c_2(\tilde{B})x_2 + \cdots + c_n(\tilde{B})x_n = 0. \tag{6} \]

Since the two systems have the same solutions, Equations (5) and (6) show that if any column of $\tilde{B}$ is a linear combination of the other columns, the corresponding column of $B$ is the same linear combination of the columns of $B$, and conversely. Looking at $\tilde{B}$, it is easy to see that

\[
\begin{aligned}
c_2(\tilde{B}) &= c_1(\tilde{B}), \\
c_4(\tilde{B}) &= -c_1(\tilde{B}) + 2c_3(\tilde{B}), \\
c_5(\tilde{B}) &= c_1(\tilde{B}) + 3c_3(\tilde{B}).
\end{aligned}
\]

Thus we must have, as can easily be verified,

\[
\begin{aligned}
c_2(B) &= c_1(B), \\
c_4(B) &= -c_1(B) + 2c_3(B), \\
c_5(B) &= c_1(B) + 3c_3(B).
\end{aligned}
\]

In other words, every nonbasic column of $\tilde{B}$ is a linear combination of the basic columns of $\tilde{B}$ which lie to the left of it, and therefore every nonbasic column of $B$ is a linear combination of the basic columns of $B$ which lie to the left of it. This is true generally, as is indicated in the following theorem.

Theorem 6. In any matrix, each nonbasic column is a linear combination of the basic columns that lie to the left of it.

The improved version of Theorem 4 may now be stated.

Theorem 7. If $A$ is an $m \times n$ matrix then $Ax = b$ has a solution if and only if $b$ is a linear combination of the basic columns of $A$.

Proof We know that $Ax = b$ is consistent if and only if $b$ is not a basic column of $[A \mid b]$. Therefore the result follows from Theorem 6.

Example 4. Consider the system $Ax = b$ where

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{bmatrix}.
\]

Determine those vectors $b$ for which the system is consistent.

Solution By computing any REF $(A)$ we find that the first two columns are basic columns. Therefore a solution exists if and only if $b$ is a linear combination of the first two columns of $A$:

\[
b = \alpha \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + \beta \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}
= \begin{bmatrix} \alpha + 2\beta \\ 2\alpha + 3\beta \\ 3\alpha + 4\beta \end{bmatrix}. \tag{7}
\]

We solved this same example by a different method in Example 1, where we found that the components of $b$ must satisfy $b_3 - 2b_2 + b_1 = 0$. We find that the components of $b$ given in Equation (7) do indeed satisfy this relation.
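The agreement between the two methods can be spot-checked numerically. This is a small sketch; the name `b_from` and the sample coefficient pairs are illustrative, and a handful of test values is of course not a proof (the algebraic check is that $(3\alpha + 4\beta) - 2(2\alpha + 3\beta) + (\alpha + 2\beta) = 0$ identically).

```python
c1 = [1, 2, 3]  # first basic column of A
c2 = [2, 3, 4]  # second basic column of A


def b_from(alpha, beta):
    """Right hand side b = alpha*c1 + beta*c2, as in Equation (7)."""
    return [alpha * p + beta * q for p, q in zip(c1, c2)]


# Each such b should satisfy the condition b3 - 2*b2 + b1 = 0 from Example 1.
for alpha, beta in [(1, 0), (0, 1), (2, -5), (7, 3)]:
    b1, b2, b3 = b_from(alpha, beta)
    print((alpha, beta), [b1, b2, b3], b3 - 2 * b2 + b1)
```

The last number on every printed line is the residual of the consistency condition, which is 0 for each pair.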

Exercises 3.7

1. If $A$ is an $m \times n$ matrix and all the columns of $A$ are identical and nonzero, describe those columns $b$ for which $Ax = b$ is consistent.

2. If $A$ is an $m \times n$ matrix and all the rows of $A$ are identical and nonzero, describe those columns $b$ for which $Ax = b$ is consistent.

3. Find the conditions on $b_1, b_2, b_3$, if any, so that the following systems have solutions. Find the solutions and check.

a.
$$\begin{aligned} x_1 + x_2 + x_3 &= b_1 \\ x_1 - x_2 + x_3 &= b_2 \\ x_1 + x_2 - x_3 &= b_3 \end{aligned}$$

b.
$$\begin{aligned} x_1 + x_2 - x_3 &= b_1 \\ 2x_1 - x_2 + x_3 &= b_2 \\ 4x_1 + x_2 - x_3 &= b_3 \end{aligned}$$


4. Suppose $A$ is $m \times n$ and rank$(A) = r$. What conditions must exist among the three numbers $r$, $m$ and $n$ in each of the following cases?

a. $Ax = 0$ has only the trivial solution.

b. $Ax = b$ has a solution for some $b$ but does not have a solution for some other $b$, but when a solution exists it is unique.

c. $Ax = b$ has a solution for every $b$, but the solution is not unique.

5. If $A$ is a $3 \times 4$ matrix and all the columns of $A$ are identical and nonzero, what is a nontrivial solution of $Ax = 0$?

6. If A is 3 5 and c1(A) =

2

13

and rank (A) = 1, describe those vectors b for which Ax = b has

a solution.

7. Suppose A and RREF(A) are

A =

2 4 8 6 30 1 3 2 33 2 0 0 8

, RREF (A) =

1 0 2 0 40 1 3 0 20 0 0 1 1/2

,

a. Express each nonbasic column of A as a linear combination of basic columns.

b. For what vectors b does Ax = b have a solution?

3.8 Matrix Algebra

Suppose $A$ is an $m \times n$ matrix. For each column vector $x \in C^n$ (or $R^n$), the product $Ax$ is a vector in $C^m$ (or $R^m$). Thus we may think of multiplication by $A$ (or $A$ itself) as a transformation (or function) which transforms vectors in $C^n$ into vectors in $C^m$.

Since $A0 = 0$ we know that the zero vector in $C^n$ is transformed into the zero vector in $C^m$. We also have the linearity property

$$A(\alpha x + \beta y) = \alpha Ax + \beta Ay \tag{1}$$

for all $x, y \in C^n$ and all scalars $\alpha, \beta$. Thus linear combinations of $x$ and $y$ are transformed into the same linear combinations of $Ax$ and $Ay$. This is illustrated in Figure 1.
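The linearity property (1) can be spot-checked on a small numerical case. This is only an illustrative sketch: the matrix, vectors and scalars below are made up, and the helpers `matvec` and `lin_comb` are hypothetical names, not notation from the text.

```python
def matvec(A, x):
    """Product Ax of a matrix (list of rows) with a vector (list)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def lin_comb(alpha, x, beta, y):
    """Linear combination alpha*x + beta*y of two vectors of equal length."""
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]

A = [[1, 2, 0],
     [-1, 3, 4]]          # a 2 x 3 matrix: maps vectors in R^3 to R^2
x = [1, 0, 2]
y = [3, -1, 1]
alpha, beta = 2, -3

left = matvec(A, lin_comb(alpha, x, beta, y))                 # A(alpha x + beta y)
right = lin_comb(alpha, matvec(A, x), beta, matvec(A, y))     # alpha Ax + beta Ay
print(left, right)
```

Both sides evaluate to the same vector in $R^2$, as (1) requires; one numerical case is of course a check, not a proof.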

Figure 1. The vectors $x$, $y$ and $\alpha x + \beta y$ in $R^n$ and their images $Ax$, $Ay$ and $\alpha Ax + \beta Ay$ in $R^m$.

Thinking of $A$ as a transformation provides us with a natural way to define various operations on matrices. First we need the following fact.

Theorem 1. If $A$ and $B$ are $m \times n$ matrices, then $A = B$ if and only if $Ax = Bx$ for all $x$.

Proof Clearly if $A = B$, then $Ax = Bx$. Conversely, if $Ax = Bx$ then

$$Ax = c_1(A)x_1 + \cdots + c_n(A)x_n = Bx = c_1(B)x_1 + \cdots + c_n(B)x_n$$

where $c_j(A)$ and $c_j(B)$ are the $j$th columns of $A$ and $B$ respectively. Setting $x_1 = 1$ and $x_2 = x_3 = \cdots = x_n = 0$ we see that $c_1(A) = c_1(B)$. In a similar way we can show that $c_j(A) = c_j(B)$ for $j = 2, \ldots, n$. Thus $A = B$.

Definition 1. The sum of two $m \times n$ matrices $A$, $B$ is the $m \times n$ matrix $A + B$ defined by

$$(A + B)x = Ax + Bx \quad \text{for all } n \times 1 \text{ vectors } x. \tag{2}$$

Theorem 2. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $m \times n$ matrices then

$$A + B = [a_{ij} + b_{ij}] \tag{3}$$

Theorem 2 states that to add two matrices one simply adds corresponding elements.

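Proof

$$\begin{aligned}
(A + B)x &= Ax + Bx \\
&= \begin{bmatrix} a_{11}x_1 + \cdots + a_{1n}x_n \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n \end{bmatrix} + \begin{bmatrix} b_{11}x_1 + \cdots + b_{1n}x_n \\ \vdots \\ b_{m1}x_1 + \cdots + b_{mn}x_n \end{bmatrix} \\
&= \begin{bmatrix} (a_{11} + b_{11})x_1 + \cdots + (a_{1n} + b_{1n})x_n \\ \vdots \\ (a_{m1} + b_{m1})x_1 + \cdots + (a_{mn} + b_{mn})x_n \end{bmatrix} \\
&= \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix}.
\end{aligned}$$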

The result now follows from Theorem 1.

Notice that only matrices of the same size can be added.

Definition 2. If $A$ is an $m \times n$ matrix and $\alpha$ is a scalar then $\alpha A$ is the $m \times n$ matrix defined by

$$(\alpha A)x = \alpha(Ax) \quad \text{for all } x. \tag{4}$$

Theorem 3. If $A = [a_{ij}]$ then

$$\alpha A = \alpha[a_{ij}] = [\alpha a_{ij}]. \tag{5}$$

This states that to multiply a matrix by a scalar, each element is multiplied by the scalar. The proof of Theorem 3 is left to the reader.


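Example 1.

$$\begin{bmatrix} 1 & 3 & 4 \\ 0 & -5 & 6 \end{bmatrix} + 2\begin{bmatrix} 0 & -1 & 4 \\ 5 & 6 & 7 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 4 \\ 0 & -5 & 6 \end{bmatrix} + \begin{bmatrix} 0 & -2 & 8 \\ 10 & 12 & 14 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 12 \\ 10 & 7 & 20 \end{bmatrix}$$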

The properties in the following theorem follow easily from Theorems 2 and 3.

Theorem 4. If $A$, $B$, $C$, $O$ are $m \times n$ matrices and $\alpha$, $\beta$ are scalars then
1. $A + B = B + A$ (commutative law of addition)
2. $(A + B) + C = A + (B + C)$ (associative law of addition)
3. $A + O = A$
4. $(\alpha + \beta)A = \alpha A + \beta A$
5. $\alpha O = O$
6. $\alpha A = O$, if $\alpha = 0$.

Suppose $x \in C^p$. If $B$ is an $n \times p$ matrix then $Bx \in C^n$. If $A$ is an $m \times n$ matrix we may form $A(Bx) \in C^m$. This leads us to the following definition.

Definition 3. If $A$ is an $m \times n$ and $B$ is an $n \times p$ matrix, $AB$ is defined to be the $m \times p$ matrix such that

$$(AB)x = A(Bx) \quad \text{for all } x \in C^p. \tag{6}$$

Note that the product $AB$ is only defined if the number of columns in the left hand factor is the same as the number of rows in the right hand factor. The diagram in Figure 2 should be kept in mind.

Figure 2. $A_{m \times n}\, B_{n \times p} = C_{m \times p}$

Theorem 5. If $A$ is an $m \times n$ and $B$ is an $n \times p$ matrix then the product $AB$ is an $m \times p$ matrix and

a. The $(i, j)$th element of $AB = r_i(A)c_j(B)$, where

$$r_i(A)c_j(B) = [a_{i1} \ \cdots \ a_{in}] \begin{bmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{bmatrix} = \sum_{k=1}^{n} a_{ik}b_{kj}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, p.$$

b. $c_j(AB) = Ac_j(B)$, $j = 1, \ldots, p$, or $AB = [Ac_1(B), \ldots, Ac_p(B)]$.

c. $r_i(AB) = r_i(A)B$, $i = 1, \ldots, m$, or

$$AB = \begin{bmatrix} r_1(A)B \\ \vdots \\ r_m(A)B \end{bmatrix}.$$

Before proving this theorem let us look at a numerical example

Example 2. Let

A =

1 1 2 0

2 0 1 31 5 2 2

, B =

3 12 21 00 1

.

Using (a) in Theorem 5 we have

C = AB =

1 1 2 0

2 0 1 31 5 2 2

3 12 21 00 1

=

3 3

5 515 7

.

where for instance c32 = r3(A)c2(B) = [1 5 2 2]

1201

= 7.

Using (b) of Theorem 5, let us compute the 2nd column of AB

c2(AB) = Ac2(B) =

1 1 2 0

2 0 1 31 5 2 2

1201

=

3

57

.

Using (c) of Theorem 5, let us find the 3rd row of AB.

r3(AB) = r3(A)B = [1 5 2 2]

3 12 21 00 1

= [ 15 7 ].

Proof of Theorem 5. We first prove b. For any $x \in C^p$,

$$\begin{aligned}
(AB)x = A(Bx) &= A(c_1(B)x_1 + \cdots + c_p(B)x_p) \\
&= A(c_1(B)x_1) + \cdots + A(c_p(B)x_p) \\
&= (Ac_1(B))x_1 + \cdots + (Ac_p(B))x_p \\
&= [Ac_1(B), Ac_2(B), \ldots, Ac_p(B)]x,
\end{aligned}$$

thus we have $AB = [Ac_1(B), \ldots, Ac_p(B)]$.

To prove a., we look at the $j$th column of $AB$, $c_j(AB) = Ac_j(B)$. We want the $i$th element in this $j$th column:

$$c_j(AB) = \begin{bmatrix} r_1(A) \\ \vdots \\ r_m(A) \end{bmatrix} c_j(B) = \begin{bmatrix} r_1(A)c_j(B) \\ \vdots \\ r_m(A)c_j(B) \end{bmatrix}.$$

Thus the $(i, j)$th element of $AB = r_i(A)c_j(B)$.

To prove c., we can write the $i$th row of $AB$ as

$$r_i(AB) = [r_i(A)c_1(B), r_i(A)c_2(B), \ldots, r_i(A)c_p(B)] = r_i(A)[c_1(B), c_2(B), \ldots, c_p(B)] = r_i(A)B.$$
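The three descriptions of the product in Theorem 5 (entry by entry, column by column, row by row) can be compared on a small example. The sketch below uses made-up matrices and hypothetical helper names (`matmul`, `column`, `matvec`, `vecmat`); it is an illustration under those assumptions, not a construction from the text.

```python
def matmul(A, B):
    """Part a: the (i, j) entry of AB is (row i of A) times (column j of B)."""
    n, p = len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

def column(M, j):
    """The j-th column of M as a list."""
    return [row[j] for row in M]

def matvec(A, x):
    """Matrix times column vector."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def vecmat(r, B):
    """Row vector times matrix."""
    return [sum(r[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]

A = [[1, 2, 0],
     [0, -1, 3]]          # 2 x 3
B = [[2, 1],
     [1, 0],
     [-1, 4]]             # 3 x 2
C = matmul(A, B)          # 2 x 2

# Part b: each column of AB is A times the corresponding column of B.
cols_ok = all(column(C, j) == matvec(A, column(B, j)) for j in range(2))
# Part c: each row of AB is the corresponding row of A times B.
rows_ok = all(C[i] == vecmat(A[i], B) for i in range(2))
print(C, cols_ok, rows_ok)
```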

Theorem 6. If $A$, $B$, $C$ are of the proper orders so that the indicated products are defined then
1. $A(BC) = (AB)C$ (associative law of multiplication)
2. $A(B + C) = AB + AC$ (distributive law)
3. $(A + B)C = AC + BC$ (distributive law)
4. $A(\alpha B) = \alpha(AB) = (\alpha A)B$ for any scalar $\alpha$
5. $AO = O$

We shall prove item 1 and then leave the other proofs to the reader.

Proof of item 1. Let $D = A(BC)$ and $E = (AB)C$. We shall show that $Dx = Ex$ for all $x$. We have, using Definition 3 repeatedly,

$$Dx = (A(BC))x = A((BC)x) = A(B(Cx)) = (AB)(Cx) = ((AB)C)x = Ex.$$

There are several important properties that hold for ordinary algebra which do not hold for matrix algebra:

1. $AB$ is not necessarily the same as $BA$. Note that if $A$ is $m \times n$ and $B$ is $n \times p$ then $AB$ is defined but $BA$ is not defined unless $m = p$. Even if $AB$ and $BA$ are both defined, they need not be equal. For example

$$A = \begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -1 \\ 2 & 0 \end{bmatrix}, \quad AB = \begin{bmatrix} 2 & -2 \\ 7 & -1 \end{bmatrix}, \quad BA = \begin{bmatrix} 1 & -3 \\ 4 & 0 \end{bmatrix}.$$

2. If $AB = 0$ it does not necessarily follow that either $A = 0$ or $B = 0$. For example

$$\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

3. If $AB = AC$ it does not necessarily follow that $B = C$. Note that $AB = AC$ is equivalent to $AB - AC = 0$ or to $A(B - C) = 0$. From fact 2 we cannot conclude that $B = C$ even if $A \neq 0$.

4. If $BA = CA$ it does not necessarily follow that $B = C$.
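These failures of the ordinary rules are easy to reproduce numerically. The sketch below is illustrative only; the matrices are chosen for the demonstration and `matmul` is a hypothetical helper name.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# 1. AB and BA can both be defined and still differ.
A = [[2, 0], [1, 3]]
B = [[1, -1], [2, 0]]
AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)

# 2. A product can be the zero matrix although neither factor is zero.
P = [[1, -1], [-1, 1]]
Q = [[1, 1], [1, 1]]
Z = [[0, 0], [0, 0]]
print(matmul(P, Q) == Z)

# 3. Cancellation fails: PQ = PZ (both are zero) although Q != Z.
print(matmul(P, Q) == matmul(P, Z), Q == Z)
```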

Theorem 7. If $A$ is an $m \times n$ matrix then

$$AI_n = A \tag{7}$$

$$I_mA = A. \tag{8}$$

The proof is left to the reader.

Exercises 3.8

1. In a., b. find, if possible, (i) $5A - 6B$, (ii) $AB$ and (iii) $BA$.

a. A =[

2 11 3

], B =

[4 02 1

]

b. A =

2 1

1 03 1

, B =[

3 1 5 20 4 1 3

]

2. Compute, if possible

a. [1 ,2][

23

][1, 5], b.

[23

][1 2]

[1

5

]c.

[1 3 2

2 5 7

] [0 00 0

]d. [x1, x2 ]

[3 55 3

] [x1x2

]

3. If $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$, find $A^2 = AA$ and $A^3 = A^2A$.

4. Prove or give a counterexample.

a. If the 1st and 3rd rows of $A$ are the same then the 1st and 3rd rows of $AB$ are the same.

b. If the 1st and 3rd columns of $A$ are the same then the 1st and 3rd columns of $AB$ are the same.

c. If the 1st and 3rd columns of $B$ are the same then the 1st and 3rd columns of $AB$ are the same.

d. If the 2nd column of $B = 0$ then the 2nd column of $AB = 0$.

e. If the first column of $A = 0$ then the first column of $AB = 0$.

5. Suppose $A$ is an $n \times n$ matrix and $u$, $v$ are $n \times 1$ column vectors such that

$$Au = 3u - 2v, \qquad Av = 2u - 3v.$$

Let the $n \times 2$ matrix $T$ be defined by $T = [u, v]$. Find a matrix $B$ such that $AT = TB$.

6. Write out $(A + B)^2 = (A + B)(A + B)$.

7. If $A$, $S$, $T$ are $n \times n$ matrices and $TS = I$, simplify $(SAT)^3$.

8. If $A_{6 \times 4} = [a_{ij}]$, $B_{4 \times 6} = [b_{ij}]$, $C = AB$ and $D = BA$,

a. write the expressions for $c_{23}$ and $c_{66}$, if possible,

b. write the expressions for $d_{43}$ and $d_{56}$, if possible.

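3.9 Transposes, Symmetric Matrices, Powers of Matrices

Definition 1. If $A$ is an $m \times n$ matrix, the transpose of $A$, denoted by $A^T$, is the $n \times m$ matrix formed by interchanging the rows and columns of $A$.

Example 1.

(a) If $x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ then $x^T = [\,1 \ \ 2 \ \ 3\,]$.

(b) If $A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 5 \end{bmatrix}$, then $A^T = \begin{bmatrix} 1 & 0 \\ 2 & 1 \\ 3 & 5 \end{bmatrix}$.

(c) If $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ in $R^3$ then the length of $x$, denoted by $|x|$, is defined by

$$|x| = \sqrt{x_1^2 + x_2^2 + x_3^2}.$$

Note that $|x|^2 = x^Tx$.

(d) The equation of a central conic has the form

$$ax^2 + bxy + cy^2 = 1.$$

This can be written

$$[x, y] \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 1.$$

In matrix notation the equation becomes

$$z^TAz = 1$$

where $z = \begin{bmatrix} x \\ y \end{bmatrix}$, $A = \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix}$. Note that $A = A^T$.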

Theorem 1. If $A$ is an $m \times n$ matrix, then
(1) $[A^T]_{ij} = [A]_{ji}$, $i = 1, \ldots, n$, $j = 1, \ldots, m$
(2) $r_i(A^T) = c_i(A)$, $i = 1, \ldots, n$
(3) $c_j(A^T) = r_j(A)$, $j = 1, \ldots, m$
(4) $(A^T)^T = A$
(5) If $B$ is an $m \times n$ matrix, $(A + B)^T = A^T + B^T$.

Proof Items (1), (2), (3) are simply restatements of the definition; the proofs of the other two items are left to the reader.

We now consider how to take the transpose of a product of matrices.

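Theorem 2. If $A$ is $m \times n$, $B$ is $n \times p$ and $x$ is $n \times 1$, then
(1) $(Ax)^T = x^TA^T$
(2) $(AB)^T = B^TA^T$.

Proof Item (1) can be proved by writing out both sides. For the proof of (2) we have

$$\begin{aligned}
(AB)^T &= (A[c_1(B), \ldots, c_p(B)])^T = [Ac_1(B), \ldots, Ac_p(B)]^T \\
&= \begin{bmatrix} (Ac_1(B))^T \\ \vdots \\ (Ac_p(B))^T \end{bmatrix} = \begin{bmatrix} c_1(B)^TA^T \\ \vdots \\ c_p(B)^TA^T \end{bmatrix} = \begin{bmatrix} r_1(B^T)A^T \\ \vdots \\ r_p(B^T)A^T \end{bmatrix} = B^TA^T.
\end{aligned}$$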
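The reverse order rule can be confirmed on a concrete pair of matrices. This is a minimal sketch with made-up matrices; `transpose` and `matmul` are hypothetical helper names.

```python
def transpose(M):
    """Interchange the rows and columns of M."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 0],
     [0, -1, 3]]          # 2 x 3
B = [[2, 1],
     [1, 0],
     [-1, 4]]             # 3 x 2

lhs = transpose(matmul(A, B))               # (AB)^T
rhs = matmul(transpose(B), transpose(A))    # B^T A^T
print(lhs, rhs)
```

Note that the product in the other order, $A^TB^T$, is a $3 \times 3$ matrix here, so it cannot equal the $2 \times 2$ matrix $(AB)^T$.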

Definition 2. A matrix $A$ (necessarily square) is called symmetric if

$$A = A^T \tag{1}$$

and skew-symmetric if

$$A = -A^T. \tag{2}$$

Example 2.

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 6 \end{bmatrix} = A^T \text{ is symmetric,}$$

$$B = \begin{bmatrix} 0 & 1 & 3 \\ -1 & 0 & 2 \\ -3 & -2 & 0 \end{bmatrix} = -B^T \text{ is skew-symmetric.}$$

Example 3. Show that $AA^T$ and $A^TA$ are symmetric. Note that if $A$ is $m \times n$ then $AA^T$ is $m \times m$ and $A^TA$ is $n \times n$. To show that $B = AA^T$ is symmetric we must show $B = B^T$. We have

$$B^T = (AA^T)^T = (A^T)^TA^T = AA^T = B.$$

The symmetry of $A^TA$ is shown in the same way.

Powers of a Square Matrix

Square matrices of the same order can always be multiplied. In particular, we can multiply an $n \times n$ matrix $A$ by itself:

$$A^2 = A \cdot A.$$

If we write $A^m$ for the product of a matrix by itself $m$ times ($m$ a positive integer), the following law holds:

$$A^kA^m = A^{k+m}, \quad k, m \text{ positive integers.} \tag{3}$$

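A diagonal matrix is a matrix where all the elements are zero except along the main diagonal:

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \tag{4}$$

Powers of a diagonal matrix are easy to find:

$$D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} \tag{5}$$

A diagonal matrix with all its diagonal values equal is called a scalar matrix. If $\Lambda$ is a scalar matrix, then

$$\Lambda = \begin{bmatrix} \lambda & 0 & \cdots & 0 \\ 0 & \lambda & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda \end{bmatrix} = \lambda \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \lambda I$$

where $I$ is the identity matrix. Note that $A\Lambda = A\lambda I = \lambda A$ and $\Lambda A = \lambda A$, so that $A\Lambda = \Lambda A$ for every matrix $A$.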

Example 4. Consider the set of three simultaneous linear difference equations of the first order

$$\begin{aligned} x_{n+1} &= a_{11}x_n + a_{12}y_n + a_{13}z_n \\ y_{n+1} &= a_{21}x_n + a_{22}y_n + a_{23}z_n \\ z_{n+1} &= a_{31}x_n + a_{32}y_n + a_{33}z_n \end{aligned} \tag{i}$$

where $n = 0, 1, 2, \ldots$. Let

$$w_n = \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$

Equation (i) can be written in matrix form

$$w_{n+1} = Aw_n, \quad n = 0, 1, 2, \ldots \tag{ii}$$

If $w_0$ is given, then

$$\begin{aligned} w_1 &= Aw_0 \\ w_2 &= Aw_1 = A^2w_0 \\ &\ \ \vdots \\ w_n &= A^nw_0 \end{aligned}$$

so that $w_n = A^nw_0$ is the solution of (ii). Thus it would be useful to have systematic methods of finding the $n$th power of a matrix, $A^n$. We will develop such methods later.
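Until such methods are available, the formula $w_n = A^nw_0$ can at least be checked by brute force. The sketch below iterates $w_{n+1} = Aw_n$ step by step and compares the result with a power computed by repeated multiplication; the matrix, the starting vector and the helper names (`matvec`, `matmul`, `mat_power`, `iterate`) are all illustrative assumptions.

```python
def matvec(A, x):
    """Matrix times column vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_power(A, n):
    """A^n for a square matrix A and an integer n >= 0, by repeated multiplication."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    for _ in range(n):
        result = matmul(result, A)
    return result

def iterate(A, w0, n):
    """Apply w_{k+1} = A w_k a total of n times, starting from w0."""
    w = list(w0)
    for _ in range(n):
        w = matvec(A, w)
    return w

A = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 2]]
w0 = [1, 1, 1]
print(iterate(A, w0, 5), matvec(mat_power(A, 5), w0))
```

Repeated multiplication costs $n$ matrix products, which is why a closed form for $A^n$ is worth having.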

Example 5. Consider a second order difference equation

$$x_{n+2} + ax_{n+1} + bx_n = 0, \quad n = 0, 1, 2, \ldots \tag{i}$$

We can reduce such an equation to a system of two first order equations as follows. Let $x_{n+1} = y_n$; then

$$y_{n+1} = x_{n+2} = -ax_{n+1} - bx_n = -ay_n - bx_n.$$

This can be written as a system of first order equations

$$\begin{aligned} x_{n+1} &= y_n \\ y_{n+1} &= -bx_n - ay_n \end{aligned} \tag{ii}$$

To write (ii) in matrix form let

$$z_n = \begin{bmatrix} x_n \\ y_n \end{bmatrix}, \quad A = \begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix}.$$

Now (ii) is equivalent to

$$z_{n+1} = Az_n.$$

Similarly, a difference equation of the $k$th order may be reduced to a system of $k$ first order equations of the form

$$z_{n+1} = Az_n, \quad n = 0, 1, 2, \ldots$$

where $A$ is a $k \times k$ matrix, and the solution is

$$z_n = A^nz_0.$$
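The reduction can be tried out on a familiar case. With $a = -1$, $b = -1$ the equation reads $x_{n+2} = x_{n+1} + x_n$, the Fibonacci recurrence. The sketch below is illustrative; the function names `companion`, `solve_second_order` and `direct` are hypothetical.

```python
def companion(a, b):
    """Matrix A of the first order system for x_{n+2} + a x_{n+1} + b x_n = 0."""
    return [[0, 1], [-b, -a]]

def solve_second_order(a, b, x0, x1, n):
    """Return x_n by iterating z_{k+1} = A z_k from z_0 = [x_0, x_1]."""
    A = companion(a, b)
    z = [x0, x1]                      # z_0 = [x_0, y_0], where y_0 = x_1
    for _ in range(n):
        z = [A[0][0] * z[0] + A[0][1] * z[1],
             A[1][0] * z[0] + A[1][1] * z[1]]
    return z[0]                       # first component of z_n is x_n

def direct(a, b, x0, x1, n):
    """Return x_n straight from the recurrence, for comparison."""
    xs = [x0, x1]
    while len(xs) <= n:
        xs.append(-a * xs[-1] - b * xs[-2])
    return xs[n]

# Fibonacci numbers: a = -1, b = -1, x_0 = 0, x_1 = 1.
print([solve_second_order(-1, -1, 0, 1, n) for n in range(11)])
```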

Exercises 3.9

1. Verify the reverse order rule (AB)T = BT AT if

A =[

1 2 32 1 4

], B =

1 10 01 1

.

2. If $P$, $A$ are $n \times n$ matrices and $A$ is symmetric, prove that $P^TAP$ is symmetric.

3. If $A$, $B$ are $n \times n$ symmetric matrices, under what condition on $A$ and $B$ is the product $AB$ a symmetric matrix? Give an example to show the product of symmetric matrices is not always symmetric.

4. If $A$ is square, show that $(A + A^T)/2$ is symmetric and $(A - A^T)/2$ is skew-symmetric.

5. If $A$ is skew-symmetric, show that $a_{ii} = 0$.

6. Write the third order difference equation below as a system of first order difference equations in matrix form

$$x_{n+3} + 2x_{n+2} + 3x_n = 0, \quad n = 0, 1, 2, \ldots$$

Hint: Let $x_{n+1} = y_n$, $y_{n+1} = z_n$, write an equation for $z_{n+1}$, then find the matrix $A$ such that $w_{n+1} = Aw_n$ where $w_n = \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix}$.


7. If A =[

1 11 1

], find a formula for An for arbitrary n.

8. A square matrix $P$ is called orthogonal if $PP^T = I$. If $P$, $Q$ are orthogonal matrices, show that the product $PQ$ is also an orthogonal matrix.

9. If $I$, $A$ are $n \times n$ matrices, simplify

$$(I - A)(I + A + A^2 + \cdots + A^k).$$

10. Prove or give a counterexample (all matrices are $n \times n$).

a. $(A + B)(A - B) = A^2 - B^2$

b. $(A + B)^2 = A^2 + 2AB + B^2$

c. $(I - A)(I + A + A^2) = (I + A + A^2)(I - A)$.

11. a. Show that A2 = I if A =

1 2 2

1 2 11 1 0

.

b. Compute A29 and A30.

12. a. Show that U 2 = U if

U =

2 2 4

1 3 41 2 3

b. Compute Un.

3.10 The Inverse of a Square Matrix

Let us look again at the simple scalar equation

$$ax = b \tag{1}$$

If $a \neq 0$ we may multiply both sides by $a^{-1}$ and use the fact that $a^{-1}a = 1$ to get the tentative solution $x = a^{-1}b$. To check that this is indeed the solution we substitute $x = a^{-1}b$ into (1):

$$ax = a(a^{-1}b) = (aa^{-1})b = b,$$

since $aa^{-1} = 1$. Thus the key point in solving (1) is the existence of a number $a^{-1}$ such that

$$a^{-1}a = aa^{-1} = 1. \tag{2}$$

We now extend this idea to square matrices.

Definition 1. Let $A$ be an $n \times n$ matrix. If there exists an $n \times n$ matrix $X$ such that

$$AX = XA = I \tag{3}$$

then $X$ is called the inverse of $A$ and we write $X = A^{-1}$. If $A^{-1}$ exists, $A$ is called nonsingular or invertible. If $A^{-1}$ does not exist, $A$ is called singular.

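Example 1. Find $A^{-1}$, if it exists, for

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}.$$

If we let $X = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the condition $AX = I$ is

$$\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + 2c & b + 2d \\ 2a + 3c & 2b + 3d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

This yields the two systems

$$\begin{aligned} a + 2c &= 1 \\ 2a + 3c &= 0 \end{aligned} \quad \text{and} \quad \begin{aligned} b + 2d &= 0 \\ 2b + 3d &= 1. \end{aligned}$$

The unique solutions are $a = -3$, $c = 2$, $b = 2$, $d = -1$. Thus the only candidate for $A^{-1}$ is

$$X = \begin{bmatrix} -3 & 2 \\ 2 & -1 \end{bmatrix}.$$

We know that this satisfies $AX = I$; we need only check that $XA = I$:

$$\begin{bmatrix} -3 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

Therefore we have a unique inverse

$$A^{-1} = \begin{bmatrix} -3 & 2 \\ 2 & -1 \end{bmatrix}.$$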

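Example 2. Find $A^{-1}$, if it exists, for

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}.$$

Let $B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. The condition $AB = I$ is

$$\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a + 2c & b + 2d \\ 2a + 4c & 2b + 4d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

This yields

$$\begin{aligned} a + 2c &= 1 \\ 2a + 4c &= 0 \end{aligned} \quad \text{and} \quad \begin{aligned} b + 2d &= 0 \\ 2b + 4d &= 1. \end{aligned}$$

Both of these systems are inconsistent (one would be enough). Thus $A^{-1}$ does not exist.

These examples show that a square matrix may or may not have an inverse. However, a square matrix cannot have more than one inverse.

Theorem 1. If $A^{-1}$ exists, it must be unique.

Proof Let $B$, $C$ be two inverses of $A$. We know that $AB = BA = CA = AC = I$. It follows that

$$B = IB = (CA)B = C(AB) = CI = C.$$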

Let us consider the system of $n$ equations in $n$ unknowns

$$Ax = b \tag{4}$$

where $A$ is an $n \times n$ matrix and $b$ is an $n \times 1$ matrix. If $A^{-1}$ exists we may multiply both sides of (4) on the left by $A^{-1}$ to get

$$\begin{aligned} A^{-1}Ax &= A^{-1}b \\ Ix &= A^{-1}b \\ x &= A^{-1}b \end{aligned} \tag{5}$$

where we have used the fact that $A^{-1}A = I$. This shows that if (4) has a solution, the solution must be given by $x = A^{-1}b$. To see that this is indeed the solution, we substitute (5) into (4):

$$Ax = A(A^{-1}b) = (AA^{-1})b = Ib = b,$$

where we have used the fact that $AA^{-1} = I$. Thus we have proved

Theorem 2. If $A^{-1}$ exists then $Ax = b$ has a unique solution for every $b$ and the solution is $x = A^{-1}b$.

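Example 3. Suppose $A^{-1}$ is known and is given by

$$A^{-1} = \begin{bmatrix} -1 & 1 \\ 2 & 5 \end{bmatrix}.$$

(a) Solve $A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ for $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$.

(b) Solve $[y_1, y_2]A = [1, 4]$ for $y = [y_1, y_2]$.

Solution

(a) $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A^{-1} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 17 \end{bmatrix}$.

(b) $y = [y_1, y_2] = [1, 4]A^{-1} = [1, 4] \begin{bmatrix} -1 & 1 \\ 2 & 5 \end{bmatrix} = [7, 21]$.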

A matrix may not have an inverse. The following theorem, which will be proved later, tells us when an inverse exists.

Theorem 3. If $A$ is an $n \times n$ matrix, $A^{-1}$ exists if and only if rank $A = n$.

Our immediate aim is to develop a method for finding inverses. Given an $n \times n$ matrix $A$, the definition requires us to find a matrix $X$ satisfying the two conditions $AX = I$ and $XA = I$. Actually, as indicated in the following theorem, only one of these conditions is needed.

Theorem 4. If $A$ is an $n \times n$ matrix then $A^{-1}$ exists if and only if there is a matrix $X$ satisfying $AX = I$. Furthermore, if $AX = I$ then $X = A^{-1}$.

We will prove this theorem later. For now we concentrate on a method for finding inverses. According to Theorem 4, we need only find a matrix $X$ satisfying

$$AX = I. \tag{6}$$

Let us introduce the notations

$$X = [x_1, x_2, \ldots, x_n], \quad I = [e_1, e_2, \ldots, e_n] \tag{7}$$

where $x_i$ is the $i$th column of $X$ and $e_i$ is the $i$th column of $I$. Equation (6) is now

$$A[x_1, x_2, \ldots, x_n] = [e_1, e_2, \ldots, e_n].$$

Using properties of matrix multiplication we get

$$[Ax_1, Ax_2, \ldots, Ax_n] = [e_1, e_2, \ldots, e_n]. \tag{8}$$

Setting corresponding columns equal we arrive at

$$Ax_1 = e_1, \quad Ax_2 = e_2, \quad \ldots, \quad Ax_n = e_n. \tag{9}$$

Thus $AX = I$ holds if and only if equations (9) hold for the columns of $X$. Since each of the equations in (9) has the same coefficient matrix $A$, we may solve all of the $n$ systems in (9) at once by using the augmented matrix

$$[A \mid e_1, e_2, \ldots, e_n] = [A \mid I]. \tag{10}$$

We know that $A^{-1}$ exists if and only if rank $A = n$, or equivalently, RREF$(A) = I$. Thus if $A^{-1}$ exists the reduced row echelon form of (10) will be

$$[I \mid x_1, x_2, \ldots, x_n] = [I \mid A^{-1}]. \tag{11}$$

If $A^{-1}$ does not exist then rank$(A) < n$, so that at some stage in the reduction of $[A \mid I]$ to row echelon form, a zero row will appear in the $A$ part of the augmented matrix.
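The reduction of $[A \mid I]$ to $[I \mid A^{-1}]$ can be sketched in a few lines. The version below is one possible implementation, not the book's; the function name `inverse` is hypothetical, exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding, and `None` is returned when no pivot can be found, which is the case rank $A < n$.

```python
from fractions import Fraction

def inverse(A):
    """Reduce [A | I] to [I | A^-1]; return None if A is singular."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # rank A < n, no inverse
        M[col], M[pivot] = M[pivot], M[col]   # interchange rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]      # scale the pivot to 1
        for r in range(n):                    # clear the rest of the column
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]             # right half is the inverse

print(inverse([[1, 2], [2, 3]]))   # the matrix of Example 1
print(inverse([[1, 2], [2, 4]]))   # the matrix of Example 2, singular
```

Solving all $n$ systems at once works because the row operations depend only on the coefficient matrix $A$, not on the right-hand sides.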

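Example 4. Find $A^{-1}$ and $B^{-1}$, if possible, where

$$A = \begin{bmatrix} 1 & 4 & 3 \\ 2 & 5 & 4 \\ 1 & -3 & -2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{bmatrix}.$$

$$[A \mid I] = \left[\begin{array}{rrr|rrr} 1 & 4 & 3 & 1 & 0 & 0 \\ 2 & 5 & 4 & 0 & 1 & 0 \\ 1 & -3 & -2 & 0 & 0 & 1 \end{array}\right] \to \left[\begin{array}{rrr|rrr} 1 & 4 & 3 & 1 & 0 & 0 \\ 0 & -3 & -2 & -2 & 1 & 0 \\ 0 & -7 & -5 & -1 & 0 & 1 \end{array}\right]$$

$$\to \left[\begin{array}{rrr|rrr} 1 & 0 & 1/3 & -5/3 & 4/3 & 0 \\ 0 & 1 & 2/3 & 2/3 & -1/3 & 0 \\ 0 & 0 & -1/3 & 11/3 & -7/3 & 1 \end{array}\right] \to \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 2 & -1 & 1 \\ 0 & 1 & 0 & 8 & -5 & 2 \\ 0 & 0 & 1 & -11 & 7 & -3 \end{array}\right].$$

Thus $A^{-1} = \begin{bmatrix} 2 & -1 & 1 \\ 8 & -5 & 2 \\ -11 & 7 & -3 \end{bmatrix}$. The reader may check that $AA^{-1} = A^{-1}A = I$.

$$[B \mid I] = \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 3 & 4 & 0 & 1 & 0 \\ 3 & 4 & 5 & 0 & 0 & 1 \end{array}\right] \to \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -1 & -2 & -2 & 1 & 0 \\ 0 & -2 & -4 & -3 & 0 & 1 \end{array}\right] \to \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -1 & -2 & -2 & 1 & 0 \\ 0 & 0 & 0 & 1 & -2 & 1 \end{array}\right].$$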

Because of the zero row on the left hand side of the augmented matrix, $B^{-1}$ does not exist.

Proof of Theorem 3. Suppose $A^{-1}$ exists; then from $Ax = 0$ we find $A^{-1}Ax = 0$, or $x = 0$, which implies that rank $A = n$. Conversely, if rank $A = n$, then $Ax = b$ has a unique solution for every $b$. In particular, let $x_i$ be the solution of $Ax = e_i$, for $i = 1, \ldots, n$. Let $X = [x_1, \ldots, x_n]$; then it follows that $AX = I$. To show that $XA = I$, let $G = XA - I$; then $AG = AXA - A = 0$. If $g_i$ is the $i$th column of $G$, it follows that $Ag_i = 0$. However, since $A$ has rank $n$, we conclude that $g_i = 0$. Thus $G = 0$ and $XA = I$.

Proof of Theorem 4. If $A^{-1}$ exists then rank $A = n$. Then as in the proof above we can construct a matrix $X$ such that $AX = I$. Conversely, if there exists an $X$ such that $AX = I$, then $Ax_i = e_i$, where $x_i$ is the $i$th column of $X$ and $e_i$ is the $i$th column of $I$. If $b$ is any column vector we have $b = \sum_{i=1}^{n} b_ie_i$. If $x = \sum_{i=1}^{n} b_ix_i$, we see that

$$Ax = A\sum_{i=1}^{n} b_ix_i = \sum_{i=1}^{n} b_iAx_i = \sum_{i=1}^{n} b_ie_i = b.$$

This shows that $Ax = b$ has a solution for every $b$, which implies that rank $A = n$. According to Theorem 3 this means that $A^{-1}$ exists.

It is worthwhile to stop for a moment and summarize the various equivalent conditions for $n$ equations in $n$ unknowns to have a unique solution.

Theorem 5. Let $A$ be an $n \times n$ matrix; then the following statements are all equivalent:
(a) $A^{-1}$ exists
(b) rank $A = n$
(c) RREF$(A) = I$
(d) $Ax = 0$ implies $x = 0$
(e) $Ax = b$ has a unique solution for every $b$.

We now consider several properties of inverses.

Theorem 6. If $A$, $B$ are nonsingular matrices then

$$(A^{-1})^{-1} = A \tag{12}$$

$$(AB)^{-1} = B^{-1}A^{-1} \tag{13}$$

$$(A^T)^{-1} = (A^{-1})^T \tag{14}$$

Proof Property (12) follows from the definition of inverse. To prove (13), let $X = B^{-1}A^{-1}$; then

$$(AB)X = (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I.$$

According to Theorem 4, $X = B^{-1}A^{-1}$ must be the inverse of $AB$. For (14), let $X = (A^{-1})^T$; then

$$A^TX = A^T(A^{-1})^T = (A^{-1}A)^T = I^T = I.$$

Example 5. Simplify $(A^T(B^{-1}A^{-1})^T)^{-1}$.

$$\begin{aligned} (A^T(B^{-1}A^{-1})^T)^{-1} &= (A^T(A^{-1})^T(B^{-1})^T)^{-1} = (A^T(A^T)^{-1}(B^{-1})^T)^{-1} \\ &= ((B^{-1})^T)^{-1} = ((B^T)^{-1})^{-1} = B^T. \end{aligned}$$

Exercises 3.10

For problems 1–6 find the inverses, if they exist, and check.

1.[

1 23 5

]2.

[2 3

4 6

]3.

1 2 34 5 67 8 9

4.

1 1 11 2 21 2 3

5.

1 2 3 42 3 4 53 4 5 66 7 8 9

6.

1 2 0 31 3 1 2

2 4 1 13 3 0 4


7. Suppose A1 =

1 0 11 1 21 3 5

.

a. Solve Ax =

123

for x =

x1x2x3

b. Solve yT A = [1, 1, 2] for y =

y1y2y3

8. Simplify $(B^{-1}(B^TA)^T)^{-1}B^{-1}A^T$.

9. Simplify $(T^{-1}AT)^3$.

10. Simplify $A(A^{-1} + B^{-1})B$.

11. Simplify $(A^T(A^{-1} + B^{-1})^TB^T)^T$.

12. If $A$, $B$, $C$, $X$ are $n \times n$ matrices and $A^{-1}$, $B^{-1}$, $C^{-1}$ exist, solve the following equation for $X$ and check:

$$B^{-1}XCA = AB.$$

13. Find the matrix X such that X = AX + B where

A =

0 1 00 0 00 0 0

, and B =

1 12 00 3

.

14. Determine when A =[

a bc d

]is nonsingular.

15. If A =[

3 22 6

]find the values of for which A I is singular.

16. If A, B are n n matrices and A1 exists, provea. if AB = O then B = O b. if CA = O then C = O c. if AB = AC then B = C.

3.11 Linear Independence and Linear Dependence

Definition 1. The set of vectors $\{v_1, \ldots, v_k\}$, where each $v_i$ is an $n$-dimensional vector, is called linearly dependent (LD) if there exist scalars $\alpha_1, \ldots, \alpha_k$, not all zero, such that

$$\alpha_1v_1 + \alpha_2v_2 + \cdots + \alpha_kv_k = 0 \tag{1}$$

For a set to be LD at least one of the $\alpha_i$ in Equation (1) must be nonzero. If, say, $\alpha_1 \neq 0$, then we may solve for $v_1$ as a linear combination of the others:

$$v_1 = -\frac{\alpha_2}{\alpha_1}v_2 - \cdots - \frac{\alpha_k}{\alpha_1}v_k.$$

In general we can state that a set of vectors is LD if and only if at least one of the vectors is a linear combination of the others.

Definition 2. A set of vectors $\{v_1, \ldots, v_k\}$ is called linearly independent (LI) if it is not linearly dependent. That is, the set is LI if

$$\alpha_1v_1 + \alpha_2v_2 + \cdots + \alpha_kv_k = 0 \quad \text{implies} \quad \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0 \tag{2}$$

From the definition it follows that a set is LI if and only if none of the vectors is a linear combination of the others.

Example 1. Any set containing the zero vector is LD.

Let the set be $\{0, v_2, \ldots, v_k\}$; then we have the obvious equality

$$1 \cdot 0 + 0 \cdot v_2 + \cdots + 0 \cdot v_k = 0.$$

From Definition 1, the set is LD.

We emphasize that linear independence or dependence is a property of a set of vectors, not individual vectors.

The simplest case is if the set contains only one vector, say the set $\{v\}$. If $v = 0$ then, since $1 \cdot 0 = 0$, the set $\{0\}$ is LD. If $v \neq 0$, one can show that $\alpha v = 0$ implies $\alpha = 0$, so that the set $\{v\}$ is LI.

A set of two vectors $\{v_1, v_2\}$ is LD if and only if one of the vectors is a multiple of the other. For vectors with real components this can be determined by inspection.

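Example 2. The set $\{v_1, v_2\}$ where

$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix}$$

is LD since $v_2 = 2v_1$.

Example 3. The set $\{v_1, v_2\}$ where

$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 2 \\ 4 \\ 6 \\ 7 \end{bmatrix}$$

is LI since neither vector is a multiple of the other.

Example 4. The set $\{v_1, v_2\}$, where

$$v_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$

is LD since it contains the zero vector. Note that $v_1$ is a multiple of $v_2$, namely $v_1 = 0 \cdot v_2$, but $v_2$ is not a multiple of $v_1$.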

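If a set contains three or more vectors, one must usually resort to the definition to determine whether the set is LI or LD.

Example 5. Is the set $\{v_1, v_2, v_3\}$ LI or LD, where

$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \\ 3 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 3 \\ 2 \\ 4 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 7 \\ 6 \\ 8 \\ 5 \end{bmatrix}?$$

Consider

$$\alpha_1v_1 + \alpha_2v_2 + \alpha_3v_3 = 0. \tag{i}$$

If this vector equation has a nontrivial solution (the unknowns are $\alpha_1, \alpha_2, \alpha_3$) the set is LD; if it has only the trivial solution the set is LI. Equation (i) can be written

$$[v_1, v_2, v_3] \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \text{or} \quad \begin{bmatrix} 1 & 3 & 7 \\ 2 & 2 & 6 \\ 0 & 4 & 8 \\ 3 & 1 & 5 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \tag{ii}$$

All we need to know is whether or not a nontrivial solution exists. This is determined by the rank of the coefficient matrix in (ii). If the rank = number of columns = 3, then only the trivial solution exists and the set of vectors is LI. If the rank is less than the number of columns, a nontrivial solution exists and the set of vectors is LD. We start to reduce the coefficient matrix in (ii) to echelon form:

$$\begin{bmatrix} 1 & 3 & 7 \\ 2 & 2 & 6 \\ 0 & 4 & 8 \\ 3 & 1 & 5 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 7 \\ 0 & -4 & -8 \\ 0 & 4 & 8 \\ 0 & -8 & -16 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 7 \\ 0 & -4 & -8 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

At this point it is clear that the rank = 2 < 3, so the set of vectors is LD.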
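The rank test used in this example can be automated. The following is a minimal sketch rather than a prescribed algorithm: `rank` and `is_linearly_independent` are hypothetical names, and exact fractions are used so that a zero row is recognized without rounding error.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after reduction to row echelon form."""
    M = [[Fraction(v) for v in row] for row in rows]
    ncols = len(M[0]) if M else 0
    r = 0                                     # index of the next pivot row
    for col in range(ncols):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue                          # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):        # eliminate below the pivot
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def is_linearly_independent(vectors):
    """The vectors are LI exactly when rank [v1, ..., vk] equals k."""
    A = [list(col) for col in zip(*vectors)]  # the vectors become columns
    return rank(A) == len(vectors)

v1, v2, v3 = [1, 2, 0, 3], [3, 2, 4, 1], [7, 6, 8, 5]
print(is_linearly_independent([v1, v2, v3]))  # the three vectors of Example 5
print(is_linearly_independent([v1, v2]))
```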

By the same reasoning as in the above example one can prove the following.

Theorem 1. Let $\{v_1, \ldots, v_k\}$ be a set of $n$-dimensional column vectors and let $A$ be the $n \times k$ matrix whose columns are the $v_i$:

$$A = [v_1, \ldots, v_k].$$

If rank $A < k$ the set is LD. If rank $A = k$ the set is LI.

Example 6. Determine whether $\{v_1, v_2, v_3\}$ is LD or LI where

$$v_1 = [1, 0, 0]^T, \quad v_2 = [2, 2, 0]^T, \quad v_3 = [3, 4, 3]^T.$$

Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 4 \\ 0 & 0 & 3 \end{bmatrix}$. It is clear that rank $A = 3$, so the set is LI.

Example 7. Assume the set $\{u, v\}$ is LI where $u$, $v$ are $n$-dimensional vectors. Determine whether the set $\{u + v, u - v\}$ is LI or LD.

Since we are not given specific vectors, we cannot use Theorem 1; we must go back to the definition. Consider the vector equation

$$\alpha(u + v) + \beta(u - v) = 0. \tag{i}$$

If we can show that $\alpha$, $\beta$ must be zero, the set $\{u + v, u - v\}$ is LI; otherwise it is LD. Now (i) can be written

$$(\alpha + \beta)u + (\alpha - \beta)v = 0.$$

Since we know that $\{u, v\}$ is LI, each coefficient must be zero. Thus $\alpha + \beta = 0$ and $\alpha - \beta = 0$. These equations are easily solved, yielding $\alpha = \beta = 0$, so the set is LI.

Figure 1. Three panels: $\{u, v\}$ LD, $\{u, v\}$ LI, $\{u, v, w\}$ LD.

It is instructive to consider the geometric meaning of independence for vectors in the plane, $R^2$. A set of two vectors $\{u, v\}$ in $R^2$ is LD if and only if one vector is a multiple of the other. This means the vectors must be collinear, as shown in Figure 1. The set $\{u, v\}$ is LI if the vectors are not collinear. However, if we consider a set of three vectors $\{u, v, w\}$, the set must be LD. This is shown in Figure 1, where $w$ can be written as a linear combination of $u$, $v$ using the parallelogram law.

Let us prove that any set of vectors in $R^2$ containing three or more vectors must be LD. Let the vectors be $\{v_1, \ldots, v_m\}$ where $m > 2$ and the $v_i$ are $2 \times 1$ vectors. Consider the $2 \times m$ matrix

$$A = [v_1, \ldots, v_m];$$

clearly rank $A \leq 2 < m$ = number of columns. Thus by Theorem 1, the set is LD.

Similarly one can show that a set of three-dimensional vectors in $R^3$, $\{v_1, v_2, v_3\}$, is LI if and only if the vectors are not coplanar. Further, any set of vectors in $R^3$ containing four or more vectors must be LD.

Let us consider vectors in $C^n$. It is easy to construct sets containing $n$ vectors that are LI. One example is the set $\{e_1, \ldots, e_n\}$ where $e_i$ is the $i$th column of the identity matrix of order $n$. Since $I = [e_1, \ldots, e_n]$ has rank $n$, the set is LI. Another example is the set $\{v_1, \ldots, v_n\}$ where

$$v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad v_n = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$

It is clear that the matrix $A = [v_1, \ldots, v_n]$ has rank $n$, so the set is LI. In fact the columns of any $n \times n$ matrix of rank $n$ form a LI set. For sets containing more than $n$ vectors we have

Theorem 2. Any set of vectors in $C^n$ containing more than $n$ vectors must be LD.

Proof Let the set be $\{v_1, \ldots, v_m\}$ where $m > n$ and each $v_i$ is an $n \times 1$ column matrix. Consider the matrix

$$A_{n \times m} = [v_1, \ldots, v_m];$$

clearly rank $A \leq n < m$ = the number of columns; thus by Theorem 1, the set is LD.

Definition 3. Any set of $n$ vectors in $C^n$ which is LI is called a basis for $C^n$.

There are of course infinitely many different bases. The choice of the basis depends on the problem at hand. A basis is like a reference frame that we can use to describe all vectors.

Theorem 3. If $\{v_1, \ldots, v_n\}$ is a basis for $C^n$ and $x$ is any vector in $C^n$ then there exist unique scalars $\alpha_i$ such that

$$x = \alpha_1v_1 + \alpha_2v_2 + \cdots + \alpha_nv_n. \tag{3}$$

The scalar $\alpha_i$ is called the $i$th coordinate of $x$ relative to the given basis.

Proof Assume the $v_i$ are column vectors; then (3) is equivalent to

$$[v_1, \ldots, v_n] \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} = x. \tag{4}$$

This is a system of $n$ equations for the $n$ unknowns $\alpha_1, \ldots, \alpha_n$ ($x$ is given). Since the set $\{v_1, \ldots, v_n\}$ is LI, the $n \times n$ coefficient matrix $A = [v_1, \ldots, v_n]$ has rank $n$. Therefore, there exists a unique solution for the $\alpha_i$ for every $x$.

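Example 8. Given

$$x = \begin{bmatrix} 22 \\ -14 \\ 0 \end{bmatrix}, \quad v_1 = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix},$$

show that $\{v_1, v_2, v_3\}$ is a basis for $R^3$ and find the coordinates of $x$ relative to this basis.

$$\alpha_1v_1 + \alpha_2v_2 + \alpha_3v_3 = x \tag{$*$}$$

or

$$\begin{bmatrix} 2 & 3 & 1 \\ 3 & -2 & -2 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix} = \begin{bmatrix} 22 \\ -14 \\ 0 \end{bmatrix}.$$

We now reduce the augmented matrix to echelon form.

$$\left[\begin{array}{rrr|r} 2 & 3 & 1 & 22 \\ 3 & -2 & -2 & -14 \\ 1 & 1 & -1 & 0 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 1 & -1 & 0 \\ 3 & -2 & -2 & -14 \\ 2 & 3 & 1 & 22 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 1 & -1 & 0 \\ 0 & -5 & 1 & -14 \\ 0 & 1 & 3 & 22 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 1 & -1 & 0 \\ 0 & 1 & 3 & 22 \\ 0 & -5 & 1 & -14 \end{array}\right]$$

$$\to \left[\begin{array}{rrr|r} 1 & 0 & -4 & -22 \\ 0 & 1 & 3 & 22 \\ 0 & 0 & 16 & 96 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 0 & -4 & -22 \\ 0 & 1 & 3 & 22 \\ 0 & 0 & 1 & 6 \end{array}\right] \to \left[\begin{array}{rrr|r} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & 6 \end{array}\right].$$

Since the rank of the coefficient matrix is 3, the set $\{v_1, v_2, v_3\}$ is LI and forms a basis. The coordinates of $x$ relative to this basis are $\alpha_1 = 2$, $\alpha_2 = 4$, $\alpha_3 = 6$. With these values, it can be checked that ($*$) holds.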
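Finding coordinates relative to a basis amounts to solving the system $[v_1, \ldots, v_n]\alpha = x$. The sketch below shows one way to do this with exact fractions; the function name `coordinates` is hypothetical, and the basis used in the demonstration is the simple triangular one $[1,0,0]^T$, $[1,1,0]^T$, $[1,1,1]^T$ rather than the vectors of the example.

```python
from fractions import Fraction

def coordinates(basis, x):
    """Solve [v1, ..., vn] alpha = x for alpha by reducing the augmented matrix."""
    n = len(x)
    # Augmented matrix [v1, ..., vn | x]; the basis vectors are the columns.
    M = [[Fraction(v[i]) for v in basis] + [Fraction(x[i])] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("rank < n: the vectors do not form a basis")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # scale the pivot to 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]                  # last column holds alpha

basis = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
alpha = coordinates(basis, [6, 5, 3])
print(alpha)
```

If the given vectors are LD, no pivot is found in some column and the function raises an error instead of returning coordinates.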

Rank and Linear Independence

Recall that the rank of a matrix $A$ is defined to be the number of nonzero rows in the row echelon form of $A$. However, the rank also tells us something about the rows of $A$ and the columns of $A$, as indicated in the following theorem.

Theorem 4. Suppose $A$ is an $m \times n$ matrix of rank $r$; then
(1) the maximal number of rows in any LI set of rows is $r$
(2) the maximal number of columns in any LI set of columns is $r$.

We shall not prove this theorem but will illustrate it with an example.

Example 9. Consider the matrix $A$ given by

$$A = \begin{bmatrix} 1 & 1 & 2 & 3 & 4 & 3 \\ 1 & 1 & -1 & -1 & 1 & -2 \\ 2 & 2 & 1 & 2 & 5 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 4 & 4 & 1 & 2 & 5 & 1 \end{bmatrix}.$$

By finding the row echelon form of $A$, one finds that $r$ = rank$(A) = 3$. According to the theorem, three of the rows are independent. One can show that $\{r_1(A), r_2(A), r_4(A)\}$ is an LI set of rows. The other two rows can be expressed as linear combinations of these three rows:

$$\begin{aligned} r_3(A) &= r_1(A) + r_2(A) \\ r_5(A) &= r_1(A) + r_2(A) + 2r_4(A) \end{aligned}$$

As far as columns are concerned, it can be verified that the set $\{c_1(A), c_3(A), c_4(A)\}$ is an LI set and the other columns depend on these:

$$\begin{aligned} c_2(A) &= c_1(A) \\ c_5(A) &= -7c_3(A) + 6c_4(A) \\ c_6(A) &= 3c_3(A) - c_4(A). \end{aligned}$$

There are systematic methods for finding which sets of rows or columns form LI sets and how toexpress the remaining rows or columns in terms of the LI sets. However, we will not consider thesematters.

We note two other theorems which follow immediately from Theorem 4.

Theorem 5. If A is any matrix then

rank (A) = rank (AT ).

Theorem 6. If A is an n n (square) matrix, then(1) If the set of columns is LI so is the set of rows.(2) If the set of columns is LD so is the set of rows.(3) A1 exists if and only if the set of columns (or rows) is LI.(4) Ax = 0 has a nontrivial solution if and only if the columns (or rows) are LD.

Example 10. Determine if Ax = 0 has a nontrivial solution where

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{bmatrix}.
\]

If one recognizes that $r_2(A) = (r_1(A) + r_3(A))/2$, it is clear that the rows are LD. Thus rank A < 3 and a nontrivial solution exists.


Rank and Independent Solutions of Ax = 0

Let us look at an example of 3 homogeneous equations in 4 unknowns, Ax = 0, where

A =

1 1 1 11 1 1 12 2 0 2

.

To solve this system we find the reduced row echelon form of A

RREF (A) =

1 1 0 10 0 1 00 0 0 0

.

Since the rank is 2, there are 2 basic variables; these are $x_1$ and $x_3$. There must be $n - r = 4 - 2 = 2$ free variables, which are $x_2$ and $x_4$. The general solution is

x =

x2 x4x20x4

= x2

1100

+ x4

1001

. ()

Consider the set of two vectors on the right above, namely

1100

,

1001

.

This set is an LI set since, according to the general solution above, x = 0 if and only if $x_2 = x_4 = 0$. This is typical of the general situation.

Theorem 7. Let A be an $m \times n$ matrix of rank r. If the general solution of Ax = 0 (obtained from the echelon form of A) is written in the form

\[
x = t_1v_1 + \dots + t_{n-r}v_{n-r},
\]

where the $t_i$ are the free variables, then the set of $n - r$ solutions

\[
\{v_1, \dots, v_{n-r}\}
\]

is an LI set.

We omit a formal proof of this theorem.
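The construction in Theorem 7 (one solution vector per free variable, read off from the reduced row echelon form) can be sketched in code. This is an illustrative sketch only; the function names `rref` and `null_space_basis` are hypothetical, and the matrix used is the one from Example 10.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue  # free column
        m[r], m[p] = m[p], m[r]
        lead = m[r][col]
        m[r] = [a / lead for a in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m, pivots

def null_space_basis(rows):
    """One solution of Ax = 0 per free variable, as in Theorem 7."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)            # set this free variable to 1
        for row, pc in zip(m, pivots):   # solve for each basic variable
            v[pc] = -row[free]
        basis.append(v)
    return basis

A = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]    # matrix of Example 10
basis = null_space_basis(A)
print([[int(x) for x in v] for v in basis])
```

Here the rank is 2 and n = 3, so the basis has n - r = 1 vector.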

Exercises 3.11

1. Determine by inspection whether the following sets are LI or LD.

a.{[

12

],

[12

]}b.

1234

,

246

10

c. {[1, 0, 0, ], [0, 1, 0]}

d.

0000

,

1000

,

0100

e.

12

12

,

24

24

,

157

10


2. Determine whether or not the following sets are LI or LD.

a.

123

,

456

,

789

b.

11

11

,

1111

,

11

11

c.

13

2

,

015

,

271

d.

11

01

,

10

11

,

121

3

3. Given that the set of n-dimensional vectors {u, v, w} is LI, determine whether the following sets are LI or LD:

a. $\{u - v,\ u + w,\ v - w\}$

b. $\{u + 2v + 3w,\ 2u + 3v + 4w,\ 3u + 4v + 5w\}$

c. $\{2u + 3v - w,\ 3u - 2v - 2w,\ u + v + w\}$

4. Let $x = [2, 3, 5]^T$. Find the coordinates of x relative to the following bases.

a.{e1, e2, e3

}where e1 = [1, 0, 0]T , e2 = [0, 1, 0]T , e3 = [0, 0, 1]T

b.{[1, 2, 3]T , [2, 1, 3]T , [3, 0, 0]T

}

5. Complete the following statements where A is an $n \times n$ matrix.

a. A is singular if and only if the rows are

b. A is singular if and only if the columns are

c. If the columns of A are LD then Ax = 0 has

d. If the rows of A are LD then rank A is

e. If the rows of A are LD then Ax = b may have a solution for a particular b but the solution is

f. If columns of A are LD then RREF (A)

6. Find a set of $n - r$ LI solutions for Ax = 0 in each of the following cases

a. A =[

1 1 11 1 1

]b. A =

0 0 00 0 00 0 0

c. A =

0 1 00 0 10 0 0

d. A =

1 2 33 4 52 3 4

3.12 Determinants

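Consider the system

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2
\end{aligned}
\tag{1}
\]

Solving by elimination or otherwise, we find that

\[
x_1 = \frac{b_1a_{22} - b_2a_{12}}{a_{11}a_{22} - a_{12}a_{21}}, \qquad
x_2 = \frac{b_2a_{11} - b_1a_{21}}{a_{11}a_{22} - a_{12}a_{21}}, \qquad
\text{if } a_{11}a_{22} - a_{12}a_{21} \neq 0.
\tag{2}
\]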


Both numerators and denominators have the same form, namely, the difference of two products. We can write (2) in a compact form if we define the determinant of any $2 \times 2$ matrix A, denoted by det A or |A|, to be the number given by

\[
|A| = \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}. \tag{3}
\]

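This allows us to write (2) as

\[
x_1 = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{|A|}, \qquad
x_2 = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{|A|}. \tag{4}
\]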

Let $B_i$ be the matrix formed by replacing the ith column of A by $\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$; then (4) can be written

\[
x_1 = \frac{|B_1|}{|A|}, \qquad x_2 = \frac{|B_2|}{|A|}, \qquad \text{if } |A| \neq 0. \tag{5}
\]

This is commonly called Cramer's rule.

Consider three equations in 3 unknowns, Ax = b, or

\[
\sum_{j=1}^{3} a_{ij}x_j = b_i, \qquad i = 1, 2, 3. \tag{6}
\]

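If we solve for $x_1, x_2, x_3$, we find that the denominator in each case is

\[
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - (a_{13}a_{22}a_{31} + a_{12}a_{21}a_{33} + a_{11}a_{23}a_{32}). \tag{7}
\]

One way to remember this result is to write the first two columns to the right of the determinant, and then take the products along the diagonals sloping to the right as positive and the products along the diagonals sloping to the left as negative. This scheme is illustrated in the following figure.

[Figure: the 3 by 3 array of entries with its first two columns repeated on the right; the three diagonals sloping to the right are marked +, the three sloping to the left are marked -.]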

Using this definition of a three by three determinant, one may write the solutions of Equation (6), if $|A| \neq 0$, as

\[
x_i = \frac{|B_i|}{|A|}, \qquad i = 1, 2, 3, \tag{8}
\]

where $B_i$ is the matrix formed by replacing the ith column of A by b. This is Cramer's rule for 3 equations. If $|A| \neq 0$, a unique solution is given by the above formula.

We shall define the determinant of an $n \times n$ matrix A (or an nth order determinant) inductively; that is, we shall define the determinant of an $n \times n$ matrix in terms of determinants of matrices of order $(n-1)$ by $(n-1)$.

Definition 1. Let A be an $n \times n$ matrix. Then
(1) The minor, $M_{ij}$, of the (i, j)th element is the determinant of the $(n-1) \times (n-1)$ matrix formed by striking out the ith row and jth column of A.
(2) The cofactor, $A_{ij}$, of the (i, j)th element is

\[
A_{ij} = (-1)^{i+j}M_{ij}.
\]

(3) The determinant of A, |A|, is the number defined by

\[
|A| = a_{11}A_{11} + a_{12}A_{12} + \dots + a_{1n}A_{1n}. \tag{9}
\]

The definitions of determinants of orders 2 and 3 given earlier are consistent with this definition.

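Example 1.

\[
\begin{vmatrix} 1 & -2 & -3 \\ 2 & 3 & -1 \\ 3 & -1 & 4 \end{vmatrix}
= 1(-1)^2 \begin{vmatrix} 3 & -1 \\ -1 & 4 \end{vmatrix}
+ (-2)(-1)^3 \begin{vmatrix} 2 & -1 \\ 3 & 4 \end{vmatrix}
+ (-3)(-1)^4 \begin{vmatrix} 2 & 3 \\ 3 & -1 \end{vmatrix}
\]
\[
= 1(12 - (+1)) + 2(8 - (-3)) - 3(-2 - 9) = 66.
\]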
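Definition 1 is recursive, so it translates directly into a short recursive function. The sketch below is only an illustration (the name `det` is a choice made here); it expands along the first row exactly as in equation (9) and is practical only for small matrices, since the work grows like n!.

```python
def det(a):
    """Determinant by cofactor expansion along the first row (Definition 1)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        # Minor M_1j: strike out the first row and column j (0-based here).
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        # Cofactor sign (-1)^(1 + (j+1)) equals (-1)^j for 0-based j.
        total += (-1) ** j * a[0][j] * det(minor)
    return total

# Matrix of Example 1, with the signs as assumed in this sketch.
example_1 = [[1, -2, -3],
             [2, 3, -1],
             [3, -1, 4]]
print(det(example_1))
```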

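Example 2. Expanding by the first column (Theorem 1 below shows this is permitted),

\[
\begin{vmatrix} 1 & 2 & 1 & 2 \\ 2 & 3 & 1 & 4 \\ 5 & 6 & 7 & 9 \\ -1 & 0 & 0 & 2 \end{vmatrix}
= 1 \begin{vmatrix} 3 & 1 & 4 \\ 6 & 7 & 9 \\ 0 & 0 & 2 \end{vmatrix}
- 2 \begin{vmatrix} 2 & 1 & 2 \\ 6 & 7 & 9 \\ 0 & 0 & 2 \end{vmatrix}
+ 5 \begin{vmatrix} 2 & 1 & 2 \\ 3 & 1 & 4 \\ 0 & 0 & 2 \end{vmatrix}
- (-1) \begin{vmatrix} 2 & 1 & 2 \\ 3 & 1 & 4 \\ 6 & 7 & 9 \end{vmatrix}.
\]

Each of the determinants of order 3 may be evaluated as in Example 1 or by Formula (7).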

In Definition 1, a determinant is defined as the sum of products of elements of the first row by their corresponding cofactors. For determinants of order 2 or 3, one may verify that one can use the elements of any row or any column instead. This is true in general, as indicated in the following theorem. The proof is rather tedious and will be omitted.

Theorem 1. If $A = [a_{ij}]$ is an $n \times n$ matrix then
(a) |A| can be expanded in terms of the elements of the ith row,

\[
|A| = \sum_{j=1}^{n} a_{ij}A_{ij}, \qquad i = 1, \dots, n;
\]

(b) |A| can be expanded in terms of the elements of the jth column,

\[
|A| = \sum_{i=1}^{n} a_{ij}A_{ij}, \qquad j = 1, 2, \dots, n.
\]

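Example 3. We evaluate the determinant of Example 1 by expanding in terms of its 2nd column:

\[
\begin{vmatrix} 1 & -2 & -3 \\ 2 & 3 & -1 \\ 3 & -1 & 4 \end{vmatrix}
= -(-2) \begin{vmatrix} 2 & -1 \\ 3 & 4 \end{vmatrix}
+ 3 \begin{vmatrix} 1 & -3 \\ 3 & 4 \end{vmatrix}
- (-1) \begin{vmatrix} 1 & -3 \\ 2 & -1 \end{vmatrix}
= 22 + 39 + 5 = 66.
\]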

We now establish some basic properties of determinants.

Property 1. If A is an $n \times n$ matrix then $|A| = |A^T|$.

Proof This follows immediately from Theorem 1.

Property 2. If any row or any column is the zero vector then the value of the determinant is zero.

Proof Expand in terms of the zero row or column.

Property 3. If a determinant has two equal rows (or columns) its value is zero.

Proof This is established inductively. For n = 2 we have (using equal rows)

\[
\begin{vmatrix} a & b \\ a & b \end{vmatrix} = ab - ab = 0.
\]

Assume the result holds for a determinant of order $n - 1$. Now expand the determinant (having two equal rows) in terms of a row other than the two equal rows; each cofactor will be zero by the inductive hypothesis. Thus the determinant equals zero.

We now establish the effect of elementary row (or column) operations on the value of the determinant.

Property 4. If any row (or column) is multiplied by c, the value of the determinant is multiplied by c.

Proof Expand in terms of the relevant row (or column).

Corollary $|cA| = c^n|A|$.

Property 5. If a multiple of one row (or column) is added to another row (or column) the value of the determinant is unchanged.

Proof Let $A = [c_1, \dots, c_n]$ where $c_i$ is the ith column of A. Assume that we perform the operation $kc_i + c_j \to c_j$; then by expansion in terms of the jth column we find

\[
\det[c_1, \dots, c_i, \dots, kc_i + c_j, \dots, c_n] = \det[c_1, \dots, c_i, \dots, kc_i, \dots, c_n] + \det[c_1, \dots, c_i, \dots, c_j, \dots, c_n].
\]

However,

\[
\det[c_1, \dots, c_i, \dots, kc_i, \dots, c_n] = k\det[c_1, \dots, c_i, \dots, c_i, \dots, c_n] = 0,
\]

since two columns are equal. Thus

\[
\det[c_1, \dots, c_i, \dots, kc_i + c_j, \dots, c_n] = \det[c_1, \dots, c_i, \dots, c_j, \dots, c_n] = \det A.
\]

Property 6. If two rows (or columns) of a determinant are interchanged, the value of the determinant is multiplied by $(-1)$.

Proof For simplicity assume we interchange the first two columns. Then we have

\[
\det[c_2, c_1, \dots] = \det[c_2 - c_1, c_1, \dots] = \det[c_2 - c_1, c_2, \dots] = \det[-c_1, c_2, \dots] = -\det[c_1, c_2, \dots],
\]

where we have used Property 5 three times and Property 4 at the last step.

Properties 4, 5, 6 are helpful in shortening the work of evaluating a determinant.

Example 4. Evaluate

|A| =

2 3 2 51 1 1 23 2 2 11 1 3 1

perform the elementary row operations: 2r2 + r1 r1,3r2 + r3 r3 and r2 + r4 r4 to get

|A| =

0 1 0 11 1 1 20 5 1 50 2 4 3

.

Now expand by the first column to get

|A| = 1

1 0 15 1 52 4 3

.

Perform c3 + c1 c1 to get

|A| =

0 0 10 1 5

1 4 3

=

0 1

1 4

= 1.

where we expanded the 3rd order determinant by its first row.
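The strategy of Example 4 (use Properties 5 and 6 to create zeros, then expand) can be automated. The sketch below is one possible arrangement, not the procedure of these notes: it reduces the matrix to upper triangular form, tracks sign changes from row interchanges, and uses the fact that repeated expansion by the first column makes the determinant of a triangular matrix equal to the product of its diagonal entries. The name `det_by_elimination` is hypothetical.

```python
from fractions import Fraction

def det_by_elimination(a):
    """Determinant via row operations (Properties 5 and 6)."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    sign = 1
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # column of zeros below: det = 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                # Property 6: interchange flips the sign
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            # Property 5: adding a multiple of a row leaves det unchanged.
            m[i] = [x - f * y for x, y in zip(m[i], m[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]               # product of the diagonal entries
    return result

print(det_by_elimination([[1, -2, -3], [2, 3, -1], [3, -1, 4]]))
print(det_by_elimination([[1, 2, 3, 4], [2, 3, 4, 5],
                          [3, 4, 5, 6], [4, 5, 6, 7]]))
```

The first matrix is the one assumed for Example 1; the second has linearly dependent rows.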

Theorem 2. If A is an $n \times n$ matrix then det A = 0 if and only if the rows (or columns) of A are LD.

Proof Assume the rows are LD and say $r_1 = \sum_{i=2}^{n} k_ir_i$. Perform the row operation $-\sum_{i=2}^{n} k_ir_i + r_1 \to r_1$; this will produce a zero row, so det A = 0 by Properties 5 and 2.

Conversely, assume det A = 0. One can show that in the reduction of A to reduced row echelon form RREF(A), we must have

\[
\det A = k \det \mathrm{RREF}(A), \qquad \text{where } k \neq 0.
\]

Thus det A = 0 implies det RREF(A) = 0. This means RREF(A) has a zero row. Thus rank A < n and the rows are LD.

We are now able to state a condition for the existence of nontrivial solutions to Ax = 0 that will be useful in the remainder of these notes.

Theorem 3. If A is an $n \times n$ matrix then Ax = 0 has a nontrivial solution if and only if det A = 0.

Proof This follows directly from Theorem 2.

Example 5. $x_1 - x_2 = 0$, $x_1 + x_2 = 0$.

Clearly this system has only the trivial solution, and $\det \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} = 2 \neq 0$.

Example 6. $x_1 - 2x_2 = 0$, $2x_1 - 4x_2 = 0$.

Since the second equation is twice the first, only the first equation need be considered. The solution is $x_1 = 2x_2$ with $x_2$ free. Here we have $\det \begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix} = 0$.

Theorem 3 may be rephrased to provide another condition for the existence of an inverse.

Theorem 4. If A is an $n \times n$ matrix then $A^{-1}$ exists if and only if $\det A \neq 0$.

Proof If det A = 0, the rows (and columns) are LD and $A^{-1}$ does not exist. If $\det A \neq 0$, the rows (and columns) of A are LI, rank A = n, and $A^{-1}$ exists.

To complete our discussion of determinants we want to show that Cramer's Rule holds in general and also develop an explicit formula for the inverse of a matrix. We need one preparatory theorem.

Theorem 5. If $A = [a_{ij}]$ is an $n \times n$ matrix and $A_{ij}$ is the cofactor of the (i, j)th element then

\[
\text{a.} \quad \sum_{j=1}^{n} a_{ij}A_{kj} = \begin{cases} \det A, & \text{if } i = k \\ 0, & \text{if } i \neq k \end{cases}
\]

\[
\text{b.} \quad \sum_{i=1}^{n} a_{ik}A_{ij} = \begin{cases} \det A, & \text{if } k = j \\ 0, & \text{if } k \neq j. \end{cases}
\]

Proof Let us prove (a). Note that when i = k this is the expansion in terms of the ith row given in Theorem 1. Now let B denote the matrix obtained from A by making the kth row equal to the ith row of A, leaving all other rows unchanged. Since B has two equal rows, |B| = 0. Moreover, the cofactors of the kth row of B are the same as the cofactors of the kth row of A. Expanding |B| by its kth row yields equation (a), where $i \neq k$.

Theorem 6. (Cramer's Rule). If A is an $n \times n$ matrix, the equations Ax = b have a unique solution for every b if and only if $\det A \neq 0$. If $\det A \neq 0$ the solutions are

\[
x_i = \frac{|B_i|}{|A|},
\]

where $B_i$ is the matrix obtained from A by replacing the ith column of A by b.

Proof Writing out Ax = b in full we have

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= b_n
\end{aligned}
\]

Multiply the first equation by $A_{11}$, the 2nd equation by $A_{21}$, etc., and add to obtain

\[
(A_{11}a_{11} + A_{21}a_{21} + \dots + A_{n1}a_{n1})x_1 = \sum_{k=1}^{n} A_{k1}b_k,
\]

where, because of Theorem 5, the coefficients of $x_2, \dots, x_n$ are zero. Thus we have

\[
|A|\,x_1 = \sum_{k=1}^{n} A_{k1}b_k = |B_1|
\]

and

\[
x_1 = \frac{|B_1|}{|A|}.
\]

In a similar manner we can show that

\[
x_i = \frac{\sum_{k=1}^{n} A_{ki}b_k}{|A|} = \frac{|B_i|}{|A|}.
\]
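Cramer's Rule as stated in Theorem 6 can be written out in a few lines. This is a sketch under the assumption of integer entries; the names `det` and `cramer` are hypothetical, and the determinant is computed by first-row cofactor expansion, which is adequate only for small systems.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, b):
    """Solve Ax = b by x_i = |B_i| / |A| (integer entries assumed)."""
    d = det(a)
    if d == 0:
        raise ValueError("det A = 0: no unique solution")
    xs = []
    for i in range(len(a)):
        # B_i: replace column i of A by the right-hand side b.
        b_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        xs.append(Fraction(det(b_i), d))
    return xs

print(cramer([[2, 1], [3, 1]], [1, 2]))
print(cramer([[1, 1, 1], [-1, -2, -3], [1, 4, 9]], [1, 0, 0]))
```

For larger systems elimination is far cheaper; Cramer's Rule is mainly of theoretical interest.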

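Theorem 7. Let A be an $n \times n$ matrix and define the matrix of cofactors, cof A, by

\[
(\operatorname{cof} A)_{ij} = A_{ij};
\]

then, if $|A| \neq 0$, we have

\[
A^{-1} = \frac{(\operatorname{cof} A)^T}{|A|}.
\]

Proof In the proof of Theorem 6 we found

\[
x_i = \frac{\sum_{k=1}^{n} A_{ki}b_k}{|A|}, \qquad i = 1, \dots, n.
\]

Writing this out in matrix form we have

\[
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \frac{1}{|A|}
\begin{bmatrix} A_{11} & A_{21} & \dots & A_{n1} \\ A_{12} & A_{22} & \dots & A_{n2} \\ \vdots & & & \vdots \\ A_{1n} & A_{2n} & \dots & A_{nn} \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}.
\]

Since $x = A^{-1}b$ for every b, the coefficient of the vector b on the right must be $A^{-1}$:

\[
A^{-1} = \frac{(\operatorname{cof} A)^T}{|A|}.
\]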

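Example 7. Let

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad \text{then} \quad
\operatorname{cof} A = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}, \quad
(\operatorname{cof} A)^T = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]

If $|A| = ad - bc \neq 0$ we have

\[
A^{-1} = \frac{(\operatorname{cof} A)^T}{|A|} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]

This is a useful formula for the inverse of any $2 \times 2$ matrix (that has an inverse).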
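The formula of Theorem 7 works for any order, not just 2 by 2. A minimal sketch follows, assuming integer entries; `det`, `minor`, `inverse` and `matmul` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def minor(a, i, j):
    """Matrix left after striking out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def inverse(a):
    """A^{-1} = (cof A)^T / |A|, for an integer matrix of order n >= 2."""
    n, d = len(a), det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    # Entry (i, j) of the inverse is cofactor (j, i) divided by |A|.
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

def matmul(x, y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

A = [[2, 1], [3, 1]]
print(inverse(A))
print(matmul(A, inverse(A)))   # should reproduce the identity matrix
```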

Exercises 3.12

1. Solve Ax = b by Cramer's Rule if

A =

0 1 21 0 21 2 2

, b =

123

.


2. Using elementary row or column operations evaluate

\[
\det \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 6 \\ 4 & 5 & 6 & 7 \end{bmatrix}.
\]

3. Using Theorem 7, find $A^{-1}$ for the matrix A in problem 1.

4. For what values of $\lambda$ does the equation $Ax = \lambda x$ have a nontrivial solution where

A =[

3 41 3

].

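For each such $\lambda$ find the nontrivial solutions.

Hint: $Ax = \lambda x$ is equivalent to $Ax - \lambda x = 0$ or $(A - \lambda I)x = 0$.

5. Evaluate

\[
\det \begin{bmatrix} x & a & a & a \\ a & x & a & a \\ a & a & x & a \\ a & a & a & x \end{bmatrix}.
\]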

3.13 Eigenvalues and Eigenvectors, An Introductory Example

Consider the system of two first order linear differential equations

\[
\begin{aligned}
\dot{x}_1 &= 4x_1 - 2x_2 \\
\dot{x}_2 &= 3x_1 - x_2.
\end{aligned}
\tag{1}
\]

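We introduce the vector x and the matrix A defined by

\[
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad A = \begin{bmatrix} 4 & -2 \\ 3 & -1 \end{bmatrix}. \tag{2}
\]

The vector x is a function of t. We define the derivative of x by

\[
\frac{dx}{dt} = \dot{x} = \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}, \tag{3}
\]

that is, a vector is differentiated by differentiating each component. With this notation, equation (1) may be written as

\[
\dot{x} = Ax. \tag{4}
\]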

One solution of (4) is the trivial solution $x(t) \equiv 0$ ($x_1(t) \equiv 0$, $x_2(t) \equiv 0$). We are looking for non-trivial solutions. Since the equations (1) are linear, homogeneous and have constant coefficients, it should be expected that there exist solutions in the form of exponentials. We look for solutions of the form

\[
x(t) = ve^{\lambda t}, \qquad v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}, \tag{5}
\]

where v is a constant vector, $v \neq 0$, and $\lambda$ is a scalar. Equation (5) is equivalent to

\[
\begin{aligned}
x_1(t) &= v_1e^{\lambda t} \\
x_2(t) &= v_2e^{\lambda t}.
\end{aligned}
\tag{6}
\]

Note we are seeking the simplest kind of exponential solutions, where the exponentials in $x_1$ and $x_2$ have the same exponent. In order to find v and $\lambda$, we substitute (5) into (4). Since

\[
\dot{x} = \frac{d}{dt} \begin{bmatrix} v_1e^{\lambda t} \\ v_2e^{\lambda t} \end{bmatrix} = \lambda e^{\lambda t} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \lambda e^{\lambda t}v,
\]

we must have

\[
\lambda e^{\lambda t}v = A(ve^{\lambda t}) = e^{\lambda t}Av,
\]

or

\[
(Av - \lambda v)e^{\lambda t} = 0 \quad \text{for all } t.
\]

Since $e^{\lambda t}$ is never zero, we must have

\[
Av - \lambda v = 0 \quad \text{or} \quad Av = \lambda v. \tag{7}
\]

Thus $\lambda$, v must satisfy this equation. Note that (7) can be written as

\[
(A - \lambda I)v = 0 \tag{8}
\]

and is a system of homogeneous algebraic equations. If a value $\lambda$ exists for which a non-trivial solution v exists, then $\lambda$ is called an eigenvalue of A and the corresponding solution $v \neq 0$ is called an eigenvector of A corresponding to the eigenvalue $\lambda$.

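A non-trivial solution of (8) exists if and only if

\[
\det(A - \lambda I) = 0.
\]

This equation determines the eigenvalues. We have

\[
A - \lambda I = \begin{bmatrix} 4 & -2 \\ 3 & -1 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \begin{bmatrix} 4 - \lambda & -2 \\ 3 & -1 - \lambda \end{bmatrix}
\]

and

\[
\det[A - \lambda I] = (4 - \lambda)(-1 - \lambda) + 6 = 0.
\]

Thus $\lambda$ must satisfy the quadratic equation

\[
\lambda^2 - 3\lambda + 2 = 0, \quad \text{or} \quad (\lambda - 2)(\lambda - 1) = 0,
\]

so that $\lambda_1 = 1$, $\lambda_2 = 2$ are the eigenvalues of A.

For each eigenvalue, we are assured that non-trivial solutions of (8) exist. Putting $\lambda = \lambda_1 = 1$ in (8) and letting the solution be $v = \begin{bmatrix} a \\ b \end{bmatrix}$, we have

\[
(A - \lambda_1I)v = \begin{bmatrix} 3 & -2 \\ 3 & -2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

Therefore $b = 3a/2$, with a being arbitrary. In vector form, the solutions are

\[
v = \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a \\ 3a/2 \end{bmatrix} = a \begin{bmatrix} 1 \\ 3/2 \end{bmatrix} = \frac{a}{2} \begin{bmatrix} 2 \\ 3 \end{bmatrix}.
\]

Let

\[
v_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}.
\]

Then $v_1$ is an eigenvector of A corresponding to $\lambda_1 = 1$; of course, any non-zero multiple of $v_1$ is also an eigenvector.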

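Similarly, for $\lambda_2 = 2$, we find

\[
(A - \lambda_2I)v = \begin{bmatrix} 2 & -2 \\ 3 & -3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = 0.
\]

The solutions are b = a, with a arbitrary, or

\[
v = \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a \\ a \end{bmatrix} = a \begin{bmatrix} 1 \\ 1 \end{bmatrix} = av_2, \quad \text{where } v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]

Thus $v_2$ is an eigenvector of A corresponding to $\lambda_2 = 2$.

Corresponding to $\lambda_1 = 1$, $\lambda_2 = 2$, we have found the solutions

\[
x^{(1)} = v_1e^{\lambda_1t} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} e^{t}, \qquad x^{(2)} = v_2e^{\lambda_2t} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{2t}.
\]

It is easy to verify that

\[
x = c_1x^{(1)} + c_2x^{(2)} = c_1v_1e^{\lambda_1t} + c_2v_2e^{\lambda_2t} \tag{9}
\]

is a solution of the DE (4) for arbitrary values of the scalars $c_1$, $c_2$. Writing out (9) we have

\[
x = c_1 \begin{bmatrix} 2 \\ 3 \end{bmatrix} e^{t} + c_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{2t}, \tag{10}
\]

or

\[
\begin{aligned}
x_1 &= 2c_1e^{t} + c_2e^{2t} \\
x_2 &= 3c_1e^{t} + c_2e^{2t}.
\end{aligned}
\tag{11}
\]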

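Suppose we are given an initial condition $x(0) = x_0$ where $x_0$ is a given vector. Setting t = 0 in (9) we find that $c_1$, $c_2$ must satisfy

\[
x_0 = c_1v_1 + c_2v_2 = c_1 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \tag{12}
\]

Since the eigenvectors $v_1$, $v_2$ are LI, they form a basis for $\mathbb{C}^2$, and thus unique constants $c_1$, $c_2$ can always be found for any given $x_0$. Thus, with the solution (10), we can solve any initial value problem. In fact, the solution is the general solution, as we shall show later.

Let us find the constants $c_1$ and $c_2$ in the case $x_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Equation (12) becomes

\[
x_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} = c_1 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.
\]

We can solve these two equations for $c_1$ and $c_2$ by elimination. Alternatively we can use inverse matrices:

\[
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \frac{1}{-1} \begin{bmatrix} 1 & -1 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
\]

Thus $c_1 = 1$, $c_2 = -1$ and the solution satisfying the initial condition is

\[
\begin{aligned}
x_1 &= 2e^{t} - e^{2t} \\
x_2 &= 3e^{t} - e^{2t}.
\end{aligned}
\]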
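The result of this section can be checked numerically. The sketch below is a verification aid only; it assumes the matrix A = [[4, -2], [3, -1]] and initial vector (1, 2) used above, and the function names `x`, `xdot` and `matvec` are hypothetical. It confirms the two eigenpairs and that the final solution satisfies both the differential equation and the initial condition.

```python
import math

A = [[4, -2], [3, -1]]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Eigenpairs found in the text: (eigenvalue, eigenvector).
pairs = [(1, [2, 3]), (2, [1, 1])]
for lam, v in pairs:
    print(matvec(A, v) == [lam * c for c in v])   # checks A v = lambda v

def x(t):
    """Solution with c1 = 1, c2 = -1."""
    return [2 * math.exp(t) - math.exp(2 * t),
            3 * math.exp(t) - math.exp(2 * t)]

def xdot(t):
    """Derivative of x(t), differentiated term by term."""
    return [2 * math.exp(t) - 2 * math.exp(2 * t),
            3 * math.exp(t) - 2 * math.exp(2 * t)]

for t in (0.0, 0.5, 1.0):
    residual = [d - r for d, r in zip(xdot(t), matvec(A, x(t)))]
    print(t, max(abs(r) for r in residual))       # should be near zero
```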

Exercises 3.13

1. Solve x = Ax, x(0) =[

11

]where A =

[4 1

1 4

].


2. Find the eigenvalues and eigenvectors for A =[1 2

2 2

].

3.14 Eigenvalues and Eigenvectors

Definition 1. If A is an $n \times n$ matrix, then $\lambda$ is called an eigenvalue of A if $Av = \lambda v$ has a non-trivial solution; any such non-trivial solution v is called an eigenvector of A corresponding to the eigenvalue $\lambda$.

The equation $Av = \lambda v$ can be written

\[
(A - \lambda I)v = 0. \tag{1}
\]

This has a non-trivial solution if and only if

\[
\det(A - \lambda I) = 0.
\]

Since $A - \lambda I$ simply subtracts $\lambda$ from the diagonal elements of A, we have

\[
A - \lambda I = \begin{bmatrix} a_{11} - \lambda & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} - \lambda & \dots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} - \lambda \end{bmatrix}. \tag{2}
\]

Thus $\det(A - \lambda I)$ is a polynomial in $\lambda$ of degree n. The equation

\[
c(\lambda) = \det(A - \lambda I) = 0 \tag{3}
\]

is called the characteristic equation of the matrix A. This equation always has at least one complex root. Therefore, every square matrix has at least one eigenvalue and corresponding eigenvector.

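Example 1. $A = \begin{bmatrix} 4 & -2 \\ 3 & -1 \end{bmatrix}$, $|A - \lambda I| = \begin{vmatrix} 4 - \lambda & -2 \\ 3 & -1 - \lambda \end{vmatrix} = 0$. The characteristic equation is $\lambda^2 - 3\lambda + 2 = 0$; the eigenvalues are 1 and 2. As we have seen in the last section, the eigenvectors are

\[
\text{for } \lambda_1 = 1, \ v_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad \text{and for } \lambda_2 = 2, \ v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]

Thus we have 2 eigenvalues and 2 eigenvectors. Since $\det \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} = -1 \neq 0$, the eigenvectors are LI.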

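Example 2.

\[
A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}, \qquad
|A - \lambda I| = \begin{vmatrix} 3 - \lambda & 1 \\ 0 & 3 - \lambda \end{vmatrix} = (3 - \lambda)^2 - 0 = 0.
\]

The only eigenvalue is $\lambda = 3$, which has multiplicity 2. The eigenvector is a solution of

\[
(A - 3I)v = 0
\]

or

\[
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

or

\[
v_2 = 0, \quad v_1 \text{ arbitrary}.
\]

Thus $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ is an eigenvector. Here, we have only one eigenvalue and one LI eigenvector.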

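Example 3. $A = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}$, $|A - \lambda I| = \begin{vmatrix} 3 - \lambda & 0 \\ 0 & 3 - \lambda \end{vmatrix} = (3 - \lambda)^2 = 0$.

Here $\lambda = 3$ is an eigenvalue of multiplicity 2. To find the eigenvectors, we solve $(A - 3I)v = 0$ or

\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

Here every vector (except 0) is an eigenvector. In particular, $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are eigenvectors. In this case, we have one eigenvalue of multiplicity 2, and 2 LI eigenvectors corresponding to this single eigenvalue.

One can prove that if $\lambda_1$ is an eigenvalue of multiplicity s, there may exist anywhere from 1 to s LI eigenvectors.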

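Example 4. For n = 3, consider the following three matrices

\[
\text{(a) } A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix}.
\]

Each matrix has $\lambda = 2$ as its only eigenvalue, and the multiplicity of the eigenvalue is 3. To see how many eigenvectors we have, we compute the rank of

\[
A - 2I = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B - 2I = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
C - 2I = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]

Note that $A - 2I$ has rank 0 and has 3 - 0 = 3 LI eigenvectors, $B - 2I$ has rank 1 and has 3 - 1 = 2 LI eigenvectors, and $C - 2I$ has rank 2 and has 3 - 2 = 1 LI eigenvector.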
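The count in Example 4 is n minus the rank of the shifted matrix, which is easy to compute. The sketch below is an illustration only; `rank` and `independent_eigenvectors` are hypothetical names, and the eigenvalue is supplied by hand rather than computed.

```python
from fractions import Fraction

def rank(rows):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent_eigenvectors(matrix, lam):
    """Number of LI eigenvectors for eigenvalue lam: n - rank(M - lam I)."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

A = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
B = [[2, 1, 0], [0, 2, 0], [0, 0, 2]]
C = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]
print([independent_eigenvectors(m, 2) for m in (A, B, C)])
```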

We shall now show that eigenvectors corresponding to distinct eigenvalues are linearly independent.

Theorem 1. If $\lambda_1, \dots, \lambda_k$ are distinct eigenvalues of A and $v_1, \dots, v_k$ are corresponding eigenvectors, then $\{v_1, \dots, v_k\}$ is an LI set.

Proof If $\{v_1, \dots, v_k\}$ is LD, then at least one of the vectors is a linear combination of the others. Suppose

\[
v_1 = \alpha_2v_2 + \dots + \alpha_kv_k, \tag{4}
\]

where we can assume $\{v_2, \dots, v_k\}$ is LI. Then $Av_1 = \alpha_2Av_2 + \dots + \alpha_kAv_k$, or

\[
\lambda_1v_1 = \alpha_2\lambda_2v_2 + \dots + \alpha_k\lambda_kv_k. \tag{5}
\]

Multiply Equation (4) by $\lambda_1$ and subtract Equation (5) to get

\[
\alpha_2(\lambda_1 - \lambda_2)v_2 + \alpha_3(\lambda_1 - \lambda_3)v_3 + \dots + \alpha_k(\lambda_1 - \lambda_k)v_k = 0.
\]

Since $\{v_2, \dots, v_k\}$ is LI, we must have

\[
\alpha_2(\lambda_1 - \lambda_2) = 0, \ \dots, \ \alpha_k(\lambda_1 - \lambda_k) = 0.
\]

Since $(\lambda_1 - \lambda_i) \neq 0$, it follows that $\alpha_2 = \dots = \alpha_k = 0$. Then (4) gives $v_1 = 0$, which is impossible for an eigenvector, so the set is LI.

In particular, it follows from Theorem 1 that if an $n \times n$ matrix has n distinct eigenvalues, the corresponding n eigenvectors are linearly independent and form a basis for $\mathbb{C}^n$.

Theorem 2. If $\lambda_1$ is an eigenvalue of A then $\lambda_1^k$ is an eigenvalue of $A^k$ (k a positive integer).

Proof Since $Av = \lambda_1v$, we have

\[
\begin{aligned}
A^2v &= \lambda_1Av = \lambda_1^2v \\
&\;\;\vdots \\
A^kv &= \lambda_1^kv;
\end{aligned}
\]

therefore $\lambda_1^k$ is an eigenvalue of $A^k$.

Exercises 3.14

1. Find the eigenvalues and eigenvectors for

a.[

2 53 2

]b.

[1 1

1 1

].

2. Determine with as little computation as possible whether or not 3 is an eigenvalue of each of the following matrices. Give reasons.

a.

1 0 00 2 00 0 0

b.

1 0 00 1 00 0 3

c.

4 1 1

1 4 12 2 5

.

3. Given A =

3 1 12 4 21 1 3

, v1 =

1

10

, v2 =

214

,

a. Without finding the characteristic equation, determine whether or not $v_1$ is an eigenvector of A; if so, what is the eigenvalue?

b. same as a. for v2

4. Find the eigenvalues and a set of three LI eigenvectors for

a.

3 1 0

1 2 10 1 3

b.

2 1 11 0 11 1 2

c.

0 2 31 1 31 2 2

.

5. Find the eigenvalues and eigenvectors for each of the following

a.

2 1 0

1 2 10 1 2

b.

1 1 1

1 1 11 1 1

c.

7 1 1

1 7 01 1 6

.

6. Find the eigenvalues and eigenvectors (they are complex) for

A =[

1 11 1

].

7. a. If Au = 3u where A is an $n \times n$ matrix and u is a column vector, find a vector x such that Ax = 2u.
b. If 3 is an eigenvalue of a matrix A, what is an eigenvalue of $A^2 - A + I$?

8. What are the only possible eigenvalues of an $n \times n$ matrix A under the following conditions?

a. $A^2 = 0$ (A need not be the zero matrix).

b. $A^2 = I$ (A need not be the identity matrix).

c. $A^2 + 5A + 6I = 0$

9. Prove that $A^{-1}$ exists if and only if 0 is not an eigenvalue of A.

10. If A is an $n \times n$ matrix and u, v are column vectors such that Au = 3u, Av = 2v,
a. evaluate $(A^2 - 5A + 6I)u + A^2v$ in terms of u and v;
b. find a solution x of Ax = u + 5v.

11. If $A^{-1}$ exists and $\lambda$ is an eigenvalue of A with eigenvector v, find an eigenvalue and eigenvector for $A^{-1}$.

3.15 Solution of Systems of Differential Equations

We consider a system of n first order, linear, homogeneous differential equations with constant coefficients

\[
\begin{aligned}
\dot{x}_1 &= a_{11}x_1 + \dots + a_{1n}x_n \\
\dot{x}_2 &= a_{21}x_1 + \dots + a_{2n}x_n \\
&\;\;\vdots \\
\dot{x}_n &= a_{n1}x_1 + \dots + a_{nn}x_n
\end{aligned}
\tag{1}
\]

Let x, A be defined by

\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \qquad A = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix}. \tag{2}
\]

Using this notation the equations (1) can be written in the form

\[
\dot{x} = Ax. \tag{3}
\]

Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of A and $v_1, \dots, v_n$ be the corresponding eigenvectors. That is, $Av_i = \lambda_iv_i$, $i = 1, \dots, n$. We assume that A has n linearly independent eigenvectors, that is, the set $\{v_1, \dots, v_n\}$ is linearly independent. This will always occur if the n eigenvalues are all distinct, and may occur if some of the eigenvalues are repeated.

We assume a solution of the form

\[
x = ve^{\lambda t}, \tag{4}
\]

where the scalar $\lambda$ and the constant non-zero vector v must be determined to satisfy the differential equation $\dot{x} = Ax$. Substituting (4) into the differential equation, we find

\[
\dot{x} = \lambda ve^{\lambda t} = A(ve^{\lambda t}) = e^{\lambda t}Av. \tag{5}
\]

Since $e^{\lambda t}$ is never zero, we must have

\[
Av = \lambda v, \qquad v \neq 0. \tag{6}
\]

This is just the equation that defines eigenvalues and eigenvectors. We have assumed that there are n linearly independent eigenvectors $v_1, \dots, v_n$ corresponding to the eigenvalues $\lambda_1, \dots, \lambda_n$. Thus

\[
x = v_ie^{\lambda_it}, \qquad i = 1, \dots, n
\]

are n solutions of the differential equation (3). Since the DE is linear and homogeneous, a linear combination of solutions is again a solution. Thus

\[
x = \alpha_1v_1e^{\lambda_1t} + \alpha_2v_2e^{\lambda_2t} + \dots + \alpha_nv_ne^{\lambda_nt} \tag{7}
\]

is a solution for arbitrary values of $\alpha_1, \dots, \alpha_n$.

In Theorem 1, we will show that Equation (7) represents the general solution of the DE (3). Suppose we are given that the solution must satisfy an initial condition $x(0) = x_0$ where $x_0$ is a given vector. We put t = 0 in (7) to obtain

\[
x_0 = \alpha_1v_1 + \dots + \alpha_nv_n \tag{8}
\]

and solve for $\alpha_1, \dots, \alpha_n$. Since $\{v_1, \dots, v_n\}$ is a set of n linearly independent vectors in $\mathbb{C}^n$, the set forms a basis for $\mathbb{C}^n$. Thus, there exist unique numbers $\alpha_1, \dots, \alpha_n$ satisfying (8), and the initial value problem $\dot{x} = Ax$, $x(0) = x_0$ has a solution for every $x_0$.

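Example 1. $\dot{x} = Ax$, $x(0) = x_0$, where

\[
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -6 & -11 & -6 \end{bmatrix}, \qquad x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.
\]

\[
\det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ -6 & -11 & -6 - \lambda \end{bmatrix}
= -\lambda \det \begin{bmatrix} -\lambda & 1 \\ -11 & -6 - \lambda \end{bmatrix} - 6 \det \begin{bmatrix} 1 & 0 \\ -\lambda & 1 \end{bmatrix}
\]
\[
= -\lambda(\lambda(6 + \lambda) + 11) - 6(1) = -\lambda^3 - 6\lambda^2 - 11\lambda - 6.
\]

The eigenvalues are roots of $\lambda^3 + 6\lambda^2 + 11\lambda + 6 = 0$. If there are any integer roots, they must be factors of the constant term. Thus, we try $\lambda = -1, -2, -3$. We find the eigenvalues are $\lambda_1 = -1$, $\lambda_2 = -2$, $\lambda_3 = -3$. Since the eigenvalues are distinct, the three eigenvectors corresponding to these eigenvalues must be linearly independent. The eigenvectors turn out to be

\[
v_1 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \qquad v_2 = \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}, \qquad v_3 = \begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}.
\]

The general solution is

\[
x = \alpha_1 \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} e^{-t} + \alpha_2 \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} e^{-2t} + \alpha_3 \begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix} e^{-3t}
\]

or

\[
\begin{aligned}
x_1 &= \alpha_1e^{-t} + \alpha_2e^{-2t} + \alpha_3e^{-3t} \\
x_2 &= -\alpha_1e^{-t} - 2\alpha_2e^{-2t} - 3\alpha_3e^{-3t} \\
x_3 &= \alpha_1e^{-t} + 4\alpha_2e^{-2t} + 9\alpha_3e^{-3t}.
\end{aligned}
\]

To satisfy the initial condition, we must solve the following equations for $\alpha_1, \alpha_2, \alpha_3$:

\[
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + \alpha_2 \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} + \alpha_3 \begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}
\quad \text{or} \quad
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ -1 & -2 & -3 \\ 1 & 4 & 9 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix}.
\]

We find $\alpha_1 = 3$, $\alpha_2 = -3$, $\alpha_3 = 1$. The unique solution satisfying the initial condition is

\[
\begin{aligned}
x_1 &= 3e^{-t} - 3e^{-2t} + e^{-3t} \\
x_2 &= -3e^{-t} + 6e^{-2t} - 3e^{-3t} \\
x_3 &= 3e^{-t} - 12e^{-2t} + 9e^{-3t}.
\end{aligned}
\]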
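Each step of Example 1 can be verified with a few lines of arithmetic. The sketch below is a check only, assuming the matrix, eigenvalues, eigenvectors and coefficients 3, -3, 1 stated above; the names `matvec`, `char_poly` and `x` are hypothetical.

```python
import math

A = [[0, 1, 0], [0, 0, 1], [-6, -11, -6]]
lams = [-1, -2, -3]
vecs = [[1, lam, lam * lam] for lam in lams]   # (1, lambda, lambda^2)
alphas = [3, -3, 1]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def char_poly(lam):
    """lambda^3 + 6 lambda^2 + 11 lambda + 6."""
    return lam ** 3 + 6 * lam ** 2 + 11 * lam + 6

def x(t):
    """x(t) = sum of alpha_i * v_i * exp(lambda_i * t)."""
    return [sum(a * v[k] * math.exp(lam * t)
                for a, v, lam in zip(alphas, vecs, lams))
            for k in range(3)]

print([char_poly(lam) for lam in lams])                  # roots of the cubic
print([matvec(A, v) == [lam * c for c in v]
       for lam, v in zip(lams, vecs)])                   # A v = lambda v
print(x(0.0))                                            # initial condition
```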

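Example 2. Find the general solution of $\dot{x} = Ax$ where

\[
A = \begin{bmatrix} 2 & 2 & -6 \\ 2 & -1 & -3 \\ -2 & -1 & 1 \end{bmatrix}.
\]

We find that $\det(A - \lambda I) = -(\lambda^3 - 2\lambda^2 - 20\lambda - 24) = 0$. By trial and error, we find $\lambda = -2$ is one root. Therefore $\lambda + 2$ is a factor of $\lambda^3 - 2\lambda^2 - 20\lambda - 24$. Dividing the cubic by $\lambda + 2$, we obtain $\lambda^2 - 4\lambda - 12 = (\lambda + 2)(\lambda - 6)$. Thus $\lambda = 6$ is a simple eigenvalue and $\lambda = -2$ is a double eigenvalue. At this point, we cannot be sure that there are two linearly independent eigenvectors corresponding to $\lambda = -2$, but this turns out to be the case.

To find the eigenvectors corresponding to $\lambda = -2$, we must solve

\[
(A + 2I)v = 0 \quad \text{where} \quad A + 2I = \begin{bmatrix} 4 & 2 & -6 \\ 2 & 1 & -3 \\ -2 & -1 & 3 \end{bmatrix}.
\]

Reduce this matrix to row echelon form:

\[
\begin{bmatrix} 1 & 1/2 & -3/2 \\ 2 & 1 & -3 \\ -2 & -1 & 3 \end{bmatrix}
\to
\begin{bmatrix} 1 & 1/2 & -3/2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]

We see that $v_2$, $v_3$ are free and $v_1 = -\tfrac{1}{2}v_2 + \tfrac{3}{2}v_3$, and the general solution is

\[
v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} -v_2/2 + 3v_3/2 \\ v_2 \\ v_3 \end{bmatrix} = v_2 \begin{bmatrix} -1/2 \\ 1 \\ 0 \end{bmatrix} + v_3 \begin{bmatrix} 3/2 \\ 0 \\ 1 \end{bmatrix}.
\]

Thus, the two vectors on the right are linearly independent eigenvectors. To avoid fractions, we take as eigenvectors corresponding to $\lambda = -2$

\[
v_1 = \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix}, \qquad v_2 = \begin{bmatrix} 3 \\ 0 \\ 2 \end{bmatrix}.
\]

Corresponding to $\lambda = 6$, we find the eigenvector

\[
v_3 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}.
\]

The general solution is therefore

\[
x = \alpha_1 \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix} e^{-2t} + \alpha_2 \begin{bmatrix} 3 \\ 0 \\ 2 \end{bmatrix} e^{-2t} + \alpha_3 \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} e^{6t}.
\]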

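Example 3. An infinite pipe is divided into 4 sections $S_0, S_1, S_2, S_3$ as shown.

[Figure: a pipe divided into four consecutive sections labeled S0, S1, S2, S3.]

A chemical solution with concentration $y_1(t)$ at time t is in section $S_1$ and the same solution has concentration $y_2(t)$ in section $S_2$. The concentrations in sections $S_0$ and $S_3$ are assumed to be zero initially and remain zero for all time. This is reasonable since $S_0$ and $S_3$ have infinite volume. At t = 0, the concentrations in $S_1$ and $S_2$ are assumed to be $y_1(0) = a_1$, $y_2(0) = a_2$, where $a_1 > 0$, $a_2 > 0$.

Diffusion starts at t = 0 according to the law: at time t, the diffusion rate between two adjacent sections equals the difference in concentrations. We are assuming that the concentration in each section remains uniform. Thus we obtain

\[
\begin{aligned}
\frac{dy_1}{dt} &= (y_2 - y_1) + (0 - y_1) \\
\frac{dy_2}{dt} &= (0 - y_2) + (y_1 - y_2)
\end{aligned}
\]

or

\[
\begin{aligned}
\dot{y}_1 &= -2y_1 + y_2 \\
\dot{y}_2 &= y_1 - 2y_2.
\end{aligned}
\]

In matrix form, we have $\dot{y} = Ay$, $y(0) = y_0$, where

\[
y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad A = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix}, \qquad y_0 = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}.
\]

The eigenvalues of A are $\lambda = -1$, $\lambda = -3$ and the corresponding eigenvectors are

\[
v^{(1)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad v^{(2)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
\]

The general solution is

\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{-t} + \alpha_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-3t}.
\]

To satisfy the initial conditions, we must have $\alpha_1 = \frac{a_1 + a_2}{2}$, $\alpha_2 = \frac{a_1 - a_2}{2}$, and the solution is

\[
\begin{aligned}
y_1 &= \frac{a_1 + a_2}{2}e^{-t} + \frac{a_1 - a_2}{2}e^{-3t} \\
y_2 &= \frac{a_1 + a_2}{2}e^{-t} - \frac{a_1 - a_2}{2}e^{-3t}.
\end{aligned}
\]

We note that $y_1(t) \to 0$ and $y_2(t) \to 0$ as $t \to \infty$, as one would expect. Furthermore, we see that since $a_1 > 0$ and $a_2 > 0$, we must have $y_1(t) > 0$ and $y_2(t) > 0$ for all t. Since $e^{-3t}$ approaches zero more rapidly than $e^{-t}$, the concentrations $y_1(t)$ and $y_2(t)$ will both be almost equal to $(a_1 + a_2)e^{-t}/2$ after a short time.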
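The closed-form solution of the diffusion example can be compared with a direct numerical integration of the two differential equations. The sketch below is illustrative only: the initial concentrations 1 and 3 are made-up values, the forward Euler method and its step count are choices made here, and the names `exact` and `euler` are hypothetical.

```python
import math

def exact(t, a1, a2):
    """Closed-form y1(t), y2(t) from the eigenvalue solution."""
    s, d = (a1 + a2) / 2, (a1 - a2) / 2
    return (s * math.exp(-t) + d * math.exp(-3 * t),
            s * math.exp(-t) - d * math.exp(-3 * t))

def euler(t_end, a1, a2, steps=10000):
    """Forward Euler integration of y1' = -2 y1 + y2, y2' = y1 - 2 y2."""
    h = t_end / steps
    y1, y2 = a1, a2
    for _ in range(steps):
        # Both right-hand sides use the values from the previous step.
        y1, y2 = y1 + h * (-2 * y1 + y2), y2 + h * (y1 - 2 * y2)
    return y1, y2

a1, a2 = 1.0, 3.0                      # hypothetical initial concentrations
print(exact(1.0, a1, a2))
print(euler(1.0, a1, a2))
# After a while both concentrations are close to (a1 + a2) e^{-t} / 2.
print(exact(3.0, a1, a2), (a1 + a2) * math.exp(-3.0) / 2)
```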

Complex Eigenvalues

The eigenvalues may be complex numbers; this will yield eigenvectors with complex numbers as elements, and the solution we get by the above method will be the general complex valued solution. If A is a real matrix, we can write the final solution in real form. In this regard, the following fact is useful.

If A is a real matrix and $\lambda = \alpha + i\beta$ is a complex eigenvalue with complex eigenvector $w = u + iv$ (u, v are real vectors), then $we^{(\alpha + i\beta)t}$ is a complex valued solution of $\dot{x} = Ax$ and the real and imaginary parts of $we^{(\alpha + i\beta)t}$ are real solutions.

We need to find the real and imaginary parts:

\[
\begin{aligned}
we^{(\alpha + i\beta)t} &= (u + iv)e^{\alpha t}e^{i\beta t} \\
&= e^{\alpha t}(u + iv)(\cos\beta t + i\sin\beta t) \\
&= e^{\alpha t}(u\cos\beta t - v\sin\beta t) + ie^{\alpha t}(u\sin\beta t + v\cos\beta t).
\end{aligned}
\]

We see that the real solutions are

\[
e^{\alpha t}(u\cos\beta t - v\sin\beta t) \quad \text{and} \quad e^{\alpha t}(u\sin\beta t + v\cos\beta t).
\]

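Example 4. Consider

\[
\dot{x} = Ax, \qquad A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}.
\]

\[
\begin{vmatrix} 1 - \lambda & 1 \\ -1 & 1 - \lambda \end{vmatrix} = 0, \qquad \lambda = 1 \pm i.
\]

Consider $(A - \lambda_1I)w = 0$ where $\lambda_1 = 1 + i$, that is

\[
\begin{bmatrix} -i & 1 \\ -1 & -i \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

or

\[
\begin{aligned}
-iw_1 + w_2 &= 0 \\
-w_1 - iw_2 &= 0.
\end{aligned}
\]

Note the 2nd equation is $-i$ times the first equation, so we may use only the first equation. Let $w_1 = 1$; then $w_2 = i$ and the eigenvector is

\[
w = \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + i \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]

Thus $u = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $v = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

The real solutions are $e^t(u\cos t - v\sin t)$ and $e^t(u\sin t + v\cos t)$. Thus, the general real solution is

\[
x = c_1e^t(u\cos t - v\sin t) + c_2e^t(u\sin t + v\cos t)
\]

or

\[
\begin{aligned}
x_1 &= e^t(c_1\cos t + c_2\sin t), \\
x_2 &= e^t(c_2\cos t - c_1\sin t).
\end{aligned}
\]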
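The splitting of the complex solution into two real solutions can be checked with complex arithmetic. The sketch below is a verification aid under the assumptions of Example 4 as reconstructed here (A = [[1, 1], [-1, 1]], eigenvalue 1 + i, eigenvector (1, i)); the function names are hypothetical.

```python
import cmath
import math

A = [[1, 1], [-1, 1]]
lam = 1 + 1j
w = [1, 1j]                 # complex eigenvector w = u + i v
u, v = [1, 0], [0, 1]

def complex_solution(t):
    """w * exp(lambda * t), componentwise."""
    return [wi * cmath.exp(lam * t) for wi in w]

def real_solutions(t):
    """e^t (u cos t - v sin t) and e^t (u sin t + v cos t)."""
    e = math.exp(t)
    first = [e * (ui * math.cos(t) - vi * math.sin(t)) for ui, vi in zip(u, v)]
    second = [e * (ui * math.sin(t) + vi * math.cos(t)) for ui, vi in zip(u, v)]
    return first, second

# A w should equal lambda w.
Aw = [sum(a * b for a, b in zip(row, w)) for row in A]
print(Aw, [lam * wi for wi in w])

for t in (0.0, 0.7, 2.0):
    z = complex_solution(t)
    first, second = real_solutions(t)
    print(t, [zi.real for zi in z], first)     # real parts agree
    print(t, [zi.imag for zi in z], second)    # imaginary parts agree
```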

Finally, as promised earlier, we shall prove that the solution we have obtained, in the case where n linearly independent eigenvectors exist, is the general solution.

Theorem 1. If A is an $n \times n$ matrix and has n linearly independent eigenvectors $v_1, \dots, v_n$ corresponding to the eigenvalues $\lambda_1, \dots, \lambda_n$, then the general solution of $\dot{x} = Ax$ is

\[
x = \alpha_1v_1e^{\lambda_1t} + \dots + \alpha_nv_ne^{\lambda_nt}, \tag{9}
\]

where $\alpha_1, \dots, \alpha_n$ are arbitrary numbers. Furthermore, the initial value problem $\dot{x} = Ax$, $x(0) = x_0$ has a unique solution for every $x_0$.

Proof Let x be any solution of $\dot{x} = Ax$. We make a change of variables by letting x = Ty, where T is the constant matrix whose columns are the eigenvectors of A:

\[
T = [v_1, \dots, v_n]. \tag{10}
\]

Since the eigenvectors are linearly independent, T is non-singular. Substituting into the DE we have

\[
\dot{x} = T\dot{y} = ATy \quad \text{or} \quad \dot{y} = T^{-1}ATy.
\]

Now $AT = A[v_1, \dots, v_n] = [Av_1, \dots, Av_n]$. Since $Av_i = \lambda_iv_i$, we have

\[
AT = [\lambda_1v_1, \dots, \lambda_nv_n] = [v_1, \dots, v_n]
\begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix}.
\]

Letting D be the diagonal matrix on the right,

\[
D = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix},
\]

we have

\[
AT = TD \quad \text{or} \quad T^{-1}AT = D.
\]

Thus, y satisfies $\dot{y} = Dy$, which written out is:

\[
\begin{aligned}
\dot{y}_1 &= \lambda_1y_1 \\
\dot{y}_2 &= \lambda_2y_2 \\
&\;\;\vdots \\
\dot{y}_n &= \lambda_ny_n.
\end{aligned}
\]

We can solve each of these equations individually since each equation has only one unknown (such equations are said to be uncoupled). We have

\[
y_1 = \alpha_1e^{\lambda_1t}, \quad y_2 = \alpha_2e^{\lambda_2t}, \quad \dots, \quad y_n = \alpha_ne^{\lambda_nt},
\]

where $\alpha_1, \dots, \alpha_n$ are arbitrary. Since x = Ty we have

\[
x = Ty = [v_1, \dots, v_n] \begin{bmatrix} \alpha_1e^{\lambda_1t} \\ \vdots \\ \alpha_ne^{\lambda_nt} \end{bmatrix} = \alpha_1e^{\lambda_1t}v_1 + \dots + \alpha_ne^{\lambda_nt}v_n.
\]

Since x was any solution, we have shown that every solution can be written in the above form.

If we use the initial condition $x(0) = x_0$ in (9) we find, as in (8) above, that unique constants $\alpha_1, \dots, \alpha_n$ are obtained because of the linear independence of the n eigenvectors. Thus, the initial value problem has a unique solution.
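The key identity in the proof, $T^{-1}AT = D$, can be confirmed on a small case. The sketch below assumes the 2 by 2 matrix A = [[4, -2], [3, -1]] of Section 3.13 with eigenvectors (2, 3) and (1, 1) as the columns of T; `matmul` and `inverse2` are hypothetical helper names.

```python
from fractions import Fraction

def matmul(x, y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def inverse2(m):
    """Inverse of a 2 by 2 integer matrix via the cofactor formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[4, -2], [3, -1]]
T = [[2, 1], [3, 1]]        # columns are the eigenvectors (2, 3) and (1, 1)

D = matmul(matmul(inverse2(T), A), T)
print(D)                    # diagonal, with the eigenvalues on the diagonal
print(matmul(A, T) == matmul(T, D))   # A T = T D
```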

Exercises 3.15

1. Find the solution of $\mathbf{x}' = A\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$ in the following cases:

a. A =[

1 13 1

], x0 =

[11

]b. A =

[0 21 1

], x0 =

[1

1

]

2. Find the solution of $\mathbf{x}' = A\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$ for the following cases:

a. A =

3 1 0

1 2 10 1 3

, x0 =

111

b.

1 1 11 1 11 1 1

, x0 =

2

14

3. Find the general solution of $\mathbf{x}' = A\mathbf{x}$ in each of the following:

a. A =

2 1 0

1 2 10 1 2

b. A =

1 1 21 1 21 1 2

4. Find the general solution of x = Ax in real form if A =[1 25 3

].

5. Tank A contains 100 gals. of brine initially containing 50 lbs. of salt. Tank B initially contains 200 gals. of pure water. Pure water is pumped into Tank A at a rate of one gallon per minute. The well-stirred mixture is pumped into Tank B at a rate of 2 gals./min. The well-stirred mixture in Tank B is pumped out at 2 gals./min. Half of this is returned to Tank A and the other half is discarded. Set up the DEs and initial conditions for the amount of salt in each tank at time $t$.

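6. If $A$ is a $3 \times 3$ matrix and $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ are 3-dimensional non-zero vectors such that $A\mathbf{u} = 3\mathbf{u}$, $A\mathbf{v} = \mathbf{0}$, $A\mathbf{w} = 5\mathbf{w}$:

a. What is the general solution of $\mathbf{x}' = A\mathbf{x}$?

b. What is the solution of $\mathbf{x}' = A\mathbf{x}$, $\mathbf{x}(0) = 3\mathbf{v} + \mathbf{w}$?

7. Find the general solution of

$$x''' + 6x'' + 11x' + 6x = 0$$

by the following procedure. Let

$$x' = y,\qquad y' = z,$$

then

$$z' = y'' = x''' = -6x - 11y - 6z.$$

Let $\mathbf{w} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ and write the above equations in the form $\mathbf{w}' = A\mathbf{w}$ for an appropriate $A$. Solve for $\mathbf{w}$ and then find $x$.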

8. By the method of problem 7 solve

DE: $x''' - 4x'' - x' + 4x = 0$
IC: $x(0) = 1,\ x'(0) = 0,\ x''(0) = 0$.


3.16 Systems of Linear Difference Equations

Consider a system of n first order linear homogeneous difference equations

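$$\begin{aligned} x_1(k+1) &= a_{11}x_1(k) + \dots + a_{1n}x_n(k) \\ &\;\;\vdots \\ x_n(k+1) &= a_{n1}x_1(k) + \dots + a_{nn}x_n(k) \end{aligned} \tag{1}$$

where $k = 0, 1, 2, \dots$ and we have written the unknown sequences as $x_i(k)$ rather than $(x_i)_k$. Define a matrix $A$ and a sequence of vectors $\mathbf{x}_k$ by

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix},\qquad \mathbf{x}_k = \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_n(k) \end{bmatrix}. \tag{2}$$

We may now write (1) as

$$\mathbf{x}_{k+1} = A\mathbf{x}_k,\qquad k = 0, 1, 2, \dots \tag{3}$$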

Assuming x0 is known, we find by successive substitution

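$$\begin{aligned} \mathbf{x}_1 &= A\mathbf{x}_0 \\ \mathbf{x}_2 &= A\mathbf{x}_1 = A(A\mathbf{x}_0) = A^2\mathbf{x}_0 \\ &\;\;\vdots \\ \mathbf{x}_k &= A^k\mathbf{x}_0 \end{aligned} \tag{4}$$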

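Thus to solve (1) we must evaluate $A^k\mathbf{x}_0$ for arbitrary $k$. We shall do this under the assumption that $A$ has $n$ linearly independent eigenvectors. Assume the eigenvalues of $A$ are $\lambda_1, \dots, \lambda_n$ and the set of corresponding eigenvectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is LI. The set of eigenvectors is then a basis for $\mathbb{C}^n$. Thus there exist unique scalars $\alpha_1, \dots, \alpha_n$ such that

$$\mathbf{x}_0 = \alpha_1\mathbf{v}_1 + \dots + \alpha_n\mathbf{v}_n. \tag{5}$$

Thus we have

$$\mathbf{x}_k = A^k\mathbf{x}_0 = A^k(\alpha_1\mathbf{v}_1 + \dots + \alpha_n\mathbf{v}_n) = \alpha_1 A^k\mathbf{v}_1 + \dots + \alpha_n A^k\mathbf{v}_n. \tag{6}$$

Since the eigenvalues are $\lambda_i$ and the corresponding eigenvectors are $\mathbf{v}_i$, we have $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$, from which it follows that $A^k\mathbf{v}_i = \lambda_i^k\mathbf{v}_i$ for any positive integer $k$. Therefore, from (6) we obtain

$$\mathbf{x}_k = \alpha_1\lambda_1^k\mathbf{v}_1 + \dots + \alpha_n\lambda_n^k\mathbf{v}_n,\qquad k = 0, 1, 2, \dots \tag{7}$$

For arbitrary values of the $\alpha_i$, equation (7) represents the general solution of (3). If we want to satisfy a given initial condition, we put $k = 0$ in (7) and solve for the $\alpha_i$ (i.e., use equation (5)).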
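The step from $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$ to the formula for $A^k\mathbf{v}_i$ used in deriving (7) can be spelled out by peeling off one factor of $A$ at a time. The following is a short sketch of that argument for a single eigenpair, written here as $(\lambda, v)$:

```latex
\begin{aligned}
A^{2} v &= A(Av) = A(\lambda v) = \lambda (Av) = \lambda^{2} v, \\
A^{k} v &= A\bigl(A^{k-1} v\bigr) = A\bigl(\lambda^{k-1} v\bigr)
         = \lambda^{k-1} (Av) = \lambda^{k} v .
\end{aligned}
```

The second line is the induction step: it assumes the formula for the exponent $k-1$ and produces it for $k$.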

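Example 1. Solve $\mathbf{x}_{k+1} = A\mathbf{x}_k$, $k = 0, 1, 2, \dots$, where

$$A = \begin{bmatrix} 1 & 1 \\ 3 & -1 \end{bmatrix},\qquad \mathbf{x}_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.$$

The eigenvalues and eigenvectors are

$$\lambda_1 = 2,\ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\qquad \lambda_2 = -2,\ \mathbf{v}_2 = \begin{bmatrix} 1 \\ -3 \end{bmatrix}.$$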


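Note that the eigenvectors are LI (they must be, since the eigenvalues are distinct). The general solution is

$$\mathbf{x}_k = \alpha_1 2^k\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \alpha_2(-2)^k\begin{bmatrix} 1 \\ -3 \end{bmatrix}.$$

Setting $k = 0$ we get

$$\mathbf{x}_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \alpha_1\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \alpha_2\begin{bmatrix} 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}.$$

The solution of this system is $\alpha_1 = 5/4$, $\alpha_2 = -1/4$. The solution of the difference equation satisfying the given initial condition is

$$\mathbf{x}_k = \frac{5}{4}\,2^k\begin{bmatrix} 1 \\ 1 \end{bmatrix} - \frac{1}{4}(-2)^k\begin{bmatrix} 1 \\ -3 \end{bmatrix},\qquad k = 0, 1, 2, \dots$$

If the components of $\mathbf{x}_k$ are denoted by $x_1(k)$, $x_2(k)$, these components are

$$\begin{aligned} x_1(k) &= \tfrac{5}{4}\,2^k - \tfrac{1}{4}(-2)^k \\ x_2(k) &= \tfrac{5}{4}\,2^k + \tfrac{3}{4}(-2)^k. \end{aligned}$$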
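The closed form of Example 1 can be compared against direct iteration of $\mathbf{x}_{k+1} = A\mathbf{x}_k$. The sketch below is an illustration only, taking $A$ with rows $(1, 1)$ and $(3, -1)$, $\mathbf{x}_0 = (1, 2)$, and the coefficients $5/4$ and $-1/4$ on the eigenvectors $(1, 1)$ and $(1, -3)$; the function names `iterate` and `closed_form` are hypothetical. Exact rational arithmetic is used so that the two results can be compared with `==`.

```python
from fractions import Fraction

A = [[1, 1], [3, -1]]
X0 = [1, 2]


def iterate(k):
    """Return x_k by applying x_{j+1} = A x_j a total of k times."""
    x = list(X0)
    for _ in range(k):
        x = [A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1]]
    return x


def closed_form(k):
    """Return x_k = a1 * 2^k * (1, 1) + a2 * (-2)^k * (1, -3)."""
    a1, a2 = Fraction(5, 4), Fraction(-1, 4)
    return [a1 * 2 ** k + a2 * (-2) ** k,
            a1 * 2 ** k - 3 * a2 * (-2) ** k]
```

For instance, both `iterate(1)` and `closed_form(1)` give the vector $(3, 1)$, which is $A\mathbf{x}_0$.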

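The Diagonalization Method

There is an alternative method of evaluating $A^k\mathbf{x}_0$ which consists of computing the matrix $A^k$. We discuss how this can be done, again assuming that $A$ has $n$ LI eigenvectors. Let $T$ be the matrix whose columns are the LI eigenvectors,

$$T = [\mathbf{v}_1, \dots, \mathbf{v}_n]. \tag{8}$$

Then, as shown in the last section (note that $T^{-1}$ exists),

$$A = TDT^{-1} \tag{9}$$

where $D$ is the diagonal matrix whose diagonal entries are the eigenvalues

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}. \tag{10}$$

Now it is easy to show that

$$A^k = TD^kT^{-1},\qquad k = 0, 1, 2, \dots \tag{11}$$

where

$$D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}.$$

Thus, if the eigenvalues and eigenvectors are known, we have an explicit formula for an arbitrary power of $A$.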
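One way to see why the power formula (11) holds is to multiply out the factorization $A = TDT^{-1}$ and cancel the inner products $T^{-1}T = I$. A short sketch of the computation, with the general case following by induction on $k$:

```latex
\begin{aligned}
A^{2} &= (TDT^{-1})(TDT^{-1}) = TD\,(T^{-1}T)\,DT^{-1} = TD^{2}T^{-1}, \\
A^{k} &= A^{k-1}A = (TD^{k-1}T^{-1})(TDT^{-1})
       = TD^{k-1}(T^{-1}T)DT^{-1} = TD^{k}T^{-1}.
\end{aligned}
```

The case $k = 0$ reads $A^0 = I = TIT^{-1}$, and the powers of the diagonal matrix $D$ are obtained entry by entry along the diagonal.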

Example 2. Using the same difference equation as in Example 1, we have

$$T = \begin{bmatrix} 1 & 1 \\ 1 & -3 \end{bmatrix},\qquad T^{-1} = \frac{1}{4}\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$$


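thus

$$A = TDT^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}\frac{1}{4}\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}.$$

This can readily be checked. Equation (11) yields

$$A^k = \begin{bmatrix} 1 & 1 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} 2^k & 0 \\ 0 & (-2)^k \end{bmatrix}\frac{1}{4}\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 3\cdot 2^k + (-2)^k & 2^k - (-2)^k \\ 3\cdot 2^k - 3(-2)^k & 2^k + 3(-2)^k \end{bmatrix}.$$

Now the solution satisfying the initial condition is just

$$\mathbf{x}_k = A^k\mathbf{x}_0 = \frac{1}{4}\begin{bmatrix} 3\cdot 2^k + (-2)^k & 2^k - (-2)^k \\ 3\cdot 2^k - 3(-2)^k & 2^k + 3(-2)^k \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 5\cdot 2^k - (-2)^k \\ 5\cdot 2^k + 3(-2)^k \end{bmatrix}$$

which is the same result obtained in Example 1.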
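The diagonalization formula of Example 2 can be checked mechanically. The sketch below is an illustration, not the author's code: it assumes $T$ has columns $(1, 1)$ and $(1, -3)$, eigenvalues $2$ and $-2$, and $T^{-1}$ equal to one quarter of the matrix with rows $(3, 1)$ and $(1, -1)$; the names `matmul`, `power_by_diagonalization` and `power_by_multiplication` are hypothetical. It compares $TD^kT^{-1}$ with $A^k$ computed by repeated multiplication, using exact fractions.

```python
from fractions import Fraction

A = [[1, 1], [3, -1]]
T = [[1, 1], [1, -3]]
T_INV = [[Fraction(3, 4), Fraction(1, 4)],
         [Fraction(1, 4), Fraction(-1, 4)]]
EIGENVALUES = [2, -2]


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][j] * Q[j][c] for j in range(len(Q)))
             for c in range(len(Q[0]))]
            for i in range(len(P))]


def power_by_diagonalization(k):
    """Compute A^k as T D^k T^{-1}, where D^k only needs scalar powers."""
    Dk = [[EIGENVALUES[0] ** k, 0], [0, EIGENVALUES[1] ** k]]
    return matmul(matmul(T, Dk), T_INV)


def power_by_multiplication(k):
    """Compute A^k by multiplying k copies of A, starting from the identity."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, A)
    return result
```

The diagonalization route needs only two matrix products for any $k$, whereas repeated multiplication needs $k$ of them; that is the practical appeal of formula (11).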

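Example 3. Each year 1/10 of the people outside California move in and 2/10 of the people inside California move out. Suppose initially there are $x_0$ people outside and $y_0$ people inside. What is the number of people inside and outside after $k$ years? What happens as $k \to \infty$?

At the end of the $(k+1)$st year, we have

$$\begin{aligned} x_{k+1} &= \tfrac{9}{10}x_k + \tfrac{2}{10}y_k \\ y_{k+1} &= \tfrac{1}{10}x_k + \tfrac{8}{10}y_k \end{aligned}$$

This system can be written in the form $\mathbf{z}_{k+1} = A\mathbf{z}_k$ where

$$\mathbf{z}_k = \begin{bmatrix} x_k \\ y_k \end{bmatrix},\qquad A = \begin{bmatrix} \tfrac{9}{10} & \tfrac{2}{10} \\ \tfrac{1}{10} & \tfrac{8}{10} \end{bmatrix}.$$

The eigenvalues and eigenvectors are

$$\lambda_1 = 1,\ \mathbf{v}_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix},\qquad \lambda_2 = \tfrac{7}{10},\ \mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$

Solving by either of the methods discussed above we find that

$$\mathbf{z}_k = \frac{x_0 + y_0}{3}\begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{-x_0 + 2y_0}{3}\left(\frac{7}{10}\right)^k\begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$

As $k \to \infty$, we know that $(7/10)^k \to 0$, thus

$$\lim_{k\to\infty}\mathbf{z}_k = \frac{x_0 + y_0}{3}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \end{bmatrix}(x_0 + y_0).$$

Thus, in the long run, there will be 2/3 of the total population outside and 1/3 inside. This is true no matter what the initial distribution of people inside and outside of California may have been. The limiting vector is just an eigenvector of the matrix corresponding to the eigenvalue 1, that is

$$\begin{bmatrix} \tfrac{9}{10} & \tfrac{2}{10} \\ \tfrac{1}{10} & \tfrac{8}{10} \end{bmatrix}\begin{bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \end{bmatrix} = 1\begin{bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \end{bmatrix}.$$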


If initially there were 2/3 of the people outside and 1/3 inside, the population distribution would never change.
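The long-run behaviour in Example 3 can be observed by simulating the yearly update directly. The following is a minimal sketch under the example's migration rates (9/10 and 2/10 in the first row, 1/10 and 8/10 in the second); the function names and the sample populations used below are assumptions made for illustration. The closed form is written out component by component from the steady term $(x_0 + y_0)/3$ and the decaying term $(-x_0 + 2y_0)/3 \cdot (7/10)^k$.

```python
from fractions import Fraction

A = [[Fraction(9, 10), Fraction(2, 10)],
     [Fraction(1, 10), Fraction(8, 10)]]


def population_after(k, outside, inside):
    """Apply the yearly update z_{j+1} = A z_j a total of k times."""
    z = [Fraction(outside), Fraction(inside)]
    for _ in range(k):
        z = [A[0][0] * z[0] + A[0][1] * z[1],
             A[1][0] * z[0] + A[1][1] * z[1]]
    return z


def closed_form(k, outside, inside):
    """Evaluate the eigenvector expansion of z_k component by component."""
    x0, y0 = Fraction(outside), Fraction(inside)
    steady = (x0 + y0) / 3                                 # multiplies (2, 1)
    transient = (-x0 + 2 * y0) / 3 * Fraction(7, 10) ** k  # multiplies (-1, 1)
    return [2 * steady - transient, steady + transient]
```

Starting from the hypothetical extreme of 0 people outside and 300 inside, `population_after(60, 0, 300)` is already very close to 200 outside and 100 inside, the 2/3 and 1/3 split of the total.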

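Example 4. Solve the 3rd order difference equation

$\Delta$E: $x_{n+3} + 6x_{n+2} + 11x_{n+1} + 6x_n = 0,\quad n = 0, 1, 2, \dots$
IC: $x_0 = 1,\ x_1 = 0,\ x_2 = 0$.

We reduce this to a system of three first order equations. Let $x_{n+1} = y_n$, $y_{n+1} = z_n$; then we have

$$\begin{aligned} x_{n+1} &= y_n \\ y_{n+1} &= z_n \\ z_{n+1} &= y_{n+2} = x_{n+3} = -6x_{n+2} - 11x_{n+1} - 6x_n \\ &= -6x_n - 11y_n - 6z_n \end{aligned}$$

or

$$\mathbf{w}_{n+1} = A\mathbf{w}_n$$

where

$$\mathbf{w}_n = \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix},\qquad A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -6 & -11 & -6 \end{bmatrix},\qquad \mathbf{w}_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.$$

The solution is

$$\mathbf{w}_n = A^n\mathbf{w}_0,\qquad n = 0, 1, 2, \dots$$

If $A$ has eigenvalues $\lambda_1, \lambda_2, \lambda_3$ and corresponding linearly independent eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, we have

$$\mathbf{w}_0 = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \alpha_3\mathbf{v}_3, \tag{$*$}$$

$$\mathbf{w}_n = A^n\mathbf{w}_0 = A^n(\alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \alpha_3\mathbf{v}_3) = \alpha_1\lambda_1^n\mathbf{v}_1 + \alpha_2\lambda_2^n\mathbf{v}_2 + \alpha_3\lambda_3^n\mathbf{v}_3.$$

In the last section, we found that the eigenvalues were $\lambda_1 = -1$, $\lambda_2 = -2$, $\lambda_3 = -3$ and the corresponding eigenvectors were

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix},\qquad \mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix},\qquad \mathbf{v}_3 = \begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}.$$

Equation ($*$) becomes

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \alpha_1\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + \alpha_2\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} + \alpha_3\begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}.$$

Solving, we find $\alpha_1 = 3$, $\alpha_2 = -3$, $\alpha_3 = 1$. Thus the solution of the vector equation is

$$\mathbf{w}_n = 3(-1)^n\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} - 3(-2)^n\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} + (-3)^n\begin{bmatrix} 1 \\ -3 \\ 9 \end{bmatrix}.$$

Since all we want is $x_n$, the first component of $\mathbf{w}_n$, the solution of our problem is

$$x_n = 3(-1)^n - 3(-2)^n + (-3)^n,\qquad n = 0, 1, 2, \dots$$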
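The closed form found in Example 4 can be checked against the recurrence itself. This is a small sketch for illustration; the function names `by_recurrence` and `closed_form` are hypothetical. It generates the sequence from $x_{n+3} = -6x_{n+2} - 11x_{n+1} - 6x_n$ with $x_0 = 1$, $x_1 = 0$, $x_2 = 0$ and compares it with $3(-1)^n - 3(-2)^n + (-3)^n$.

```python
def by_recurrence(n_max):
    """Return [x_0, ..., x_{n_max}] generated from the third order recurrence."""
    xs = [1, 0, 0]  # initial conditions x_0, x_1, x_2
    while len(xs) <= n_max:
        xs.append(-6 * xs[-1] - 11 * xs[-2] - 6 * xs[-3])
    return xs[:n_max + 1]


def closed_form(n):
    """Evaluate x_n = 3(-1)^n - 3(-2)^n + (-3)^n."""
    return 3 * (-1) ** n - 3 * (-2) ** n + (-3) ** n
```

Both approaches give $x_3 = -6$, as follows directly from the recurrence with the given initial values.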


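Example 5. Solve $\mathbf{z}_{k+1} = A\mathbf{z}_k$ where $A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$.

The solution is $\mathbf{z}_k = A^k\mathbf{z}_0$. If the eigenvalues are $\lambda_1, \lambda_2$ and the corresponding eigenvectors are $\mathbf{v}_1, \mathbf{v}_2$ and are linearly independent, the solution is

$$\mathbf{z}_k = \alpha_1\lambda_1^k\mathbf{v}_1 + \alpha_2\lambda_2^k\mathbf{v}_2.$$

We find that the eigenvalues are

$$\lambda_1 = 1 + i = \sqrt{2}\,e^{i\pi/4},\qquad \lambda_2 = 1 - i = \sqrt{2}\,e^{-i\pi/4}.$$

The corresponding eigenvectors are

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ i \end{bmatrix},\qquad \mathbf{v}_2 = \begin{bmatrix} 1 \\ -i \end{bmatrix}.$$

Thus $\mathbf{z}_k$ is

$$\mathbf{z}_k = \alpha_1(\sqrt{2})^k e^{ik\pi/4}\begin{bmatrix} 1 \\ i \end{bmatrix} + \alpha_2(\sqrt{2})^k e^{-ik\pi/4}\begin{bmatrix} 1 \\ -i \end{bmatrix}. \tag{$*$}$$

This is the general complex valued solution. We would like to get the solution in real form. Let the components of $\mathbf{z}_k$ be $x_k, y_k$. From equation ($*$) we get

$$\begin{aligned} x_k &= 2^{k/2}\left(\alpha_1 e^{ik\pi/4} + \alpha_2 e^{-ik\pi/4}\right) \\ y_k &= 2^{k/2}\left(i\alpha_1 e^{ik\pi/4} - i\alpha_2 e^{-ik\pi/4}\right). \end{aligned}$$

Using Euler's forms we find

$$\begin{aligned} x_k &= 2^{k/2}\left((\alpha_1 + \alpha_2)\cos\frac{k\pi}{4} + i(\alpha_1 - \alpha_2)\sin\frac{k\pi}{4}\right) \\ y_k &= 2^{k/2}\left(i(\alpha_1 - \alpha_2)\cos\frac{k\pi}{4} - (\alpha_1 + \alpha_2)\sin\frac{k\pi}{4}\right). \end{aligned}$$

Let $c_1 = \alpha_1 + \alpha_2$ and $c_2 = i(\alpha_1 - \alpha_2)$, and we obtain

$$\begin{aligned} x_k &= 2^{k/2}\left(c_1\cos\frac{k\pi}{4} + c_2\sin\frac{k\pi}{4}\right) \\ y_k &= 2^{k/2}\left(c_2\cos\frac{k\pi}{4} - c_1\sin\frac{k\pi}{4}\right). \end{aligned}$$

If real initial conditions are given for $x_0, y_0$ then $c_1, c_2$ will be real. If we allow $c_1, c_2$ to be arbitrary real numbers, this is the general real solution.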
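The real form of Example 5 can be compared with direct iteration. The sketch below is an illustration under the assumption that $A$ has rows $(1, 1)$ and $(-1, 1)$, so one step is $x \mapsto x + y$, $y \mapsto -x + y$; the function names `iterate` and `real_form` are hypothetical. Putting $k = 0$ in the real solution gives $c_1 = x_0$ and $c_2 = y_0$, which is what the code uses.

```python
import math


def iterate(k, x0, y0):
    """Apply z_{j+1} = A z_j k times for A with rows (1, 1) and (-1, 1)."""
    x, y = x0, y0
    for _ in range(k):
        x, y = x + y, -x + y
    return x, y


def real_form(k, x0, y0):
    """Evaluate the real solution with c1 = x0 and c2 = y0 (from k = 0)."""
    c1, c2 = x0, y0
    r = 2 ** (k / 2)          # modulus sqrt(2)^k of the eigenvalue powers
    theta = k * math.pi / 4   # argument k*pi/4 of the eigenvalue powers
    return (r * (c1 * math.cos(theta) + c2 * math.sin(theta)),
            r * (c2 * math.cos(theta) - c1 * math.sin(theta)))
```

Because the trigonometric form is evaluated in floating point, the two results agree up to rounding error rather than exactly.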

Exercises 3.16

1. Find the solution of $\mathbf{x}_{k+1} = A\mathbf{x}_k$, $k = 0, 1, 2, \dots$ in each of the following cases (use the method of Example 1).

a. A =[

3 32 4

], x0 =

[1

3

]. b. A =

[3 12 0

], x0 =

[12

]

2. Solve problems 1a and 1b using the diagonalization method.


3. Find the general solution of $\mathbf{x}_{k+1} = A\mathbf{x}_k$ if

a. A =

3 1 0

1 2 10 1 3

b. A =

2 1 11 2 11 1 2

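4. If $A$ is $3 \times 3$ and $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ are nonzero column vectors such that $A\mathbf{u} = 2\mathbf{u}$, $A\mathbf{v} = \mathbf{v}$, $A\mathbf{w} = 3\mathbf{w}$:

a. Find the general solution of $\mathbf{x}_{k+1} = A\mathbf{x}_k$.

b. Find the solution of $\mathbf{x}_{k+1} = A\mathbf{x}_k$, $\mathbf{x}_0 = \mathbf{u} - 5\mathbf{w}$.

5. Solve $\mathbf{x}_{k+1} = A\mathbf{x}_k$ where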

A =[

1 11 1

], x0 =

[1

1

].

6. Solve, using matrix methods,

$\Delta$E: $x_{k+3} - 4x_{k+2} - x_{k+1} + 4x_k = 0$
IC: $x_0 = 1,\ x_1 = 0,\ x_2 = 1$.