Introduction to Chemical Engineering Processes/Numerical Root Finding Methods

Basics of Rootfinding
Rootfinding is the determination of solutions to single-variable equations or to systems of n equations in n unknowns (provided that such solutions exist). The basics of the method revolve around the determination of roots

A root of a function $$ F(x_1,x_2,...) $$ in any number of variables is defined as the solution to the equation $$ F(x_1,x_2,...) = 0 $$. In order to use any of the numerical methods in this section, the equation should be put in a specific form, and this is one of the more common ones, used for all methods except the iterative method.

However, it is easy to put a function into this form. If you start with an equation of the form:

$$ F_1(x_1,x_2,...) = F_2(x_1,x_2,...) $$

then subtracting $$ F_2 $$ will yield the required form. Do not forget to do this, even if there is only a constant on one side!

Since any equation can be put into this form, the methods can potentially be applied to any function, though they work better for some functions than others.

Analytical vs. Numerical Solutions
An analytical solution to an equation or system is a solution which can be arrived at exactly using some mathematical tools. For example, consider the function $$ y = ln(x) $$, graphed below.



The root of this function is, by convention, when $$ y = 0 $$, or when this function crosses the x-axis. Hence, the root will occur when $$ ln(x) = 0 \rightarrow x = e^0 = 1 $$

The answer x=1 is an analytical solution because through the use of algebra, we were able to come up with an exact answer.

On the other hand, attempting to solve an equation like:

$$ -x = ln(x) $$

analytically is sure to lead to frustration because it is not possible with elementary methods. In such a case it is necessary to seek a numerical solution, in which guesses are made until the answer is "close enough", but you'll never know what the exact answer is.

All that the numerical methods discussed below do is give you a systematic method of guessing solutions so that you'll be likely (and in some cases guaranteed) to get closer and closer to the true answer. The problem with numerical methods is that most are not guaranteed to work without a good enough initial guess. Therefore, it is valuable to try a few points until you get somewhere close and then start with the numerical algorithm to get a more accurate answer. They are roughly in order from the easiest to use to the more difficult but faster-converging algorithms.

Iterative solution
Iterative solutions in their purest form will solve the desired function so that it is in the form:

$$ x = f(x) $$

Then, a value for x is guessed, and f(x) is calculated. The new value of x is then re-inserted into f(x), and the process is repeated until the value of x changes very little.

The following example illustrates this procedure.

This method has some rather severe limitations as we'll see in this example:

This example shows that the success of the iteration method strongly depends on the properties of the function on the right-hand side. In particular, it has to do with how large the slope of the function is at the root. If the slope is too large, the method will not converge, and even if it is small the method converges slowly. Therefore, it is generally undesirable to use this method, though some more useful algorithms are based on it (which is why it is presented here).

Iterative Solution with Weights
Although the iterative solution method has its downfalls, it can be drastically improved through the use of averaging. In this method, the function is still solved for x in the form:

$$ x = f(x) $$

From the initial guess $$ x_0 $$, the function f(x) is used to generate the second guess $$ x_1 $$. However, rather than simply putting $$ x_1 $$ into f(x), a weighted average of $$ x_0 $$ and $$ x_1 $$ is made:

$$ x_1(New)= \alpha*x_0 + (1-\alpha)*x_1(old), 0 \le \alpha \le 1 $$

The term $$ \alpha $$ is called the weight. The most common value of the weight is one-half, in which case the next value to plug into f(x) is simply the average of $$ x_0 $$ and $$ x_1(old) $$:

This new value is then plugged into f(x), averaged with the result, and this is repeated until convergence.

The following examples show that this method converges faster and with more reliability than normal iterative solution.

The method is not only faster-converging but also more stable, so that it can actually be used solving the equation the other way too.

Notice that in this case, if we use regular iteration the result only converged if the equation was solved in a certain way. Using weighted iteration, it is possible to solve it either way and obtain a solution, but one way is clearly faster than the other. However, weighting will accelerate the algorithm in most cases and is relatively easy to implement, so it is a worthwhile method to use.

Bisection Method
Let us consider an alternative approach to rootfinding. Consider a function f(x) = 0 which we desire to find the roots of. If we let a second variable $$ y = f(x) $$, then y will (almost always) change sign between the left-hand side of the root and the right-hand side. This can be seen in the above picture of $$ y = ln(x) $$, which changes from negative to the left of the root $$ x = 1 $$ to positive to its right.

The bisection method works by taking the observation that a function changes sign between two points, and narrowing the interval in which the sign change occurs until the root contained within is tightly enclosed. This only works for a continuous function, in which there are no jumps or holes in the graph, but a large number of commonly-used functions are like this including logarithms (for positive numbers), sine and cosine, and polynomials.

As a more formalized explanation, consider a function $$ y = f(x) $$ that changes sign between $$ x = a $$ and $$ x = b $$ We can narrow the interval by:


 * 1) Evaluating the function at the midpoint
 * 2) Determining whether the function changes signs or not in each sub-interval
 * 3) If the continuous function changes sign in a sub-interval, that means it contains a root, so we keep the interval.
 * 4) If the function does not change sign, we discard it. This can potentially cause problems if there are two roots in the interval,so the bisection method is not guaranteed to find ALL of the roots.

Though the bisection method is not guaranteed to find all roots, it is guaranteed to find at least one if the original endpoints had opposite signs.

The process above is repeated until you're as close as you like to the root.

Note that convergence is slow but steady with this method. It is useful for refining crude approximations to something close enough to use a faster but non-guaranteed method such as weighted iteration.

Regula Falsi
The Regula Falsi method is similar the bisection method. You must again start with two x values between which the function f(x) you want to find the root of changes. However, this method attempts to find a better place than the midpoint of the interval to split it.It is based on the hypothesis that instead of arbitrarily using the midpoint of the interval as a guide, we should do one extra calculation to try and take into account the shape of the curve. This is done by finding the secant line between two endpoints and using the root of that line as the splitting point.

More formally:


 * Draw or calculate the equation for the line between the two endpoints (a,f(a)) and (b,f(b)).
 * Find where this line intersects the x-axis (or when y = 0), giving you x = c
 * Use this x value to evaluate the function, giving you f(c)
 * The sub-intervals are then treated as in the bisection method. If the sign changes between f(a) and f(c), keep the interval; otherwise, throw it away. Do the same between f(c) and f(b).
 * Repeat until you're at a desired accuracy.

Use these two formulas to solve for the secant line y = mx + B:

The regula falsi method is guaranteed to converge to a root, but it may or may not be faster than the bisection method, depending on how long it takes to calculate the slope of the line and the shape of the function.

In some cases, the regula falsi method will take longer than the bisection method, depending on the shape of the curve. However, it generally worth trying for a couple of iterations due to the drastic speed increases possible.

Tangent Method (Newton's Method)
In this method, we attempt to find the root of a function y = f(x) using the tangent lines to functions. This is similar to the secant method, except it "cuts loose" from the old point and only concentrates on the new one, thus hoping to avoid hang-ups such as the one experienced in the example.

Since this class assumes students have not taken calculus, the tangent will be approximated by finding the equation of a line between two very close points, which are denoted (x) and $$ (x+\delta x) $$. The method works as follows:


 * 1) Choose one initial guess, $$ x_1 $$
 * 2) Evaluate the function f(x) at $$ x = x_1 $$ and at $$ x = x_1 + \delta x $$ where $$ \delta x $$ is a small number. These yield two points on your (approximate) tangent line.
 * 3) Find the equation for the tangent line using the formulas given above.
 * 4) Find the root of this line. This is $$ x_2 $$
 * 5) Repeat steps 2-4 until you're as close as you like to the root.

This method is not guaranteed to converge unless you start off with a good enough first guess, which is why the guaranteed methods are useful for generating one. However, since this method, when it converges, is much faster than any of the others, it is preferable to use if a suitable guess is available.