The purpose of this section is to discuss the nature of
inverse problems and approaches to their solution.
There are a wide range of methods
for solving inverse problems, two of which
are particularly
considered in this section:
the Tikhonov method because of its importance and
the fact that it helps us illustrate the general approach to
inversion methods and the method of eigenfunction expansion which
will facilitate the understanding of the operator-splittimg method
that will be introduced in section 4.
The nature of the diffusion problem and its inverse is illustrated
in figure 1. The set X is the domain of K and Y is the range,
where X and Y are subsets of normed linear spaces.
The arrow labelled (a) shows the well-posedness of the forward operation;
any f Î X maps onto a unique g Î Y (K is injective).
The arrows labelled
(b) illustrate the non-uniqueness in the inverse problem;
there is a set of solutions { f : K f=g } (K is generally not
surjective). The arrow
labelled (c) shows that a given g Ï Y, for example the g
may be a member of Y, but for the presence of noise. In this
final case no f Î X exists such that Kf=g.
It is illuminating to consider the analogies between the
solution of inverse problems and the problem of solving the linear
algebraic equation where the matrix is
rank deficient (singular). For example consider the following
equation:
é ê
ê ë
1
0
1
0
ù ú
ú û
x = b .
(3)
For any given x*, matrix multiplication can be used to obtain
the unique forward solution b*. For example x*=[1,2]T Þb*=[1,1]T. However, when b is known and x
is sought the fact that the matrix is singular means that there
may be no solution and if there is a solution it is not unique.
For example setting b*=[1,1]T Þ x*=[1,a]T
where a takes any value, the solution is non-unique. The
fact that the second component of x* does not effect the
result of the forward operation is equivalent to the loss of
information in the forward operator discussed earlier.
Furthermore, if the vector b is perturbed, as it would be if the
data were affected by noise, then there may be no solution x.
For example if b = [1.01,0.99]T then there is no
corresponding solution x.
In developing a method for solving inverse problems the
aim is to formulate the problem in such a way as to guarantee a
unique solution (to the formulated problem).
To ensure existence of the solution, the data
g could be smoothed to bring it into the range
of K. Another approach is to widen the set of feasible solutions
to { f : || K f - g || < e}
where e( > 0) is a measure of the noise in g.
To ensure uniqueness, a typical approach is seek an optimal solution
in the space of feasible solutions.
Examples of optimal solutions
that are commonly used are minimum norm, minimum energy or
maximum entropy. In the minimum norm example the inverse solution
is defined as follows:
minf Î X { ||f|| : || K f - g || < e} .
(4)
The minimum norm method can be regarded as a constrained optimization
problem. Using Lagrange multipliers and dropping the constant e
allows us to substitute the problem (4) by the problem
of minimizing the functional
||f||2 + m||K f-g||2
(5)
and taking the solution limit as
m® ¥.
The second term in (5) enforces the constraint as a penalty
function. Usually m is fixed as a parameter (i.e. the limit
m® ¥ is not taken) and the minimum found
is taken as the inverse solution. Although some work is reported in the
area, the selection of the value of m is not straightforward
in practice.
Reformulating the inverse problem in the form (5)
is an example of Tikhonov regularization wherein a linear sum
of the error functional ||K f-g||2 and a
smoothing functional (||f||2 in (5))
is minimised in order to determine the inverse solution.
Many inversion methods are based on the
general approach of minimising the combination of an error
and smoothing functionals. A discussion of such methods is given in
Chapter 18 of Press et al [11].
The matrix equivalent of the Tikhonov regularization method
(5) is the minimum norm solution of the
rank deficient least squares problem (see section 5.5 of [3]).
This can be illustrated through returning to the matrix equation
(3). If b = [1,1]T then the solution x
of the range of solutions [1,a]T
with minimum norm is x¢ = [1,0]T.
The minimum norm solution of the matrix equation (3)
illustrates a fundamental property of the
solution of inverse problems that does not seem to be pointed
out in the literature and it is highly important to our
understanding and analysis of inversion methods. It is that if the
forward operator K is applied to a given function
f to return a function g then the inversion method is applied
to g, we should not expect the original function f
to be returned.
In the simple matrix example (3), the
original x=[1,2]T but the minimum norm inverse
solution is x=[1,0]T.
In order to develop the method of partial eigenvalue expansion,
it must first be assumed that K is a linear operator. Let
{li,fi: i=1,2,... }
be the ordered set of eigenfunctions of K. Hence given any g
we may write
g(x) =
¥ å
i=1
ai fi(x) ,
or for any e > 0. Let g be approximated using its first
m eigenfunctions so that
g(x) =
m å
i=1
ai fi(x) + d(x) ,
where ||d|| < e.
Now putting
f =
m å
i=1
ai
li
fi
ensures that f is a feasible solution of ||K f-g|| < e.
The truncation of the eigenvalue expansion of g automatically
tends the method toward the selection of a smooth solution.
In effect, the inversion methods are developed through substituting
the original K f=g equation by a nearby problem that has
a unique solution. Accounting for noise in g using (4)
widens the set of feasible solutions. Through seeking an optimal
solution in this set or a solution that can be written as
a finite eigenvalue expansion, a deliberate bias has been included so
that the resulting solution will tend to have the desirable properties
that have been determined a priori. In practice, the
original f will not be known. In a mathematically determined test problem,
it is natural to start from an original f, and compare the
inverse solution to it. However, the presence of the original f
is also a priori information that biases the observer to expect
one inverse solution rather than another.
If the function g has been generated
from an original function f then that f is not generally the
solution of the nearby problem and we should not expect the
numerical inverse solution to converge to it or necessarily regard
the recovery of the original f as the aim of the inversion method.