A system of linear equations can be written in matrix notation as
$$Ax = b, \qquad (4.6)$$
where $A$ is an $n \times n$ coefficient matrix, $x$ the vector of $n$ unknowns, and $b$ the right-hand side vector.
There are two basic classes of methods for solving system (4.6). The first class is represented by direct methods, which theoretically give an exact solution in a (predictable) finite number of steps. Unfortunately, this does not have to hold in computational practice due to rounding errors: an error made in one step propagates through all subsequent steps. Classical direct methods are discussed in this section. Moreover, solving an equation system by means of matrix decompositions, as discussed in Sect. 4.1, can be classified as a direct method as well. The second class consists of iterative methods, which construct a series of solution approximations that (under some assumptions) converges to the solution of the system. Iterative methods are discussed in Sect. 4.3. Finally, note that some methods lie on the borderline between the two classes; for example, gradient methods (Sect. 4.3.5) and iterative refinement (Sect. 4.2.2).
Further, the direct methods discussed in this section are not
necessarily optimal for an arbitrary system (4.6). Let us
deal with the main exceptions. First, even if a unique solution
exists, numerical methods can fail to find it: if the number
of unknown variables is large, rounding errors can accumulate and
result in a wrong solution. This applies in particular to systems with
a nearly singular coefficient matrix. One alternative is to use
iterative methods (Sect. 4.3), which are less sensitive to
these problems. Another approach is to use the QR or SVD
decompositions (Sect. 4.1), which can transform some
nearly singular problems to nonsingular ones. Second, very large
problems comprising hundreds or thousands of equations and unknown
variables may be too time-consuming to solve by standard direct
methods. On the other hand, their coefficient matrices are often
sparse, that is, most of their elements are zeros. Special strategies
to store and solve such problems are discussed in
Sect. 4.5.
To conclude these remarks, let us mention a close relation between
solving the system (4.6) and computing the inverse matrix $A^{-1}$:
the solution of (4.6) can be expressed as $x = A^{-1}b$, and
conversely, solving the $n$ systems $Ax_j = e_j$, $j = 1, \ldots, n$,
where $e_j$ denotes the $j$th column of the identity matrix $I_n$,
yields the columns of $A^{-1}$.
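This relation can be illustrated numerically. The following sketch (assuming NumPy is available; the matrix and vector are illustrative data, not from the text) obtains $A^{-1}$ by solving the system with the columns of $I_n$ as right-hand sides:

```python
import numpy as np

# Illustrative 2x2 example (not from the text).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Solving the n systems A x_j = e_j simultaneously, with the identity
# matrix as right-hand side, yields the columns of A^{-1}.
A_inv = np.linalg.solve(A, np.eye(2))

# Conversely, the solution of A x = b equals A^{-1} b.
b = np.array([1.0, 5.0])
print(np.allclose(A_inv @ A, np.eye(2)))                  # A_inv is A^{-1}
print(np.allclose(A_inv @ b, np.linalg.solve(A, b)))      # x = A^{-1} b
```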
In the rest of this section, we concentrate on Gauss-Jordan elimination (Sect. 4.2.1) and its modifications and extensions, such as iterative refinement (Sect. 4.2.2). A wealth of information on direct methods can be found in the monographs [3,15] and [24].
In this subsection, we will simultaneously solve the linear systems
$$Ax_1 = b_1, \; \ldots, \; Ax_k = b_k$$
and the matrix equation $AX = I_n$, whose solution $X$ is the inverse matrix $A^{-1}$. These can be written compactly as
$$A \, [x_1, \ldots, x_k, X] = [b_1, \ldots, b_k, I_n]. \qquad (4.7)$$
The Gauss-Jordan elimination (GJ) is based on elementary operations that do not affect the solution of an equation system. The solution of (4.7) will not change if we perform any of the following operations:
1. interchanging two rows (equations);
2. multiplying a row by a nonzero constant;
3. adding a multiple of one row to another row;
4. interchanging two columns of $A$, provided the corresponding components of the solution are interchanged accordingly.
GJ is a technique that applies one or more of these elementary
operations to (4.7) so that $A$ becomes the identity matrix
$I_n$. Simultaneously, the right-hand side becomes the set of
solutions. Denoting by $R_i$, $i = 1, \ldots, O$, the matrix
corresponding to the $i$th row operation, the combination of all $O$
operations has to constitute the inverse, $R_O \cdots R_1 = A^{-1}$,
and hence the right-hand side becomes
$A^{-1}[b_1, \ldots, b_k, I_n] = [x_1, \ldots, x_k, A^{-1}]$.
The exact choice of these elementary operations is described in the
following paragraph.
Let us now discuss several well-known variants of the Gauss-Jordan
elimination. GJ without pivoting does not interchange any rows or
columns; only multiplication and addition of rows are
permitted. First, nonzero nondiagonal elements in the first column
are eliminated: the first row of (4.7) is divided by its diagonal
element $a_{11}$, and the $a_{i1}$-multiple of the modified first row
is subtracted from the $i$th row, $i = 2, \ldots, n$. We can proceed
the same way for all $n$ columns of $A$ and thus transform $A$ to the
identity matrix $I_n$. It is easy
to see that the method fails if the diagonal element in a column to be
eliminated, the so-called pivot, is zero in some step. Even if this is
not the case, one should be aware that GJ without pivoting is
numerically unstable.
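The elimination just described can be sketched as follows; this is a minimal illustration assuming NumPy, with a function name of our choosing, and it indeed breaks down (division by zero) whenever a pivot vanishes:

```python
import numpy as np

def gauss_jordan(A, B):
    """GJ without pivoting: transform [A | B] so that A becomes the
    identity; B then holds the solutions. Fails on a zero pivot."""
    M = np.hstack([A.astype(float), B.astype(float)])
    n = A.shape[0]
    for j in range(n):
        M[j] /= M[j, j]                 # scale row j so the pivot equals 1
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]  # eliminate column j in all other rows
    return M[:, n:]

# Illustrative data (not from the text).
A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([[1.0], [5.0]])
x = gauss_jordan(A, b)
print(np.allclose(A @ x, b))  # True
```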
On the other hand, the GJ method becomes stable when pivoting is used. This means that one may interchange rows (partial pivoting) or rows and columns (full pivoting) to move a suitable matrix element into the position of the current pivot. Since it is desirable to keep the already constructed part of the identity matrix, only rows below and columns to the right of the current pivot are considered. GJ with full pivoting is numerically stable. From the application point of view, GJ with partial pivoting is numerically stable too, although there are artificial examples where it fails. Additionally, the advantage of partial pivoting (compared to full pivoting) is that it does not change the order of the solution components.
There are various strategies to choose a pivot. A very good choice is the largest available element (in absolute value). This procedure, however, depends on the original scaling of the equations. Implicit pivoting takes scaling into account and chooses a pivot as if the original system were rescaled so that the largest element of each row were equal to one.
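The pivot selection with implicit scaling can be sketched as follows (a hypothetical helper of our own, assuming elimination proceeds column by column): among the remaining rows, pick the one whose candidate pivot is largest relative to the largest absolute element of its original row.

```python
import numpy as np

def choose_pivot_row(M, col, scale):
    """Implicit partial pivoting: among rows col..n-1, pick the row
    whose element in `col` is largest after dividing by that row's
    scale factor (the largest absolute element of the original row)."""
    candidates = np.abs(M[col:, col]) / scale[col:]
    return col + int(np.argmax(candidates))

# Illustrative data: the rows are scaled very differently.
A = np.array([[2.0, 20000.0],
              [1.0,     1.0]])
scale = np.abs(A).max(axis=1)   # largest element of each row: [20000, 1]

# Plain partial pivoting would pick row 0 (|2| > |1|); implicit
# pivoting recognizes that 2 is tiny relative to row 0 and picks row 1.
print(choose_pivot_row(A, 0, scale))  # 1
```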
Finally, let us add several concluding remarks on the efficiency of GJ
and its relationship to matrix decompositions. As shown, GJ can
efficiently solve problems with multiple right-hand sides known in
advance and compute $A^{-1}$ at the same time. On the other hand, if
it is necessary to solve later a new system with the same coefficient
matrix $A$ but a new right-hand side $b$, one has to start the whole
elimination process again, which is time-consuming, or compute
$x = A^{-1}b$ using the previously computed inverse matrix $A^{-1}$,
which leads to further error accumulation. In practice, one should
prefer matrix decompositions, which do not have this
drawback. Specifically, the LU decomposition (Sect. 4.1.2) is
equivalent to GJ (with the same kind of pivoting applied in both
cases) and allows us to repeatedly solve systems with the same
coefficient matrix in an efficient way.
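With SciPy, for example, this reuse of a factorization is straightforward (`lu_factor`/`lu_solve` implement LU decomposition with partial pivoting; the data below are illustrative):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 1.0], [2.0, 3.0]])
lu, piv = lu_factor(A)        # factorize A once, O(n^3)

# Any number of right-hand sides can now be solved cheaply, O(n^2)
# each, without repeating the elimination or forming A^{-1} explicitly.
b1 = np.array([1.0, 5.0])
b2 = np.array([2.0, -1.0])
x1 = lu_solve((lu, piv), b1)
x2 = lu_solve((lu, piv), b2)
print(np.allclose(A @ x1, b1) and np.allclose(A @ x2, b2))  # True
```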
In the introduction to Sect. 4.2, we noted that direct methods are
rather sensitive to rounding errors. Iterative refinement offers a way
to improve the solution obtained by any direct method, unless the
system matrix $A$ is too ill-conditioned or even singular.
Let $x_0$ denote an initially computed (approximate) solution
of (4.6). Iterative refinement is a process constructing a series
$x_i$, $i = 1, 2, \ldots$, as described in Algorithm 7. First, given a
solution $x_i$, the residuum $r_i = b - Ax_i$ is computed. Then, one
obtains the correction $\Delta x_i$ by solving the original system
with the residuum $r_i$ on the right-hand side,
$A \Delta x_i = r_i$, and sets $x_{i+1} = x_i + \Delta x_i$.
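Assuming the system is solved via an LU factorization (here with SciPy; the function name and data are ours), the refinement loop can be sketched as:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def refine(A, b, n_iter=3):
    """Iterative refinement sketch: x_{i+1} = x_i + dx_i, where
    A dx_i = r_i and r_i = b - A x_i."""
    lu_piv = lu_factor(A)            # factorization computed once
    x = lu_solve(lu_piv, b)          # initial solution x_0
    for _ in range(n_iter):
        r = b - A @ x                # residuum (ideally in higher precision)
        dx = lu_solve(lu_piv, r)     # correction, reusing the factorization
        x = x + dx
    return x

# Illustrative data (not from the text).
A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([1.0, 5.0])
x = refine(A, b)
print(np.allclose(A @ x, b))  # True
```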
It is reasonable to carry out the computation of the residuals $r_i$ in a higher precision, because the subtraction $b - Ax_i$ of two nearly equal quantities leads to a loss of significant digits.
Additionally, an important feature of iterative refinement is its low
computational cost. Provided that the system is solved by means of
decompositions (e.g., GJ is implemented as the LU decomposition), a
factorization of $A$ is already available after computing the initial
solution $x_0$. Subsequently, solving any system with the same
coefficient matrix $A$, such as $A \Delta x_i = r_i$, can be done fast
and efficiently, and the computational costs of iterative refinement
are small.