
4.4 Eigenvalues and Eigenvectors

In this section, we deal with methods for computing eigenvalues and eigenvectors of a matrix $ \boldsymbol{A}\in {\mathbb{R}}^{n\times n}$. First, we discuss a simple power method for computing one or a few eigenvalues (Sect. 4.4.1). Next, we concentrate on methods performing the complete eigenanalysis, that is, finding all eigenvalues (the Jacobi, QR, and LR methods in Sects. 4.4.2-4.4.5). Finally, we briefly describe a way to improve an already computed eigenvalue and to find the corresponding eigenvector (Sect. 4.4.6). Additionally, note that eigenanalysis can also be done by means of the SVD, see Sect. 4.1.4. For more details on the described methods, as well as on some other ones, one can consult the monographs [15,24,52] and [60].

Before discussing specific methods, let us describe the principle common to most of them. We assume that $ \boldsymbol{A}\in {\mathbb{R}}^{n\times n}$ has eigenvalues $ \lambda_1,\ldots,\lambda_n$ ordered so that $ \vert\lambda_1\vert\ge\vert\lambda_2\vert\ge\ldots\ge\vert\lambda_n\vert$. To find all eigenvalues, we transform the original matrix $ \boldsymbol{A}$ to a simpler matrix $ \boldsymbol{B}$ that is similar to $ \boldsymbol{A}$ (recall that matrices $ \boldsymbol{A}$ and $ \boldsymbol{B}$ are similar if there is a nonsingular matrix $ \boldsymbol{T}$ such that $ \boldsymbol{B} = \boldsymbol{T}^{-1} \boldsymbol{A} \boldsymbol{T}$). The similarity of $ \boldsymbol{A}$ and $ \boldsymbol{B}$ is crucial since it guarantees that both matrices have the same eigenvalues and that their eigenvectors are related in a simple way: if $ \boldsymbol{g}$ is an eigenvector of $ \boldsymbol{B}$ corresponding to its eigenvalue $ \lambda $, then $ \boldsymbol{T}\boldsymbol{g}$ is an eigenvector of $ \boldsymbol{A}$ corresponding to the same eigenvalue $ \lambda $.
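
To see the relation between the eigenvectors, note that $ \boldsymbol{B} = \boldsymbol{T}^{-1} \boldsymbol{A} \boldsymbol{T}$ implies $ \boldsymbol{A} = \boldsymbol{T} \boldsymbol{B} \boldsymbol{T}^{-1}$, and hence $ \boldsymbol{B}\boldsymbol{g} = \lambda\boldsymbol{g}$ yields

$\displaystyle \boldsymbol{A}(\boldsymbol{T}\boldsymbol{g}) = \boldsymbol{T}\boldsymbol{B}\boldsymbol{T}^{-1}\boldsymbol{T}\boldsymbol{g} = \boldsymbol{T}\boldsymbol{B}\boldsymbol{g} = \lambda(\boldsymbol{T}\boldsymbol{g})\;.$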

There are two basic strategies to construct a matrix $ \boldsymbol{B}$ similar to the original matrix $ \boldsymbol{A}$. First, one can use a series of simple similarity transformations, such as GRs, and eliminate elements of $ \boldsymbol{A}$ one by one (see the Jacobi method, Sect. 4.4.2). This approach is often used to transform $ \boldsymbol{A}$ to its tridiagonal or upper Hessenberg form. (A matrix $ \boldsymbol{B}$ is in upper Hessenberg form if it is upper triangular except for the first subdiagonal; that is, $ B_{ij} = 0$ for $ i > j+1$, where $ i,j=1,\ldots,n$.) Second, one can factorize $ \boldsymbol{A}$ into $ \boldsymbol{A} = \boldsymbol{F}_L \boldsymbol{F}_R$ and switch the order of the factors, $ \boldsymbol{B} = \boldsymbol{F}_R \boldsymbol{F}_L$ (the similarity of $ \boldsymbol{A}$ and $ \boldsymbol{B}$ follows from $ \boldsymbol{B} = \boldsymbol{F}_R \boldsymbol{F}_L = \boldsymbol{F}_L^{-1} \boldsymbol{A} \boldsymbol{F}_L$). This approach is used, for example, by the LR method (Sect. 4.4.5). Finally, there are methods combining both strategies.


4.4.1 Power Method

In its basic form, the power method aims at finding only the eigenvalue $ \lambda_1$ of a matrix $ \boldsymbol{A}$ with the largest absolute value and the corresponding eigenvector. Let us assume that the matrix $ \boldsymbol{A}$ has a dominant eigenvalue ( $ \vert\lambda_1\vert > \vert\lambda_2\vert$) and $ n$ linearly independent eigenvectors.

The power method constructs two series $ c_i$ and $ \boldsymbol{x}_i$, $ i \in {\mathbb{N}}$, that converge to $ \lambda_1$ and to the corresponding eigenvector $ \boldsymbol{g}_1$, respectively. Starting from a vector $ \boldsymbol{x}_0$ that is not orthogonal to $ \boldsymbol{g}_1$, one only has to iteratively compute $ \boldsymbol{A}\boldsymbol{x}_i$ and split it into its norm $ c_{i+1}$ and the normalized vector $ \boldsymbol{x}_{i+1}$, see Algorithm 9. Usually, the Euclidean norm ( $ c_{i+1} = \Vert\boldsymbol{A} \boldsymbol{x}_i\Vert _2$) or the maximum norm ( $ c_{i+1} = \max_{j=1,\ldots,n} \vert(\boldsymbol{A}\boldsymbol{x}_i)_j\vert$) is used.

Algorithm 9    
$ i = 0$
do
     $ \boldsymbol{x}_{i+1} = \boldsymbol{A} \boldsymbol{x}_i$
     $ c_{i+1} = \Vert \boldsymbol{x}_{i+1} \Vert$
     $ \boldsymbol{x}_{i+1} = \boldsymbol{x}_{i+1} / c_{i+1}$
     $ i = i+1$
until a stop criterion holds

Although verifying the validity of these assumptions is far from trivial, one can usually recognize easily from the behaviour of the two constructed series whether the method converges.
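
For illustration, Algorithm 9 can be sketched in NumPy as follows; this is an assumed implementation, and the stopping rule (relative change of $ c_i$) and the tolerance are illustrative choices only.

import numpy as np

def power_method(A, x0, tol=1e-10, max_iter=1000):
    """Power method (Algorithm 9): approximate the dominant eigenvalue and eigenvector of A."""
    x = x0 / np.linalg.norm(x0)          # starting vector, assumed not orthogonal to g_1
    c_old = 0.0
    for _ in range(max_iter):
        y = A @ x                        # A x_i
        c = np.linalg.norm(y)            # c_{i+1} = ||A x_i||_2 (Euclidean norm)
        x = y / c                        # normalized vector x_{i+1}
        if abs(c - c_old) < tol * abs(c):    # assumed stop criterion
            break
        c_old = c
    return c, x

Note that with the Euclidean norm the series $ c_i$ converges to $ \vert\lambda_1\vert$; the sign of $ \lambda_1$ can be recovered, for example, from the Rayleigh quotient $ \boldsymbol{x}_i^{\top}\boldsymbol{A}\boldsymbol{x}_i$.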

Furthermore, the power method can be extended to search for further eigenvalues; for example, the smallest one and the second largest one. First, if $ \boldsymbol{A}$ is nonsingular, we can apply the power method to $ \boldsymbol{A}^{-1}$ to find the smallest eigenvalue $ \lambda_n$ because $ 1/\lambda_n$ is the largest eigenvalue of $ \boldsymbol{A}^{-1}$. Second, if we need more eigenvalues and $ \lambda_1$ is already known, we can use a reduction method to construct a matrix $ \boldsymbol{B}$ that has the same eigenvalues and eigenvectors as $ \boldsymbol{A}$ except for $ \lambda_1$, which is replaced by a zero eigenvalue. To do so, we need to find a (normalized) eigenvector $ \boldsymbol{h}_1$ of $ \boldsymbol{A}^{\top}$ corresponding to $ \lambda_1$ ( $ \boldsymbol{A}$ and $ \boldsymbol{A}^{\top}$ have the same eigenvalues) and to set $ \boldsymbol{B} = \boldsymbol{A} - \lambda_1\boldsymbol{h}_1\boldsymbol{h}_1^{\top}$. Naturally, this process can be repeated to find the third and further eigenvalues.
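
The first extension can be sketched as follows (an assumed implementation using the SciPy routines lu_factor and lu_solve): the product $ \boldsymbol{A}^{-1}\boldsymbol{x}_i$ is computed by solving a linear system in each iteration rather than by forming the inverse explicitly.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def smallest_eigenvalue(A, x0, tol=1e-10, max_iter=1000):
    """Power method applied to A^{-1}: approximates |lambda_n| and the eigenvector g_n."""
    lu_piv = lu_factor(A)                # factorize A once, reuse in every iteration
    x = x0 / np.linalg.norm(x0)
    c_old = 0.0
    for _ in range(max_iter):
        y = lu_solve(lu_piv, x)          # y = A^{-1} x, i.e. solve A y = x
        c = np.linalg.norm(y)            # converges to |1/lambda_n|
        x = y / c
        if abs(c - c_old) < tol * abs(c):
            break
        c_old = c
    return 1.0 / c, x                    # |lambda_n| is approximately 1/c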

Finally, let us mention that the power method can also be used for some matrices without a dominant eigenvalue (e.g., matrices with $ \lambda_1 = \ldots = \lambda_p$ for some $ 1 < p \le n$). For further extensions of the power method see [56], for instance.


4.4.2 Jacobi Method

For a symmetric matrix $ \boldsymbol{A}$, the Jacobi method constructs a series of orthogonal matrices $ \boldsymbol{R}_{i}$, $ i \in {\mathbb{N}}$, such that the matrix $ \boldsymbol{T}_{i} = \boldsymbol{R}^{\top}_{i} \ldots \boldsymbol{R}^{\top}_{1} \boldsymbol{A} \boldsymbol{R}_{1} \ldots \boldsymbol{R}_{i}$ converges to a diagonal matrix $ \boldsymbol{D}$. Each matrix $ \boldsymbol{R}_{i}$ is a GR matrix defined in (4.5), whereby the angle $ \alpha $ is chosen so that one nonzero element $ (\boldsymbol{T}_{i})_{jk}$ becomes zero in $ \boldsymbol{T}_{i+1}$. Formulas for computing $ \boldsymbol{R}_{i}$ given the element $ (j,k)$ to be zeroed are described in [15], for instance. Once the matrix $ \boldsymbol{A}$ is diagonalized this way, the diagonal of $ \boldsymbol{D}$ contains the eigenvalues of $ \boldsymbol{A}$ and the columns of the matrix $ \boldsymbol{R} = \boldsymbol{R}_{1} \cdot\ldots\cdot \boldsymbol{R}_{i}$ represent the associated eigenvectors.

There are various strategies to choose the element $ (j,k)$ to be zeroed in the next step. The classical Jacobi method chooses the largest off-diagonal element in absolute value and is known to converge. (Since searching for the maximal element is time consuming, various systematic schemes have been developed, but their convergence often cannot be guaranteed.) Because the Jacobi method is relatively slow, other methods, such as the QR method, are usually preferred. On the other hand, it has recently become interesting again because of its accuracy and easy parallelization ([35,67]).
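
For illustration, the classical Jacobi method can be sketched in NumPy as follows. The rotation-angle formula below is the standard one for zeroing a single off-diagonal element (the text defers these formulas to [15]); the tolerance and the iteration limit are assumed choices.

import numpy as np

def jacobi_eigen(A, tol=1e-12, max_iter=1000):
    """Classical Jacobi method for a symmetric matrix A (illustrative sketch)."""
    T = np.array(A, dtype=float)         # T_i, converges to the diagonal matrix D
    n = T.shape[0]
    R = np.eye(n)                        # accumulates R_1 ... R_i (eigenvectors in columns)
    for _ in range(max_iter):
        off = np.abs(T - np.diag(np.diag(T)))
        j, k = np.unravel_index(np.argmax(off), off.shape)   # largest off-diagonal element
        if off[j, k] < tol:
            break
        # angle of the Givens rotation chosen so that (T_{i+1})_{jk} = 0
        alpha = 0.5 * np.arctan2(2.0 * T[j, k], T[k, k] - T[j, j])
        c, s = np.cos(alpha), np.sin(alpha)
        G = np.eye(n)
        G[j, j] = c; G[k, k] = c
        G[j, k] = s; G[k, j] = -s
        T = G.T @ T @ G                  # similarity transformation
        R = R @ G
    return np.diag(T), R                 # eigenvalues of A and the associated eigenvectors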


4.4.3 Givens and Householder Reductions

The Givens and Householder methods use a principle similar to that of the Jacobi method. A series of GRs or HRs, designed so that they form similarity transformations, is applied to a symmetric matrix $ \boldsymbol{A}$ in order to transform it to a tridiagonal matrix. (A tridiagonal matrix is the Hessenberg form for symmetric matrices.) This tridiagonal matrix is then subject to one of the iterative methods, such as the QR or LR methods discussed in the following paragraphs. Formulas for the Givens and Householder similarity transformations are given in [52], for instance.
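
As an illustration (an assumed implementation; the text defers the formulas to [52]), the Householder reduction of a symmetric matrix to tridiagonal form can be sketched as follows.

import numpy as np

def householder_tridiagonal(A):
    """Reduce a symmetric matrix A to tridiagonal form by Householder similarity transformations."""
    T = np.array(A, dtype=float)
    n = T.shape[0]
    for k in range(n - 2):
        x = T[k+1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue                     # column already has the desired zeros
        # Householder vector v such that (I - 2 v v^T) x = alpha e_1
        alpha = -np.copysign(norm_x, x[0])
        v = x.copy()
        v[0] -= alpha
        v /= np.linalg.norm(v)
        H = np.eye(n)
        H[k+1:, k+1:] -= 2.0 * np.outer(v, v)
        T = H @ T @ H                    # H is symmetric and orthogonal, so H^{-1} = H
    return T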


4.4.4 QR Method

The QR method is one of the most frequently used methods for the complete eigenanalysis of a nonsymmetric matrix, despite the fact that its convergence is not ensured. A typical algorithm proceeds as follows. In the first step, the matrix $ \boldsymbol{A}$ is transformed into a Hessenberg matrix using Givens or Householder similarity transformations (see Sects. 4.1.3 and 4.4.3). In the second step, this Hessenberg matrix is subject to an iterative process called chasing. In each iteration, similarity transformations, such as GRs, are first used to create nonzero entries in positions $ (i+2,i)$, $ (i+3,i)$ and $ (i+3,i+1)$ for $ i = 1$. Next, similarity transformations are repeatedly used to zero the elements $ (i+2,i)$ and $ (i+3,i)$ and to move these "nonzeros" towards the lower right corner of the matrix (i.e., to the elements $ (i+2,i)$, $ (i+3,i)$ and $ (i+3,i+1)$ with $ i$ increased by one in each step). As a result of chasing, one or two eigenvalues can be extracted. If $ A_{n,n-1}$ becomes zero (or negligible) after chasing, the element $ A_{n,n}$ is an eigenvalue. Consequently, we can delete the $ n$th row and column of the matrix and apply chasing to this smaller matrix to find another eigenvalue. Similarly, if $ A_{n-1,n-2}$ becomes zero (or negligible), the two eigenvalues of the $ 2 \times 2$ submatrix in the lower right corner are eigenvalues of $ \boldsymbol{A}$. Subsequently, we can delete the last two rows and columns and continue with the next iteration.
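
The following NumPy sketch shows only the underlying idea, the basic (unshifted) QR iteration $ \boldsymbol{A}_{i} = \boldsymbol{Q}_{i}\boldsymbol{R}_{i}$, $ \boldsymbol{A}_{i+1} = \boldsymbol{R}_{i}\boldsymbol{Q}_{i}$; it is not the implicitly shifted chasing algorithm described above, which works on the Hessenberg form and is much more efficient.

import numpy as np

def qr_iteration(A, n_iter=200):
    """Basic unshifted QR iteration (illustrative sketch, no shifts or deflation)."""
    T = np.array(A, dtype=float)
    for _ in range(n_iter):
        Q, R = np.linalg.qr(T)           # A_i = Q_i R_i
        T = R @ Q                        # A_{i+1} = R_i Q_i = Q_i^T A_i Q_i, similar to A_i
    return T

When the iteration converges, T approaches an upper (quasi-)triangular matrix: real eigenvalues appear on the diagonal, while complex conjugate pairs show up as $ 2 \times 2$ diagonal blocks, in line with the extraction of $ A_{n,n}$ or of the lower right $ 2 \times 2$ submatrix described above.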

Since a more detailed description of the whole iterative process goes beyond the scope of this contribution, we refer the reader to [15] for a shorter discussion and to [24] and [52] for a more detailed discussion of the QR method.


4.4.5 LR Method

The LR method is based on the simple observation that decomposing a matrix $ \boldsymbol{A}$ into $ \boldsymbol{A} = \boldsymbol{F}_L \boldsymbol{F}_R$ and multiplying the factors in reverse order results in a matrix $ \boldsymbol{B} = \boldsymbol{F}_R \boldsymbol{F}_L$ similar to $ \boldsymbol{A}$. Using the LU decomposition (Sect. 4.1.2), the LR method constructs a matrix series $ \boldsymbol{A}_{i}$ for $ i\in {\mathbb{N}}$, where $ \boldsymbol{A}_{1} = \boldsymbol{A}$ and

$\displaystyle \boldsymbol{A}_{i} = \boldsymbol{L}_{i} \boldsymbol{U}_{i} \quad \rightarrow \quad \boldsymbol{A}_{i+1} = \boldsymbol{U}_{i} \boldsymbol{L}_{i}\;,$    

where $ \boldsymbol{L}_{i}$ is a lower triangular matrix and $ \boldsymbol{U}_{i}$ is an upper triangular matrix with ones on its diagonal. For a wide class of matrices, including symmetric positive definite ones, $ \boldsymbol{A}_{i}$ and $ \boldsymbol{L}_{i}$ are proved to converge to the same lower triangular matrix $ \boldsymbol{L}$, whereby the eigenvalues of $ \boldsymbol{A}$ form the diagonal of $ \boldsymbol{L}$, ordered by decreasing absolute value.
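
A minimal sketch of one LR step follows; it uses a Crout factorization without pivoting (an assumed implementation), so it is appropriate only for matrices, such as symmetric positive definite ones, for which this factorization exists and is stable.

import numpy as np

def lr_step(A):
    """One LR step: factor A_i = L_i U_i (U_i with unit diagonal) and return A_{i+1} = U_i L_i."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    U = np.eye(n)
    for j in range(n):                   # Crout factorization, no pivoting
        L[j:, j] = A[j:, j] - L[j:, :j] @ U[:j, j]
        U[j, j+1:] = (A[j, j+1:] - L[j, :j] @ U[:j, j+1:]) / L[j, j]
    return U @ L                         # similar to A, since U L = L^{-1} (L U) L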


4.4.6 Inverse Iterations

The method of inverse iterations can be used to improve an approximation $ \lambda^*$ of an eigenvalue $ \lambda $ of a matrix $ \boldsymbol{A}$. The method is based on the fact that the eigenvector $ \boldsymbol{g}$ associated with $ \lambda $ is also an eigenvector of $ \tilde{\boldsymbol{A}} = (\boldsymbol{A} - \lambda^*\boldsymbol{I})^{-1}$ associated with the eigenvalue $ \tilde{\lambda} = (\lambda - \lambda^*)^{-1}$. For an initial approximation $ \lambda^*$ close to $ \lambda $, $ \tilde{\lambda}$ is the dominant eigenvalue of $ \tilde{\boldsymbol{A}}$. Thus, it can be computed by the power method described in Sect. 4.4.1, whereby $ \lambda^*$ could be modified in each iteration in order to improve the approximation of $ \lambda $.

This method is not very efficient without a good starting approximation and is therefore not suitable for the complete eigenanalysis. On the other hand, the use of the power method makes it suitable for finding the eigenvector $ \boldsymbol{g}$ associated with $ \lambda $. Thus, the method of inverse iterations often complements methods for complete eigenanalysis and then serves as a tool for eigenvector analysis. For this purpose, one does not have to perform the iterative improvement of the initial $ \lambda^*$: applying the power method to $ \tilde{\boldsymbol{A}} = (\boldsymbol{A} - \lambda^*\boldsymbol{I})^{-1}$ suffices. See [40,52] and [60] for more details.
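
For example, inverse iterations used purely for eigenvector analysis (a fixed $ \lambda^*$, no iterative improvement) can be sketched as follows; the SciPy routines lu_factor and lu_solve are assumed here.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, lam_star, x0, n_iter=50):
    """Approximate the eigenvector of A associated with the eigenvalue closest to lam_star."""
    n = A.shape[0]
    lu_piv = lu_factor(A - lam_star * np.eye(n))   # factor (A - lambda* I) once
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_iter):
        x = lu_solve(lu_piv, x)          # one power-method step with (A - lambda* I)^{-1}
        x /= np.linalg.norm(x)
    return x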

