Jacobi and Gauss-Seidel Iterations

The Jacobi iteration is an iterative method for solving a linear equation system ${\bf A}{\bf x}={\bf b}$, provided ${\bf A}$ satisfies a certain condition. Specifically, we first decompose ${\bf A}$ in the following way:

$\displaystyle {\bf A}={\bf D}+{\bf R}$ (206)

where ${\bf D}=diag[a_{11},\cdots,a_{NN}]$ is a diagonal matrix containing all diagonal elements of ${\bf A}$ with all off-diagonal elements being zero, and ${\bf R}={\bf A}-{\bf D}$ contains all off-diagonal elements of ${\bf A}$ with all diagonal elements being zero. Now the given system can be expressed as

$\displaystyle {\bf A}{\bf x}=({\bf D}+{\bf R}){\bf x}={\bf b},
\;\;\;\;$i.e.$\displaystyle \;\;\;\;
{\bf D}{\bf x}=({\bf A}-{\bf R}){\bf x}={\bf b}-{\bf R}{\bf x}$ (207)

Premultiplying both sides by ${\bf D}^{-1}=diag[1/a_{11},\cdots,1/a_{NN}]$, we get

$\displaystyle {\bf x}={\bf D}^{-1}({\bf b}-{\bf R}{\bf x})
={\bf D}^{-1}{\bf b}-{\bf D}^{-1}{\bf R}{\bf x}
={\bf z}+{\bf B}{\bf x}$ (208)

where we have defined

$\displaystyle {\bf z}={\bf D}^{-1}{\bf b},\;\;\;\;\;\;\;
{\bf B}=-{\bf D}^{-1}{\bf R}=-{\bf D}^{-1}({\bf A}-{\bf D})={\bf I}-{\bf D}^{-1}{\bf A}$ (209)

This equation can be turned into an iteration:

$\displaystyle {\bf x}_{n+1}={\bf D}^{-1}({\bf b}-{\bf R}{\bf x}_n)={\bf z}+{\bf B}{\bf x}_n$ (210)

In component form we have

$\displaystyle x^{(n+1)}_i=\frac{1}{a_{ii}}\left(b_i-\sum_{j\ne i} a_{ij} x^{(n)}_j\right)
\;\;\;\;\;\;\;\;\;(i=1,\cdots,N)$ (211)

or

$\displaystyle \left\{\begin{array}{l}
x^{(n+1)}_1=(b_1-a_{12}x^{(n)}_2-a_{13}x^{(n)}_3-\cdots-a_{1N}x^{(n)}_N)/a_{11}\\
x^{(n+1)}_2=(b_2-a_{21}x^{(n)}_1-a_{23}x^{(n)}_3-\cdots-a_{2N}x^{(n)}_N)/a_{22}\\
\;\;\;\;\vdots\\
x^{(n+1)}_N=(b_N-a_{N1}x^{(n)}_1-a_{N2}x^{(n)}_2-\cdots-a_{N,N-1}x^{(n)}_{N-1})/a_{NN}
\end{array}\right.$ (212)
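
To make the component form concrete, here is a minimal Python/NumPy sketch of the Jacobi iteration of Eq. (211); the function name jacobi, the tolerance tol, and the iteration cap max_iter are illustrative choices, not anything prescribed above.

import numpy as np

def jacobi(A, b, x0, tol=1e-6, max_iter=1000):
    """Solve Ax = b by the Jacobi iteration of Eq. (211); tol and max_iter are illustrative defaults."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    N = len(b)
    for n in range(max_iter):
        x_new = np.empty(N)
        for i in range(N):
            # sum of all off-diagonal terms a_ij * x_j^(n)
            s = sum(A[i, j] * x[j] for j in range(N) if j != i)
            x_new[i] = (b[i] - s) / A[i, i]
        x = x_new                                # all components updated simultaneously
        if np.linalg.norm(A @ x - b) < tol:      # error e = ||Ax - b||
            break
    return x

Equivalently, the matrix form of Eq. (210) can be iterated directly as x = z + B @ x, with z = b / np.diag(A) and B = np.eye(N) - A / np.diag(A)[:, None].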

We now show that this iteration will converge to the solution of the equation system if ${\bf B}$ satisfies a certain condition. We first write ${\bf x}_n$ as the sum of the true solution ${\bf x}$ and an error term:

$\displaystyle {\bf x}_n={\bf x}+{\bf e}_n$ (213)

then we have

$\displaystyle {\bf x}_{n+1}={\bf B}{\bf x}_n+{\bf z}={\bf B}({\bf x}+{\bf e}_n)+{\bf z}
={\bf B}{\bf x}+{\bf z}+{\bf B}{\bf e}_n
={\bf x}+{\bf B}{\bf e}_n={\bf x}+{\bf e}_{n+1}$ (214)

where we have used the identity ${\bf x}={\bf B}{\bf x}+{\bf z}$ given above, and ${\bf e}_{n+1}={\bf B}{\bf e}_n$ is the error after the (n+1)th iteration, which can be further written as

$\displaystyle {\bf e}_{n+1}={\bf B}{\bf e}_n={\bf B}^2{\bf e}_{n-1}
=\cdots={\bf B}^{n+1}{\bf e}_0$ (215)

Obviously, the iteration converges to the true solution ${\bf x}$ if the error ${\bf e}_n$ converges to zero:

$\displaystyle \lim\limits_{n\rightarrow \infty} {\bf e}_{n}
=\lim\limits_{n\rightarrow \infty}{\bf B}^n{\bf e}_0={\bf0}$ (216)

Now we further consider the condition under which the error does indeed converge to zero. Based on the eigen-equation of ${\bf B}$:

$\displaystyle {\bf B}{\bf\phi}_i=\lambda_i{\bf\phi}_i,\;\;\;\;\;\;(i=1,\cdots,N)$ (217)

we further have

$\displaystyle {\bf B}^n{\bf\phi}_i=\lambda^n_i{\bf\phi}_i$ (218)

As the $N$ eigenvectors are in general linearly independent, they form a basis (not necessarily orthogonal) of the N-D space, so any vector, such as ${\bf e}_0$, can be written as a linear combination of them:

$\displaystyle {\bf e}_0=\sum_{i=1}^N a_i {\bf\phi}_i$ (219)

Now we have

$\displaystyle \lim\limits_{n\rightarrow \infty}{\bf e}_n
=\lim\limits_{n\rightarrow \infty}{\bf B}^n{\bf e}_0
=\lim\limits_{n\rightarrow \infty}{\bf B}^n\sum_{i=1}^N a_i{\bf\phi}_i
=\sum_{i=1}^N a_i \left(\lim\limits_{n\rightarrow \infty}\lambda_i^n\right){\bf\phi}_i$ (220)

If the spectral radius of ${\bf B}$ (the maximum absolute value of its eigenvalues) satisfies $\rho({\bf B})<1$, then

$\displaystyle \lim\limits_{n\rightarrow \infty}\lambda_i^n=0 \;\;\;\;\; (i=1,\cdots,N)$ (221)

and the error ${\bf e}_n$ converges to zero:

$\displaystyle \lim\limits_{n\rightarrow \infty}{\bf B}^n{\bf e}_0={\bf0}$ (222)

A sufficient (but not necessary) condition for $\rho({\bf B})<1$ is that ${\bf A}$ is strictly diagonally dominant, i.e.,

$\displaystyle \vert a_{ii}\vert>\sum_{j\ne i}\vert a_{ij}\vert,
\;\;\;\;\;\;(i=1,\cdots,N)$ (223)

The iteration may converge even if this condition is not satisfied.
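
Both conditions are easy to check numerically before iterating. The sketch below is a hypothetical helper (not part of the text above): it builds the Jacobi iteration matrix ${\bf B}={\bf I}-{\bf D}^{-1}{\bf A}$, computes its spectral radius, and tests strict diagonal dominance.

import numpy as np

def jacobi_converges(A):
    """Return the spectral radius of B = I - D^{-1} A and a strict diagonal-dominance flag."""
    A = np.asarray(A, dtype=float)
    d = np.diag(A)                               # diagonal elements a_ii
    B = np.eye(len(d)) - A / d[:, None]          # B = I - D^{-1} A
    rho = max(abs(np.linalg.eigvals(B)))         # spectral radius rho(B)
    off_diag_sums = np.sum(np.abs(A), axis=1) - np.abs(d)
    dominant = np.all(np.abs(d) > off_diag_sums)
    return rho, dominant

By the argument above, the Jacobi iteration converges whenever the returned spectral radius is less than 1, regardless of the outcome of the dominance test.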

The coefficient matrix ${\bf A}$ can be alternatively decomposed into ${\bf A}={\bf D}+{\bf L}+{\bf U}$, where

$\displaystyle {\bf D}=\left[\begin{array}{cccc}a_{11}& 0 & \cdots & 0\\
0 & a_{22} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & a_{NN}\end{array}\right],\;\;\;\;
{\bf L}=\left[\begin{array}{cccc}0 & 0 & \cdots & 0\\
a_{21} & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\
a_{N1} & a_{N2} & \cdots & 0\end{array}\right],\;\;\;\;
{\bf U}=\left[\begin{array}{cccc}0 & a_{12} & \cdots & a_{1N}\\
0 & 0 & \cdots & a_{2N}\\ \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 0\end{array}\right]$ (224)
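
For reference, this three-way split can be obtained directly from the triangular parts of ${\bf A}$; a minimal NumPy sketch, using the coefficient matrix of the homework system (234) purely as an example:

import numpy as np

A = np.array([[ 6., -3.,  4.],
              [-2., -5.,  3.],
              [-1.,  2., -4.]])    # example matrix, taken from Eq. (234)

D = np.diag(np.diag(A))            # diagonal part
L = np.tril(A, k=-1)               # strictly lower-triangular part
U = np.triu(A, k=1)                # strictly upper-triangular part
assert np.allclose(A, D + L + U)   # A = D + L + U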

Now the equation system can be written as

$\displaystyle {\bf A}{\bf x}=({\bf D}+{\bf L}+{\bf U}){\bf x}={\bf b},\;\;\;\;\;$i.e.,$\displaystyle \;\;\;\;\;\;{\bf D}{\bf x}={\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x}$ (225)

Premultiplying ${\bf D}^{-1}$ on both sides, we get

$\displaystyle {\bf x}={\bf D}^{-1}({\bf b}-{\bf L}{\bf x} -{\bf U}{\bf x})
={\bf D}^{-1}({\bf b}-{\bf R}{\bf x})$ (226)

where ${\bf R}={\bf L}+{\bf U}$ contains all the off-diagonal elements of ${\bf A}$, same as defined before. This equation is then converted into an iteration

$\displaystyle {\bf x}^{(n+1)}={\bf D}^{-1}({\bf b}-{\bf L}{\bf x}^{(n)}-{\bf U}{\bf x}^{(n)})$ (227)

or in component form:

$\displaystyle x^{(n+1)}_i=\frac{1}{a_{ii}}\left(b_i-\sum_{j<i} a_{ij} x^{(n)}_j
-\sum_{j>i} a_{ij} x^{(n)}_j\right)
=\frac{1}{a_{ii}}\left(b_i-\sum_{j\ne i} a_{ij} x^{(n)}_j\right),
\;\;\;\;\;\;(i=1,\cdots,N)$ (228)

Note that the first expression leads to the Gauss-Seidel iteration, while the second is the Jacobi iteration. There is an essential difference between the two methods. In the Jacobi iteration, all $N$ unknowns $x_i^{(n+1)}$ are updated simultaneously (in parallel) from $x_j^{(n)}$. But in the Gauss-Seidel iteration, the two summations corresponding to ${\bf L}{\bf x}$ ($j<i$) and ${\bf U}{\bf x}$ ($j>i$) are treated differently. Specifically, when computing the ${\bf L}{\bf x}$ term for $x_i^{(n+1)}$, all $x_j$ with $j<i$ have already been updated in the current sweep, so we can use the most recent values $x_j^{(n+1)}$ (instead of $x_j^{(n)}$) in the iteration:

$\displaystyle x^{(n+1)}_i=\frac{1}{a_{ii}}\left(b_i-\sum_{j<i} a_{ij} x^{(n+1)}_j
-\sum_{j>i} a_{ij} x^{(n)}_j\right)$ (229)

It can be shown that if matrix ${\bf A}$ is strictly diagonally dominant then the Gauss-Seidel method converges.
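
Here is a minimal Python/NumPy sketch of the Gauss-Seidel update of Eq. (229); since each $x_i$ uses the newest available values, the solution vector can simply be overwritten in place. The function name gauss_seidel and the stopping rule are illustrative choices.

import numpy as np

def gauss_seidel(A, b, x0, tol=1e-6, max_iter=1000):
    """Solve Ax = b by the Gauss-Seidel iteration of Eq. (229)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    N = len(b)
    for n in range(max_iter):
        for i in range(N):
            # x[j] for j < i already holds the (n+1)-st values,
            # x[j] for j > i still holds the n-th values
            s = sum(A[i, j] * x[j] for j in range(N) if j != i)
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(A @ x - b) < tol:      # error e = ||Ax - b||
            break
    return x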

Successive over-relaxation (SOR) is a method that can be used to speed up the convergence of the iteration. Multiplying both sides of the equation ${\bf D}{\bf x}={\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x}$ by a parameter $\omega$, we get

$\displaystyle \omega{\bf D}{\bf x}=\omega({\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x})$ (230)

which can be written as

$\displaystyle {\bf D}{\bf x}={\bf D}{\bf x}-\omega{\bf D}{\bf x}+\omega({\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x})
=(1-\omega){\bf D}{\bf x}+\omega({\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x})$ (231)

Premultiplying both sides by ${\bf D}^{-1}$, we get

$\displaystyle {\bf x}=(1-\omega){\bf x}
+\omega{\bf D}^{-1}({\bf b}-{\bf L}{\bf x}-{\bf U}{\bf x})$ (232)

which can be converted into an iteration:

$\displaystyle x_i^{(n+1)}=(1-\omega)x_i^{(n)}+\frac{\omega}{a_{ii}}
\left(b_i-\sum_{j<i}a_{ij}x_j^{(n+1)}-\sum_{j>i}a_{ij}x_j^{(n)}\right),
\;\;\;\;\;\;\;(i=1,\cdots,N)$ (233)

The right-hand side can be considered as a weighted average of two terms: the estimate from the previous iteration, $x_i^{(n)}$, in the first term, and the updated estimate in the second term. By adjusting the parameter $\omega$, we can control the rate of convergence. When $\omega=1$, this becomes the same as the Gauss-Seidel iteration. An $\omega>1$ makes the second term more dominant and can speed up a slowly converging iteration, while an $\omega<1$ makes the first term more dominant, thereby helping establish convergence of an otherwise diverging (e.g., overshooting) iteration.
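
A minimal sketch of the SOR update of Eq. (233), obtained by adding the relaxation parameter $\omega$ to the Gauss-Seidel sketch above; the function name sor and its default arguments are illustrative.

import numpy as np

def sor(A, b, x0, omega=1.0, tol=1e-6, max_iter=1000):
    """Solve Ax = b by the SOR iteration of Eq. (233); omega = 1 reduces to Gauss-Seidel."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    N = len(b)
    for n in range(max_iter):
        for i in range(N):
            # off-diagonal sum uses the newest values for j < i, old values for j > i
            s = sum(A[i, j] * x[j] for j in range(N) if j != i)
            # weighted average of the old estimate and the Gauss-Seidel update
            x[i] = (1 - omega) * x[i] + omega * (b[i] - s) / A[i, i]
        if np.linalg.norm(A @ x - b) < tol:
            break
    return x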

Homework 4:

  1. Write a function to implement the QR decomposition by Householder transformation.
  2. Use the QR function developed above to solve the following system:

    $\displaystyle \left\{ \begin{array}{c}6x_1-3x_2+4x_3=-17\\ -2x_1-5x_2+3x_3=-8
\\ -x_1+2x_2-4x_3=4\end{array}\right.$ (234)

  3. Write a function to implement the Jacobi iteration method, and then solve the same system above, using ${\bf x}_0=[1,\,1,\,1]^T$ as the initial guess. Print the error $e=\vert\vert{\bf A}{\bf x}-{\bf b}\vert\vert$ after each iteration. Terminate the iteration when $e<10^{-6}$.
  4. Write a function to implement the Gauss-Seidel iteration method, and then solve the same system above, using ${\bf x}_0=[1,\,1,\,1]^T$ as the initial guess. Print the error $e=\vert\vert{\bf A}{\bf x}-{\bf b}\vert\vert$ after each iteration. Terminate the iteration when $e<10^{-6}$. Compare the number of iterations of both Jacobi and Gauss-Seidel methods.
  5. Implement SOR in the Gauss-Seidel method and experiment with different values of $\omega$ to see its effects, and identify a value that minimizes the number of iterations.