
Covariance and Correlation

Let $x_i$ and $x_j$ be two real random variables in a random vector ${\bf x}=[x_1,\cdots,x_N]^T$. The mean and variance of a variable $x_i$, and the covariance and correlation coefficient (normalized covariance) between two variables $x_i$ and $x_j$, are defined below:

\begin{displaymath}
\mu_i=E[x_i],\;\;\;\;\;\;
\sigma^2_i=E[(x_i-\mu_i)^2]=E[x_i^2]-\mu_i^2
\end{displaymath}

\begin{displaymath}
\sigma^2_{ij}=E[(x_i-\mu_i)(x_j-\mu_j)]=E[x_ix_j]-\mu_i\mu_j,\;\;\;\;\;\;
r_{ij}=\frac{\sigma^2_{ij}}{\sigma_i\sigma_j}
\end{displaymath}

Note that the correlation coefficient $r_{ij}=\sigma^2_{ij}/\sigma_i\sigma_j$ can be considered as the normalized covariance $\sigma^2_{ij}$.

To obtain these parameters as expectations of first- and second-order functions of the random variables, the joint probability density function $p(x_1,\cdots,x_N)$ is required. However, when it is not available, the parameters can still be estimated by repeating a random experiment involving these variables $K$ times and averaging the outcomes:

\begin{displaymath}\hat{\mu}_i=\frac{1}{K}\sum_{k=1}^K x_i^{(k)} \end{displaymath}


\begin{displaymath}
\hat{\sigma}^2_i=\frac{1}{K}\sum_{k=1}^K \left(x_i^{(k)}-\hat{\mu}_i\right)^2
=\frac{1}{K}\sum_{k=1}^K \left(x_i^{(k)}\right)^2-\hat{\mu}_i^2
\end{displaymath}


\begin{displaymath}
\hat{\sigma}^2_{ij}=\frac{1}{K}\sum_{k=1}^K \left(x_i^{(k)}-\hat{\mu}_i\right)\left(x_j^{(k)}-\hat{\mu}_j\right)
=\frac{1}{K}\sum_{k=1}^K x_i^{(k)}x_j^{(k)}-\hat{\mu}_i \hat{\mu}_j
\end{displaymath}


\begin{displaymath}
\hat{r}_{ij}=\frac{\hat{\sigma}^2_{ij}}{\sqrt{\hat{\sigma}^2_i\,\hat{\sigma}^2_j}}
=\frac{\hat{\sigma}^2_{ij}}{\hat{\sigma}_i\,\hat{\sigma}_j}
\end{displaymath}
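The estimators above can be sketched directly in code. This is a minimal illustration, assuming a hypothetical experiment in which each of the $K$ outcomes is a pair $(x_i, x_j)$ drawn from a correlated Gaussian distribution (the distribution is an arbitrary choice, used only to exercise the formulas):

```python
import numpy as np

rng = np.random.default_rng(0)

# K outcomes of a repeated experiment; each row is one outcome of (x_i, x_j).
# The joint distribution (correlated Gaussians) is an assumed example.
K = 10_000
cov_true = np.array([[2.0, 1.2],
                     [1.2, 1.0]])
samples = rng.multivariate_normal(mean=[1.0, -2.0], cov=cov_true, size=K)
xi, xj = samples[:, 0], samples[:, 1]

# Sample mean:       mu_hat = (1/K) sum_k x^(k)
mu_i = xi.sum() / K
mu_j = xj.sum() / K

# Sample variance:   sigma2_hat = (1/K) sum_k (x^(k))^2 - mu_hat^2
var_i = (xi**2).sum() / K - mu_i**2
var_j = (xj**2).sum() / K - mu_j**2

# Sample covariance: sigma2_ij_hat = (1/K) sum_k x_i^(k) x_j^(k) - mu_i mu_j
cov_ij = (xi * xj).sum() / K - mu_i * mu_j

# Correlation coefficient: r_ij = sigma2_ij / (sigma_i sigma_j)
r_ij = cov_ij / np.sqrt(var_i * var_j)

print(mu_i, var_i, cov_ij, r_ij)
```

As $K$ grows, the estimates approach the true parameters of the underlying distribution; here the estimated covariance should be near 1.2 and the correlation coefficient near $1.2/\sqrt{2}\approx 0.85$.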

To gain an intuitive understanding of these parameters, consider the following simple examples.

Examples:

Now we see that the covariance $\sigma_{ij}^2$ indicates how the two random variables $x_i$ and $x_j$ are related: they are positively correlated if $\sigma_{ij}^2>0$, negatively correlated if $\sigma_{ij}^2<0$, and uncorrelated if $\sigma_{ij}^2=0$.
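The three cases can be demonstrated numerically. In this sketch (the datasets are assumed toy examples), $x_j$ is made to rise with $x_i$, fall with $x_i$, or depend on $x_i$ only symmetrically, which yields a covariance of exactly zero even though the variables are not independent:

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

def sample_cov(a, b):
    """Biased sample covariance: (1/K) * sum_k (a_k - mean(a)) * (b_k - mean(b))."""
    return np.mean((a - a.mean()) * (b - b.mean()))

cov_pos = sample_cov(xi, 2.0 * xi + 1.0)     # x_j increases with x_i  ->  > 0
cov_neg = sample_cov(xi, -0.5 * xi + 3.0)    # x_j decreases with x_i  ->  < 0
cov_zero = sample_cov(xi, (xi - xi.mean())**2)  # symmetric dependence  ->  = 0

print(cov_pos, cov_neg, cov_zero)
```

The last case is a reminder that zero covariance means the variables are uncorrelated, not necessarily independent.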

Assume a random vector ${\bf x}=[x_1,\cdots,x_N]^T$ is composed of $N$ samples of a signal $x(t)$. Signal samples close to each other tend to be more correlated than those farther apart, i.e., given $x_i$, we can predict the next sample $x_{i+1}$ with much higher confidence than a more distant sample $x_j$. Consequently, the elements of the covariance matrix ${\bf\Sigma}_x$ near the main diagonal have higher values than those farther away from the diagonal.
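This banded structure of ${\bf\Sigma}_x$ can be seen in a quick numerical sketch. As an assumed stand-in for "neighboring samples are correlated", the signal is modeled here as a first-order autoregressive (AR(1)) process, whose covariance decays as $a^{|i-j|}$ away from the diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)

# AR(1) model (an assumed example): x[n] = a * x[n-1] + noise, scaled so the
# process is stationary with unit variance and cov(x_i, x_j) = a^|i-j|.
N, K, a = 8, 5000, 0.9
X = np.zeros((K, N))                  # K realizations of an N-sample signal
X[:, 0] = rng.standard_normal(K)
for n in range(1, N):
    X[:, n] = a * X[:, n - 1] + np.sqrt(1 - a**2) * rng.standard_normal(K)

# Estimated covariance matrix Sigma_x; rows of X are the realizations.
Sigma = np.cov(X, rowvar=False)

# Entries near the main diagonal dominate those farther away.
print(np.round(Sigma, 2))
```

The printed matrix should show values near 1 on the diagonal, falling off toward the corners, matching the intuition that distant samples are less correlated.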


Ruye Wang 2016-04-06