
Marginal and conditional distributions of multivariate normal distribution

Assume an $n$-dimensional random vector

\begin{displaymath}{\bf x}=\left[\begin{array}{c}{\bf x}_1 \\ {\bf x}_2\end{array}\right] \end{displaymath}

has a normal distribution $N({\bf x},\mu,\Sigma)$ with

\begin{displaymath}\mu=\left[\begin{array}{c}\mu_1 \\ \mu_2\end{array}\right]\;\;\;\;\Sigma=\left[\begin{array}{cc}\Sigma_{11} & \Sigma_{12}\\
\Sigma_{21}&\Sigma_{22}\end{array}\right] \end{displaymath}

where ${\bf x}_1$ and ${\bf x}_2$ are two subvectors of respective dimensions $p$ and $q$ with $p+q=n$. Note that $\Sigma=\Sigma^T$, and therefore $\Sigma_{21}=\Sigma_{12}^T$.

Theorem 4:

Part a The marginal distributions of ${\bf x}_1$ and ${\bf x}_2$ are also normal with mean vector $\mu_i$ and covariance matrix $\Sigma_{ii}$ ($i=1,2$), respectively.

Part b The conditional distribution of ${\bf x}_i$ given ${\bf x}_j$ is also normal with mean vector

\begin{displaymath}\mu_{i\vert j}=\mu_i+\Sigma_{ij}\Sigma_{jj}^{-1}({\bf x}_j-\mu_j) \end{displaymath}

and covariance matrix

\begin{displaymath}\Sigma_{i\vert j}=\Sigma_{ii}-\Sigma_{ij}\Sigma_{jj}^{-1}\Sigma_{ij}^T \end{displaymath}
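
As an illustration (a worked special case added here, with the symbols $\sigma_1$, $\sigma_2$ and $\rho$ introduced only for this example), take $p=q=1$ and write the covariance matrix in terms of the two standard deviations and the correlation coefficient:

\begin{displaymath}\Sigma=\left[\begin{array}{cc}\sigma_1^2 & \rho\sigma_1\sigma_2\\
\rho\sigma_1\sigma_2&\sigma_2^2\end{array}\right] \end{displaymath}

The conditional distribution of $x_1$ given $x_2$ is then normal with mean and variance

\begin{displaymath}\mu_{1\vert 2}=\mu_1+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2),\;\;\;\;
\Sigma_{1\vert 2}=\sigma_1^2-\frac{(\rho\sigma_1\sigma_2)^2}{\sigma_2^2}=\sigma_1^2(1-\rho^2) \end{displaymath}

so conditioning never increases the variance, and it leaves the variance unchanged only when $\rho=0$.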

Proof: The joint density of ${\bf x}$ is:

\begin{displaymath}f({\bf x})=f({\bf x}_1,{\bf x}_2)=\frac{1}{(2\pi)^{n/2}\vert\Sigma\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)] \end{displaymath}

where $Q$ is defined as
\begin{eqnarray*}
Q({\bf x}_1,{\bf x}_2) &=& ({\bf x}-\mu)^T\Sigma^{-1}({\bf x}-\mu) \\
&=& [({\bf x}_1-\mu_1)^T, ({\bf x}_2-\mu_2)^T]
\left[\begin{array}{cc}\Sigma^{11} & \Sigma^{12}\\
\Sigma^{21}&\Sigma^{22}\end{array}\right]
\left[\begin{array}{c}{\bf x}_1-\mu_1 \\ {\bf x}_2-\mu_2\end{array}\right] \\
&=& ({\bf x}_1-\mu_1)^T\Sigma^{11}({\bf x}_1-\mu_1)+
2({\bf x}_1-\mu_1)^T\Sigma^{12}({\bf x}_2-\mu_2)+
({\bf x}_2-\mu_2)^T\Sigma^{22}({\bf x}_2-\mu_2)
\end{eqnarray*}

Here we have assumed

\begin{displaymath}\Sigma^{-1}=\left[\begin{array}{cc}\Sigma_{11} & \Sigma_{12}\\
\Sigma_{21}&\Sigma_{22}\end{array}\right]^{-1}
=\left[\begin{array}{cc}\Sigma^{11} & \Sigma^{12}\\
\Sigma^{21}&\Sigma^{22}\end{array}\right]
\end{displaymath}

According to theorem 2, we have

\begin{displaymath}\Sigma^{11}=(\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^T)^{-1}
=\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1} \end{displaymath}


\begin{displaymath}\Sigma^{22}=(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}
=\Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{12}^T(\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^T)^{-1}\Sigma_{12}\Sigma_{22}^{-1} \end{displaymath}


\begin{displaymath}\Sigma^{12}=-\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}=(\Sigma^{21})^T \end{displaymath}
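
The block-inverse expressions can be sanity-checked numerically in the simplest case $p=q=1$, where every block is a scalar. The sketch below is an illustration only: the function names `block_inverse_entries` and `direct_inverse_2x2` and the numeric values are assumptions made for this example, and a scalar check does not replace the general matrix proof of theorem 2.

```python
# Scalar-block check (p = q = 1) of the block-inverse formulas.
# Function names and numeric values are hypothetical, chosen for illustration.

def block_inverse_entries(s11, s22, s12):
    """Entries of Sigma^{-1} computed from the Schur-complement formulas."""
    schur_22 = s22 - s12 * s12 / s11   # Sigma_22 - Sigma_12^T Sigma_11^{-1} Sigma_12
    schur_11 = s11 - s12 * s12 / s22   # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_12^T
    up11 = 1.0 / schur_11                                   # first form of Sigma^{11}
    up11_alt = 1.0 / s11 + (s12 / s11) ** 2 / schur_22      # second form of Sigma^{11}
    up12 = -(s12 / s11) / schur_22                          # Sigma^{12}
    up22 = 1.0 / schur_22                                   # first form of Sigma^{22}
    return up11, up11_alt, up12, up22


def direct_inverse_2x2(s11, s22, s12):
    """Entries of the inverse of [[s11, s12], [s12, s22]] via the adjugate."""
    det = s11 * s22 - s12 * s12
    return s22 / det, -s12 / det, s11 / det


if __name__ == "__main__":
    print(block_inverse_entries(2.0, 3.0, 0.5))
    print(direct_inverse_2x2(2.0, 3.0, 0.5))
```

Both forms of $\Sigma^{11}$ should agree with each other and with the directly computed inverse up to rounding error.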

Substituting the second expression for $\Sigma^{11}$, the first expression for $\Sigma^{22}$, and the expression for $\Sigma^{12}$ into $Q({\bf x}_1,{\bf x}_2)$, we get:
\begin{eqnarray*}
Q({\bf x}_1,{\bf x}_2) &=& ({\bf x}_1-\mu_1)^T
[\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1}]
({\bf x}_1-\mu_1) \\
&& -2({\bf x}_1-\mu_1)^T
[\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]({\bf x}_2-\mu_2) \\
&& +({\bf x}_2-\mu_2)^T
[(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]
({\bf x}_2-\mu_2) \\
&=& ({\bf x}_1-\mu_1)^T \Sigma_{11}^{-1} ({\bf x}_1-\mu_1) \\
&& +({\bf x}_1-\mu_1)^T
[\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1}]
({\bf x}_1-\mu_1) \\
&& -2({\bf x}_1-\mu_1)^T
[\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]({\bf x}_2-\mu_2) \\
&& +({\bf x}_2-\mu_2)^T
[(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]
({\bf x}_2-\mu_2) \\
&=& ({\bf x}_1-\mu_1)^T \Sigma_{11}^{-1} ({\bf x}_1-\mu_1) \\
&& +[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]^T
(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}
[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]
\end{eqnarray*}

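The last equality follows from the identity below, which holds for any vectors $u$ and $v$ and any symmetric matrix $A=A^T$. It is applied with $u=\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)$ and $v={\bf x}_2-\mu_2$, with the symmetric matrix $(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}$ in the role of $A$: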
\begin{eqnarray*}
&& u^TAu-2u^TAv+v^TAv=u^TAu-u^TAv-u^TAv+v^TAv \\
&=& u^TA(u-v)-(u-v)^TAv=u^TA(u-v)-v^TA(u-v) \\
&=& (u-v)^TA(u-v)=(v-u)^TA(v-u)
\end{eqnarray*}

We define

\begin{displaymath}b\stackrel{\triangle}{=}\mu_2+\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1) \end{displaymath}


\begin{displaymath}A\stackrel{\triangle}{=}\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12} \end{displaymath}

and

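\begin{displaymath}\left\{ \begin{array}{ll}
Q_1({\bf x}_1)&\stackrel{\triangle}{=}({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)\\
Q_2({\bf x}_1,{\bf x}_2)&\stackrel{\triangle}{=}[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]^T
(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}
[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]\\
&=({\bf x}_2-b)^T A^{-1}({\bf x}_2-b)
\end{array} \right.
\end{displaymath}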

and get

\begin{displaymath}Q({\bf x}_1,{\bf x}_2)=Q_1({\bf x}_1)+Q_2({\bf x}_1,{\bf x}_2) \end{displaymath}
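
The decomposition $Q=Q_1+Q_2$ can be checked numerically in the scalar-block case $p=q=1$. The following is a minimal sketch under that assumption; the function name `quadratic_forms` and the test point are hypothetical and chosen only for illustration.

```python
# Scalar-block check (p = q = 1) that Q(x1, x2) = Q1(x1) + Q2(x1, x2).
# The function name and the numeric values are hypothetical.

def quadratic_forms(x1, x2, mu1, mu2, s11, s22, s12):
    """Return (Q, Q1, Q2) for the 2x2 covariance [[s11, s12], [s12, s22]]."""
    d1, d2 = x1 - mu1, x2 - mu2
    det = s11 * s22 - s12 * s12
    # Q = (x - mu)^T Sigma^{-1} (x - mu), using the explicit 2x2 inverse
    q = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    # Q1 = (x1 - mu1)^T Sigma_11^{-1} (x1 - mu1)
    q1 = d1 * d1 / s11
    # b and A as defined in the text, reduced to scalars
    b = mu2 + (s12 / s11) * d1
    a = s22 - s12 * s12 / s11
    # Q2 = (x2 - b)^T A^{-1} (x2 - b)
    q2 = (x2 - b) ** 2 / a
    return q, q1, q2


if __name__ == "__main__":
    q, q1, q2 = quadratic_forms(1.5, 0.2, 1.0, -1.0, 2.0, 3.0, 0.5)
    print(q, q1 + q2)  # the two values should agree up to rounding error
```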

Now the joint distribution can be written as:
\begin{eqnarray*}
f({\bf x}) &=& f({\bf x}_1,{\bf x}_2)
=\frac{1}{(2\pi)^{n/2}\vert\Sigma\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)] \\
&=& \frac{1}{(2\pi)^{n/2}\vert\Sigma_{11}\vert^{1/2} \vert\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12}\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)] \\
&=& \frac{1}{(2\pi)^{p/2}\vert\Sigma_{11}\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)] \\
&& \cdot\frac{1}{(2\pi)^{q/2}\vert A\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_2-b)^T A^{-1}({\bf x}_2-b)] \\
&=& N({\bf x}_1,\mu_1,\Sigma_{11})\;N({\bf x}_2,b,A)
\end{eqnarray*}

The third equal sign is due to theorem 3:

\begin{displaymath}\vert\Sigma\vert=\vert\Sigma_{11}\vert \vert\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12}\vert\end{displaymath}

The marginal distribution of ${\bf x}_1$ is

\begin{displaymath}f_1({\bf x}_1)=\int f({\bf x}_1,{\bf x}_2) d{\bf x}_2
=\frac{1}{(2\pi)^{p/2}\vert\Sigma_{11}\vert^{1/2}}\exp[-\frac{1}{2}
({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]
\end{displaymath}

and the conditional distribution of ${\bf x}_2$ given ${\bf x}_1$ is

\begin{displaymath}f_{2\vert 1}({\bf x}_2\vert{\bf x}_1)=\frac{f({\bf x}_1,{\bf x}_2)}{f_1({\bf x}_1)}
=\frac{1}{(2\pi)^{q/2}\vert A\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_2-b)^T A^{-1}({\bf x}_2-b)]
\end{displaymath}

with

\begin{displaymath}b=\mu_2+\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1) \end{displaymath}


\begin{displaymath}A=\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12} \end{displaymath}
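
The result can be illustrated with a small numerical sketch for the scalar-block case $p=q=1$, using only the Python standard library. The function names `normal_pdf`, `bivariate_pdf` and `conditional_2_given_1` and all numeric values are assumptions made for this example. The code computes $b$ and $A$ and checks that the joint density equals the product of the marginal density of $x_1$ and the conditional density of $x_2$ given $x_1$.

```python
import math

# Scalar-block illustration (p = q = 1); all names and values are hypothetical.

def normal_pdf(x, mean, var):
    """Density of a univariate normal N(x, mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bivariate_pdf(x1, x2, mu1, mu2, s11, s22, s12):
    """Joint normal density for the 2x2 covariance [[s11, s12], [s12, s22]]."""
    d1, d2 = x1 - mu1, x2 - mu2
    det = s11 * s22 - s12 * s12
    q = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def conditional_2_given_1(x1, mu1, mu2, s11, s22, s12):
    """Return (b, A): mean and variance of x2 given x1."""
    b = mu2 + (s12 / s11) * (x1 - mu1)   # mu_2 + Sigma_12^T Sigma_11^{-1} (x1 - mu_1)
    a = s22 - s12 * s12 / s11            # Sigma_22 - Sigma_12^T Sigma_11^{-1} Sigma_12
    return b, a


if __name__ == "__main__":
    mu1, mu2, s11, s22, s12 = 1.0, -1.0, 2.0, 3.0, 0.5
    x1, x2 = 1.5, 0.2
    b, a = conditional_2_given_1(x1, mu1, mu2, s11, s22, s12)
    joint = bivariate_pdf(x1, x2, mu1, mu2, s11, s22, s12)
    product = normal_pdf(x1, mu1, s11) * normal_pdf(x2, b, a)
    print(b, a)
    print(joint, product)  # the two densities should agree up to rounding error
```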


Ruye Wang 2006-11-14