Marginal and conditional distributions of multivariate normal distribution


Assume an n-dimensional random vector

$\begin{displaymath}{\bf x}=\left[\begin{array}{c}{\bf x}_1 \\ {\bf x}_2\end{array}\right] \end{displaymath}$

has a normal distribution $N({\bf x},\mu,\Sigma)$ with

$\begin{displaymath}\mu=\left[\begin{array}{c}\mu_1 \\ \mu_2\end{array}\right],\;\;\;\; \Sigma=\left[\begin{array}{cc}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21}&\Sigma_{22}\end{array}\right] \end{displaymath}$

where ${\bf x}_1$ and ${\bf x}_2$ are two subvectors of respective dimensions $p$ and $q$ with $p+q=n$. Note that $\Sigma=\Sigma^T$, and $\Sigma_{21}=\Sigma_{12}^T$.

Theorem 4:

Part a The marginal distributions of ${\bf x}_1$ and ${\bf x}_2$ are also normal with mean vector $\mu_i$ and covariance matrix $\Sigma_{ii}$ ($i=1,2$), respectively.

Part b The conditional distribution of ${\bf x}_i$ given ${\bf x}_j$ is also normal with mean vector

$\begin{displaymath}\mu_{i\vert j}=\mu_i+\Sigma_{ij}\Sigma_{jj}^{-1}({\bf x}_j-\mu_j) \end{displaymath}$

and covariance matrix

$\begin{displaymath}\Sigma_{i\vert j}=\Sigma_{ii}-\Sigma_{ij}\Sigma_{jj}^{-1}\Sigma_{ji} \end{displaymath}$
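
As a minimal numerical sketch of Part b (not part of the original notes), the NumPy function below computes the mean $\mu_2+\Sigma_{21}\Sigma_{11}^{-1}({\bf x}_1-\mu_1)$ and covariance $\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ of ${\bf x}_2$ given ${\bf x}_1$. The function name `condition_on_first_block`, the argument `p` (the dimension of ${\bf x}_1$), and the example numbers are assumptions made for illustration.

```python
import numpy as np

def condition_on_first_block(mu, Sigma, p, x1):
    """Return mean and covariance of x2 given x1 for x ~ N(mu, Sigma).

    The first p components of x form x1 and the remaining ones form x2.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    mu1, mu2 = mu[:p], mu[p:]
    S11, S12 = Sigma[:p, :p], Sigma[:p, p:]
    S21, S22 = Sigma[p:, :p], Sigma[p:, p:]
    # Solve linear systems instead of forming Sigma_11^{-1} explicitly.
    cond_mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
    cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return cond_mean, cond_cov

# Hypothetical 2-D example with p = q = 1.
mu = [0.0, 1.0]
Sigma = [[2.0, 1.0],
         [1.0, 2.0]]
mean_2_given_1, cov_2_given_1 = condition_on_first_block(mu, Sigma, p=1, x1=[2.0])
print(mean_2_given_1, cov_2_given_1)
```

Conditioning on ${\bf x}_2$ instead amounts to swapping the roles of the two blocks. Note that the conditional covariance does not depend on the observed value of ${\bf x}_1$.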

Proof: The joint density of ${\bf x}$ is:

$\begin{displaymath}f({\bf x})=f({\bf x}_1,{\bf x}_2)=\frac{1}{(2\pi)^{n/2}\vert\Sigma\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)] \end{displaymath}$

where $Q({\bf x}_1,{\bf x}_2)$ is defined as

$\displaystyle Q({\bf x}_1,{\bf x}_2)$	$\textstyle =$	$\displaystyle ({\bf x}-\mu)^T\Sigma^{-1}({\bf x}-\mu)$
	$\textstyle =$	$\displaystyle [({\bf x}_1-\mu_1)^T, ({\bf x}_2-\mu_2)^T] \left[\begin{array}{cc}\Sigma^{11} & \Sigma^{12}\\ \Sigma^{21}&\Sigma^{22}\end{array}\right] \left[\begin{array}{c}{\bf x}_1-\mu_1 \\ {\bf x}_2-\mu_2\end{array}\right]$
	$\textstyle =$	$\displaystyle ({\bf x}_1-\mu_1)^T\Sigma^{11}({\bf x}_1-\mu_1)+ 2({\bf x}_1-\mu_1)^T\Sigma^{12}({\bf x}_2-\mu_2)+ ({\bf x}_2-\mu_2)^T\Sigma^{22}({\bf x}_2-\mu_2)$

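Here the blocks of the inverse covariance matrix are denoted by superscripts:

$\begin{displaymath}\Sigma^{-1}=\left[\begin{array}{cc}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21}&\Sigma_{22}\end{array}\right]^{-1} =\left[\begin{array}{cc}\Sigma^{11} & \Sigma^{12}\\ \Sigma^{21}&\Sigma^{22}\end{array}\right] \end{displaymath}$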

According to theorem 2, we have

$\begin{displaymath}\Sigma^{11}=(\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^T)^{-1} =\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1} \end{displaymath}$

$\begin{displaymath}\Sigma^{22}=(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1} =\Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{12}^T(\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^T)^{-1}\Sigma_{12}\Sigma_{22}^{-1} \end{displaymath}$

$\begin{displaymath}\Sigma^{12}=-\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}=(\Sigma^{21})^T \end{displaymath}$
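
The block formulas above can be checked numerically. The sketch below is an illustration rather than part of the proof: it builds a random symmetric positive definite $\Sigma$, inverts it directly, and compares the blocks of the inverse with each expression. The dimensions, the random seed, and all variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 3  # hypothetical block dimensions

# Random symmetric positive definite covariance matrix (illustrative only).
M = rng.standard_normal((p + q, p + q))
Sigma = M @ M.T + (p + q) * np.eye(p + q)
S11, S12, S22 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]

# Blocks of the inverse: P11 = Sigma^{11}, P12 = Sigma^{12}, and so on.
P = np.linalg.inv(Sigma)
P11, P12, P21, P22 = P[:p, :p], P[:p, p:], P[p:, :p], P[p:, p:]

inv = np.linalg.inv
schur22 = S22 - S12.T @ inv(S11) @ S12  # Sigma_22 - Sigma_12^T Sigma_11^{-1} Sigma_12
schur11 = S11 - S12 @ inv(S22) @ S12.T  # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_12^T

checks = {
    "Sigma^11, first form": np.allclose(P11, inv(schur11)),
    "Sigma^11, second form": np.allclose(
        P11, inv(S11) + inv(S11) @ S12 @ inv(schur22) @ S12.T @ inv(S11)),
    "Sigma^22, first form": np.allclose(P22, inv(schur22)),
    "Sigma^22, second form": np.allclose(
        P22, inv(S22) + inv(S22) @ S12.T @ inv(schur11) @ S12 @ inv(S22)),
    "Sigma^12": np.allclose(P12, -inv(S11) @ S12 @ inv(schur22)),
    "Sigma^12 = (Sigma^21)^T": np.allclose(P12, P21.T),
}
for name, ok in checks.items():
    print(name, ok)
```

Explicit inverses are used here only to mirror the formulas term by term; in practice a linear solve is usually preferable.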

Substituting the second expression for $\Sigma^{11}$, the first expression for $\Sigma^{22}$, and the expression for $\Sigma^{12}$ into $Q({\bf x}_1,{\bf x}_2)$, we get:

$\displaystyle Q({\bf x}_1,{\bf x}_2)$	$\textstyle =$	$\displaystyle ({\bf x}_1-\mu_1)^T [\Sigma_{11}^{-1}+\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1}] ({\bf x}_1-\mu_1)$
		$\displaystyle -2({\bf x}_1-\mu_1)^T [\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]({\bf x}_2-\mu_2)$
		$\displaystyle +({\bf x}_2-\mu_2)^T [(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}] ({\bf x}_2-\mu_2)$
	$\textstyle =$	$\displaystyle ({\bf x}_1-\mu_1)^T \Sigma_{11}^{-1} ({\bf x}_1-\mu_1)$
		$\displaystyle +({\bf x}_1-\mu_1)^T [\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}\Sigma_{12}^T\Sigma_{11}^{-1}] ({\bf x}_1-\mu_1)$
		$\displaystyle -2({\bf x}_1-\mu_1)^T [\Sigma_{11}^{-1}\Sigma_{12}(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}]({\bf x}_2-\mu_2)$
		$\displaystyle +({\bf x}_2-\mu_2)^T [(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}] ({\bf x}_2-\mu_2)$
	$\textstyle =$	$\displaystyle ({\bf x}_1-\mu_1)^T \Sigma_{11}^{-1} ({\bf x}_1-\mu_1)$
		$\displaystyle +[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]^T (\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1} [({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]$

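The last equality follows from the identity below, which holds for any vectors $u$ and $v$ and any symmetric matrix $A=A^T$. Here it is applied with $u=\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)$, $v={\bf x}_2-\mu_2$, and the symmetric matrix $(\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1}$ in the role of $A$: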

		$\displaystyle u^TAu-2u^TAv+v^TAv=u^T Au-u^TAv-u^TAv+v^TAv$
	$\textstyle =$	$\displaystyle u^TA(u-v)-(u-v)^TAv=u^TA(u-v)-v^TA(u-v)$
	$\textstyle =$	$\displaystyle (u-v)^TA(u-v)=(v-u)^TA(v-u)$

We define

$\begin{displaymath}b\stackrel{\triangle}{=}\mu_2+\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1) \end{displaymath}$

$\begin{displaymath}A\stackrel{\triangle}{=}\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12} \end{displaymath}$

and

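$\begin{displaymath}\left\{ \begin{array}{ll} Q_1({\bf x}_1)&\stackrel{\triangle}{=}({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)\\ Q_2({\bf x}_1,{\bf x}_2)&\stackrel{\triangle}{=}[({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]^T (\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12})^{-1} [({\bf x}_2-\mu_2)-\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]\\ &=({\bf x}_2-b)^T A^{-1}({\bf x}_2-b) \end{array} \right. \end{displaymath}$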

and get

$\begin{displaymath}Q({\bf x}_1,{\bf x}_2)=Q_1({\bf x}_1)+Q_2({\bf x}_1,{\bf x}_2) \end{displaymath}$
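
The decomposition $Q=Q_1+Q_2$ can be sanity-checked on random data. The following sketch evaluates the full quadratic form and the two pieces separately, using $b$ and $A$ as defined above; the dimensions, seed, and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 2, 2  # hypothetical block dimensions

# Random symmetric positive definite covariance, mean, and test point.
M = rng.standard_normal((p + q, p + q))
Sigma = M @ M.T + (p + q) * np.eye(p + q)
mu = rng.standard_normal(p + q)
x = rng.standard_normal(p + q)

S11, S12, S22 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]
mu1, mu2 = mu[:p], mu[p:]
x1, x2 = x[:p], x[p:]

# Full quadratic form Q = (x - mu)^T Sigma^{-1} (x - mu).
d = x - mu
Q = d @ np.linalg.solve(Sigma, d)

# The two pieces, with b and A as in the text.
b = mu2 + S12.T @ np.linalg.solve(S11, x1 - mu1)
A = S22 - S12.T @ np.linalg.solve(S11, S12)
Q1 = (x1 - mu1) @ np.linalg.solve(S11, x1 - mu1)
Q2 = (x2 - b) @ np.linalg.solve(A, x2 - b)

print(Q, Q1 + Q2)
```

The two printed values should agree up to floating-point rounding.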

Now the joint distribution can be written as:

$\displaystyle f({\bf x})$	$\textstyle =$	$\displaystyle f({\bf x}_1,{\bf x}_2) =\frac{1}{(2\pi)^{n/2}\vert\Sigma\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)]$
	$\textstyle =$	$\displaystyle \frac{1}{(2\pi)^{n/2}\vert\Sigma_{11}\vert^{1/2} \vert\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12}\vert^{1/2}}\exp[-\frac{1}{2}Q({\bf x}_1,{\bf x}_2)]$
	$\textstyle =$	$\displaystyle \frac{1}{(2\pi)^{p/2}\vert\Sigma_{11}\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)]\; \frac{1}{(2\pi)^{q/2}\vert A\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_2-b)^T A^{-1}({\bf x}_2-b)]$
	$\textstyle =$	$\displaystyle N({\bf x}_1,\mu_1,\Sigma_{11})\;N({\bf x}_2,b,A)$

The third equality follows from theorem 3:

$\begin{displaymath}\vert\Sigma\vert=\vert\Sigma_{11}\vert \vert\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12}\vert\end{displaymath}$
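
A quick numerical illustration of this determinant factorization, again on a random symmetric positive definite matrix (dimensions, seed, and names are assumptions made for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 2  # hypothetical block dimensions

M = rng.standard_normal((p + q, p + q))
Sigma = M @ M.T + (p + q) * np.eye(p + q)
S11, S12, S22 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]

# Schur complement of Sigma_11 in Sigma, called A in the text.
A = S22 - S12.T @ np.linalg.solve(S11, S12)

det_full = np.linalg.det(Sigma)
det_factored = np.linalg.det(S11) * np.linalg.det(A)
print(det_full, det_factored)
```

For large matrices the determinants can overflow or underflow, so comparing log-determinants (for example via `np.linalg.slogdet`) is the more robust choice.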

The marginal distribution of ${\bf x}_1$ is

$\begin{displaymath}f_1({\bf x}_1)=\int f({\bf x}_1,{\bf x}_2) d{\bf x}_2 =\frac{1}{(2\pi)^{p/2}\vert\Sigma_{11}\vert^{1/2}}\exp[-\frac{1}{2} ({\bf x}_1-\mu_1)^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1)] \end{displaymath}$

and the conditional distribution of ${\bf x}_2$ given ${\bf x}_1$ is

$\begin{displaymath}f_{2\vert 1}({\bf x}_2\vert{\bf x}_1)=\frac{f({\bf x}_1,{\bf x}_2)}{f_1({\bf x}_1)} =\frac{1}{(2\pi)^{q/2}\vert A\vert^{1/2}}\exp[-\frac{1}{2}({\bf x}_2-b)^T A^{-1}({\bf x}_2-b)] \end{displaymath}$

with

$\begin{displaymath}b=\mu_2+\Sigma_{12}^T\Sigma_{11}^{-1}({\bf x}_1-\mu_1) \end{displaymath}$

$\begin{displaymath}A=\Sigma_{22}-\Sigma_{12}^T\Sigma_{11}^{-1}\Sigma_{12} \end{displaymath}$
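
To tie the result together, the sketch below checks at one point that the joint density equals the product of the marginal density of ${\bf x}_1$ and the conditional density of ${\bf x}_2$ given ${\bf x}_1$, working with logarithms. The helper `log_normal_pdf` and the numbers in the example are hypothetical and serve only as illustration.

```python
import numpy as np

def log_normal_pdf(x, mean, cov):
    """Log density of N(mean, cov) at x, written out from the density formula."""
    x, mean, cov = (np.asarray(a, dtype=float) for a in (x, mean, cov))
    d = x - mean
    k = d.size
    _, logdet = np.linalg.slogdet(cov)  # log of the determinant of cov
    return -0.5 * (k * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(cov, d))

# Hypothetical 3-D example with p = 1 and q = 2.
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
p = 1
x = np.array([0.2, 0.1, -0.7])

S11, S12, S22 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]
# Conditional mean b and covariance A of x2 given x1, as in the text.
b = mu[p:] + S12.T @ np.linalg.solve(S11, x[:p] - mu[:p])
A = S22 - S12.T @ np.linalg.solve(S11, S12)

log_joint = log_normal_pdf(x, mu, Sigma)
log_marginal = log_normal_pdf(x[:p], mu[:p], S11)
log_conditional = log_normal_pdf(x[p:], b, A)
print(log_joint, log_marginal + log_conditional)
```

Agreement of the two printed values at arbitrary test points is consistent with the factorization $f({\bf x}_1,{\bf x}_2)=N({\bf x}_1,\mu_1,\Sigma_{11})\,N({\bf x}_2,b,A)$, though it is of course not a substitute for the proof.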


Ruye Wang 2006-11-14