
Vector and matrix differentiation

A vector differentiation operator is defined as

\begin{displaymath}\frac{d}{d{\bf x}}\stackrel{\triangle}{=}
\left[ \frac{\partial}{\partial x_1},\cdots, \frac{\partial}{\partial x_n} \right]^T
\end{displaymath}

which can be applied to any scalar function $f({\bf x})$ to find its derivative with respect to ${\bf x}$:

\begin{displaymath}\frac{d}{d{\bf x}} f({\bf x}) =
\left[ \frac{\partial f}{\partial x_1},\cdots, \frac{\partial f}{\partial x_n} \right]^T
\end{displaymath}
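
As a small illustration (the function here is chosen only for this example and is not from the original text), let $n=2$ and $f({\bf x})=x_1^2+3x_1x_2$. Applying the operator elementwise gives

\begin{displaymath}\frac{d}{d{\bf x}} f({\bf x}) =
\left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2} \right]^T
=\left[ 2x_1+3x_2,\; 3x_1 \right]^T
\end{displaymath}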

Vector differentiation has the following properties:

\begin{displaymath}
\frac{d}{d{\bf x}}({\bf b}^T{\bf x})=\frac{d}{d{\bf x}}({\bf x}^T{\bf b})={\bf b}
\end{displaymath}


\begin{displaymath}
\frac{d}{d{\bf x}}({\bf x}^T{\bf x})=2{\bf x}
\end{displaymath}


\begin{displaymath}
\frac{d}{d{\bf x}}({\bf x}^T{\bf A}{\bf x})=({\bf A}^T+{\bf A}){\bf x}
\end{displaymath}
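
The first two properties can be checked elementwise. As a brief sketch, writing $b_i$ for the $i$th element of ${\bf b}$, the $k$th elements of the two derivatives are

\begin{displaymath}
\frac{\partial}{\partial x_k}({\bf b}^T{\bf x})
=\frac{\partial}{\partial x_k}\sum_{i=1}^n b_i x_i = b_k,
\;\;\;\;\;\;\;\;
\frac{\partial}{\partial x_k}({\bf x}^T{\bf x})
=\frac{\partial}{\partial x_k}\sum_{i=1}^n x_i^2 = 2x_k
\;\;\;\;\;\;\;\;(k=1, \cdots, n)
\end{displaymath}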

To prove the third one, consider the $k$th element of the vector:

\begin{displaymath}
\frac{\partial}{\partial x_k} ({\bf x}^{T}{\bf A}{\bf x})
=\frac{\partial}{\partial x_k}\sum_{i=1}^n\sum_{j=1}^n a_{ij}x_ix_j
=\sum_{i=1}^n a_{ik}x_i+\sum_{j=1}^n a_{kj}x_j
\;\;\;\;\;\;\;\;(k=1, \cdots, n)
\end{displaymath}

Putting all $n$ elements in vector form gives $({\bf A}^T+{\bf A}){\bf x}$, which is the third property above. If ${\bf A}$ is symmetric, i.e., ${\bf A}^T={\bf A}$, then we have

\begin{displaymath}
\frac{d}{d{\bf x}}({\bf x}^T{\bf A}{\bf x})=2{\bf A}{\bf x}
\end{displaymath}

In particular, when ${\bf A}={\bf I}$, we have

\begin{displaymath}
\frac{d}{d{\bf x}}({\bf x}^T{\bf x})=2{\bf x}
\end{displaymath}

You can compare these results with the familiar derivatives in the scalar case:

\begin{displaymath}\frac{d}{dx}(ax^2)=2ax \end{displaymath}

A matrix differentiation operator is defined as

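\begin{displaymath}\frac{d}{d{\bf A}}\stackrel{\triangle}{=}\left[ \begin{array}{ccc}
\frac{\partial}{\partial a_{11}} & \cdots & \frac{\partial}{\partial a_{1n}} \\
\vdots & \ddots & \vdots \\
\frac{\partial}{\partial a_{m1}} & \cdots & \frac{\partial}{\partial a_{mn}}
\end{array} \right] \end{displaymath}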

which can be applied to any scalar function $f({\bf A})$:

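\begin{displaymath}\frac{d}{d{\bf A}}f({\bf A})=\left[ \begin{array}{ccc}
\frac{\partial f({\bf A})}{\partial a_{11}} & \cdots & \frac{\partial f({\bf A})}{\partial a_{1n}} \\
\vdots & \ddots & \vdots \\
\frac{\partial f({\bf A})}{\partial a_{m1}} & \cdots & \frac{\partial f({\bf A})}{\partial a_{mn}}
\end{array} \right] \end{displaymath}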

Specifically, consider $f({\bf A})={\bf u}^T {\bf A} {\bf v}$, where ${\bf u}$ and ${\bf v}$ are $m\times 1$ and $n\times 1$ constant vectors, respectively, and ${\bf A}$ is an $m\times n$ matrix. Then we have:

\begin{displaymath}\frac{d}{d{\bf A}} ({\bf u}^T {\bf A} {\bf v}) = {\bf u} {\bf v}^T \end{displaymath}
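
One way to see this is to work elementwise. As a sketch, writing $u_i$ and $v_j$ for the elements of ${\bf u}$ and ${\bf v}$, only the term containing $a_{ij}$ survives the differentiation:

\begin{displaymath}
\frac{\partial}{\partial a_{ij}} ({\bf u}^T {\bf A} {\bf v})
=\frac{\partial}{\partial a_{ij}} \sum_{k=1}^m\sum_{l=1}^n u_k a_{kl} v_l
=u_i v_j
\;\;\;\;\;\;\;\;(i=1, \cdots, m;\;\; j=1, \cdots, n)
\end{displaymath}

which is the element in row $i$ and column $j$ of the $m\times n$ matrix ${\bf u}{\bf v}^T$.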


Ruye Wang 2015-04-27