Next: Spatiotemporal energy based motion Up: The Models Previous: Correlation based motion detection

Gradient based motion detection

The visual signal formed on the retina can be treated as a two-dimensional time-varying brightness function $I(x(t),y(t),t)$. If this brightness pattern is moving with a local velocity ${\bf v}=(u(x,y),v(x,y))$, where $u$ and $v$ are the velocity components in the $x$ and $y$ directions, respectively, then the brightness at point $[x(t+\delta t),y(t+\delta t)]=[x+u \delta t, y+v \delta t]$ at time $t+\delta t$ is approximately the same as the brightness at point $[x,y]$ at time $t$ (assuming the luminance of the 3D point has changed very little during the displacement due to motion), i.e.,

\begin{displaymath}I(x(t+\delta t),y(t+\delta t),t+\delta t)=
I(x+u\delta t, y+v\delta t, t+\delta t)=I(x,y,t) \end{displaymath}

In order to estimate the velocity components $u(x,y)$ and $v(x,y)$, the left-hand side of this equation can be expanded into a Taylor series with respect to time, and the equation becomes:

\begin{displaymath}I(x,y,t)+\frac{d}{dt} I(x(t),y(t),t) \;\delta t + \epsilon =I(x,y,t) \end{displaymath}

where $\epsilon$ represents the second- and higher-order terms in $\delta t$ in the expansion. Cancelling $I(x,y,t)$ on both sides, dividing by $\delta t$ and letting $\delta t$ approach 0, the error term $\epsilon/\delta t$ vanishes and the optic flow constraint equation is obtained:

\begin{displaymath}\frac{d}{dt}I(x(t),y(t),t)=\frac{\partial{I}}{\partial{x}} \frac{dx}{dt}
+\frac{\partial{I}}{\partial{y}} \frac{dy}{dt}+\frac{\partial{I}}{\partial{t}}
=I_x u + I_y v + I_t =\bigtriangledown I \cdot {\bf v}+I_t=0
\end{displaymath}

where $I_x$, $I_y$, and $I_t$ represent the partial derivatives of $I(x,y,t)$ with respect to variables $x$, $y$ and $t$, respectively, ${\bf v}$ is the 2D velocity

\begin{displaymath}{\bf v}=(\frac{dx}{dt}, \frac{dy}{dt}) \end{displaymath}

and $\bigtriangledown I$ is the gradient (a vector) of I(x,y):

\begin{displaymath}\bigtriangledown I=(\frac{\partial}{\partial x}, \frac{\partial}{\partial y})I
=(\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y})
=(I_x,I_y)
\end{displaymath}

with $\bigtriangledown$ being the gradient operator:

\begin{displaymath}\bigtriangledown \stackrel{\triangle}{=}
(\frac{\partial}{\partial x}, \frac{\partial}{\partial y})
\end{displaymath}

The optic flow constraint equation $\bigtriangledown I \cdot {\bf v}+I_t=0$ obtained above can be rewritten as

\begin{displaymath}I_t=-\bigtriangledown I \cdot {\bf v} \end{displaymath}

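which indicates that the rate of temporal change in intensity at a point of the scene is the negative of the dot product of its rate of spatial change (the gradient) and its motion velocity. As this single equation has two unknowns $u$ and $v$, the problem is ill-posed in the sense that it does not have a unique solution: only the velocity component along the gradient direction is determined. In order to obtain a unique solution, an additional condition needs to be imposed, such as a smoothness condition requiring neighboring velocities to be similar.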
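To make the ill-posedness concrete, the following minimal sketch computes the only part of the velocity that the constraint fixes, namely the component along the gradient (often called the normal flow). The function name `normal_flow`, the guard `eps`, and the ramp example are assumptions introduced here for illustration; they are not part of the original notes.

```python
import numpy as np

def normal_flow(Ix, Iy, It, eps=1e-12):
    """Velocity component along the gradient: -It * (Ix, Iy) / (Ix^2 + Iy^2).

    eps is an illustrative guard against division by zero where the
    gradient vanishes (homogeneous regions).
    """
    g2 = Ix**2 + Iy**2
    scale = -It / np.maximum(g2, eps)
    return scale * Ix, scale * Iy

def residual(Ix, Iy, It, u, v):
    """Left-hand side of the constraint I_x u + I_y v + I_t."""
    return Ix * u + Iy * v + It

# Hypothetical example: a brightness ramp I = 2x moving with velocity (3, 1),
# so that Ix = 2, Iy = 0 and It = -(2*3 + 0*1) = -6.
Ix, Iy, It = np.array(2.0), np.array(0.0), np.array(-6.0)
un, vn = normal_flow(Ix, Iy, It)
```

In this example the true velocity (3, 1) and the normal flow both give a zero residual, as does (3, v) for any v, which is why a second condition is needed to select one solution.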

This gradient method for detecting motion may seem mathematically involved. However, a more biologically plausible network implementation of this method was developed in [16].

In the discrete case, all spatial functions are represented by 2D arrays and all spatial partial derivatives are replaced by differences between neighboring elements. The smoothness condition then becomes the minimization of the following

\begin{displaymath}s_{i,j} \stackrel{\triangle}{=} (u_x^2 + u_y^2) + (v_x^2 + v_y^2)
=(u_{i,j}-u_{i-1,j})^2+(u_{i,j}-u_{i,j-1})^2+
(v_{i,j}-v_{i-1,j})^2+(v_{i,j}-v_{i,j-1})^2
\end{displaymath}

and the optic flow constraint requires the minimization of the following

\begin{displaymath}c_{i,j} \stackrel{\triangle}{=} (I_x u_{i,j}+I_y v_{i,j}+I_t)^2
\end{displaymath}
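As a rough illustration, the two per-pixel quantities can be evaluated on arrays as follows. This is only a sketch: the function names are hypothetical, the first array axis is taken as the index i and the second as j, and the backward difference is set to zero in the first row and column where no backward neighbor exists (a boundary choice not specified in the text).

```python
import numpy as np

def backward_diff(f, axis):
    """f[i,j] - f[i-1,j] (axis 0) or f[i,j] - f[i,j-1] (axis 1); zero on the border."""
    d = np.zeros_like(f, dtype=float)
    if axis == 0:
        d[1:, :] = f[1:, :] - f[:-1, :]
    else:
        d[:, 1:] = f[:, 1:] - f[:, :-1]
    return d

def smoothness_term(u, v):
    """s[i,j]: sum of squared backward differences of u and v."""
    return (backward_diff(u, 0)**2 + backward_diff(u, 1)**2
            + backward_diff(v, 0)**2 + backward_diff(v, 1)**2)

def constraint_term(u, v, Ix, Iy, It):
    """c[i,j]: squared residual of the optic flow constraint."""
    return (Ix * u + Iy * v + It)**2

# Illustrative 2x2 flow field with v = 0 everywhere.
u = np.array([[0.0, 1.0], [2.0, 3.0]])
v = np.zeros((2, 2))
s = smoothness_term(u, v)
```

A uniform flow field gives zero smoothness cost everywhere, while a flow that satisfies the constraint at every pixel gives zero constraint cost.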

Putting the smoothness condition and the optic flow constraint together, we want to find the $u_{k,l}$ and $v_{k,l}$ that minimize


\begin{displaymath}e \stackrel{\triangle}{=} \sum_i \sum_j (s_{i,j}+\lambda c_{i,j})
\end{displaymath}

where $\lambda$ is a parameter for adjusting the relative importance of the two terms. To do so, we set the partial derivatives of $e$ with respect to $u_{k,l}$ and $v_{k,l}$ to zero:

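\begin{displaymath}\frac{\partial e}{\partial u_{k,l}}=
2(2u_{k,l}-(u_{k-1,l}+u_{k,l-1}))+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_x
=4(u_{k,l}-\overline{u_{k,l}})+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_x=0
\end{displaymath}

\begin{displaymath}\frac{\partial e}{\partial v_{k,l}}=
2(2v_{k,l}-(v_{k-1,l}+v_{k,l-1}))+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_y
=4(v_{k,l}-\overline{v_{k,l}})+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_y=0
\end{displaymath}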

where

\begin{displaymath}\overline{u_{i,j}}\stackrel{\triangle}{=}(u_{i-1,j}+u_{i,j-1})/2
\end{displaymath}

can be considered as the local average of $u_{i,j}$ (and $\overline{v_{i,j}}$ is defined likewise). Note that the constant coefficients 2 and 4 can be absorbed into the arbitrary $\lambda$, and the partial derivative with respect to $v_{k,l}$ can be treated in exactly the same way. Rearranging, we get

\begin{displaymath}(1+\lambda I_x^2)u_{k,l}+\lambda I_xI_yv_{k,l}=\overline{u_{k,l}}-\lambda I_xI_t
\end{displaymath}


\begin{displaymath}\lambda I_xI_yu_{k,l}+(1+\lambda I_y^2)v_{k,l}=\overline{v_{k,l}}-\lambda I_yI_t
\end{displaymath}

This linear equation system of two equations and two unknowns can be solved to get

\begin{displaymath}u_{k,l}=\overline{u_{k,l}}-I_x
\frac{\lambda I_x\overline{u_{k,l}}+\lambda I_y\overline{v_{k,l}}+\lambda I_t}
{(1+\lambda(I_x^2+I_y^2))} \end{displaymath}


\begin{displaymath}v_{k,l}=\overline{v_{k,l}}-I_y
\frac{\lambda I_x\overline{u_{k,l}}+\lambda I_y\overline{v_{k,l}}+\lambda I_t}
{(1+\lambda(I_x^2+I_y^2))} \end{displaymath}
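The closed-form solution can be checked numerically against a direct solve of the two-by-two system, whose diagonal entries are $1+\lambda I_x^2$ and $1+\lambda I_y^2$. The sketch below does this for one pixel; the function names and the sample values are arbitrary choices made here for illustration.

```python
import numpy as np

def solve_closed_form(u_bar, v_bar, Ix, Iy, It, lam):
    """Closed-form solution for one pixel, as in the two formulas above."""
    ratio = lam * (Ix * u_bar + Iy * v_bar + It) / (1.0 + lam * (Ix**2 + Iy**2))
    return u_bar - Ix * ratio, v_bar - Iy * ratio

def solve_direct(u_bar, v_bar, Ix, Iy, It, lam):
    """Solve the 2x2 linear system for (u, v) with a general linear solver."""
    A = np.array([[1.0 + lam * Ix**2, lam * Ix * Iy],
                  [lam * Ix * Iy, 1.0 + lam * Iy**2]])
    b = np.array([u_bar - lam * Ix * It, v_bar - lam * Iy * It])
    return np.linalg.solve(A, b)

# Arbitrary sample values for a single pixel.
args = dict(u_bar=1.0, v_bar=-2.0, Ix=0.5, Iy=-1.2, It=0.3, lam=2.0)
closed = solve_closed_form(**args)
direct = solve_direct(**args)
```

The two results should agree up to floating-point rounding, since the determinant of the system is $1+\lambda(I_x^2+I_y^2)$, the denominator of the closed form.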

This can be readily implemented by an iterative algorithm (Gauss-Seidel method):

\begin{displaymath}u_{k,l}^{n+1}=\overline{u_{k,l}^n}-I_x D \end{displaymath}


\begin{displaymath}v_{k,l}^{n+1}=\overline{v_{k,l}^n}-I_y D \end{displaymath}

where

\begin{displaymath}D \stackrel{\triangle}{=}
\frac{\lambda I_x\overline{u_{k,l}^n}+\lambda I_y\overline{v_{k,l}^n}+\lambda I_t}
{(1+\lambda(I_x^2+I_y^2))}
\end{displaymath}

There are two major problems associated with this gradient based method. First, the method relies on the image gradient $\bigtriangledown I(x,y)=(I_x, I_y)$, which is assumed to be available at every location of the scene. In fact, the gradient carries no information (it is zero) in all homogeneous regions of the image. A possible solution is to interpolate from the gradients of neighboring regions. Second, the smoothness condition assumes that local motion velocities are similar to each other, which causes inaccuracy at all motion discontinuities (boundaries of moving objects). One solution is to smooth the motion velocities only along the tangents of the image discontinuities (boundaries) but not across them (along the gradient).


Ruye Wang
2000-04-25