Gradient based motion detection

The visual signal formed on the retina can be treated as a two-dimensional time-varying brightness function I(x(t), y(t), t). If this function is moving with a local velocity ${\bf v}=(u(x,y),v(x,y))$ , where u and v are the velocity components in the x and y directions, respectively, then the brightness at point $[x(t+\delta t),y(t+\delta t)]=[x+u \delta t, y+v \delta t]$ at time $t+\delta t$ is approximately the same as the brightness at point [x,y] at time t (assuming the luminance of the corresponding 3D point changes very little over the displacement due to motion), i.e.,

$\begin{displaymath}I(x(t+\delta t),y(t+\delta t),t+\delta t)= I(x+u\delta t, y+v\delta t, t+\delta t)=I(x,y,t) \end{displaymath}$

In order to estimate the velocity components u(x,y) and v(x,y), the left-hand side of this equation can be expanded in a Taylor series with respect to time, and the equation becomes:

$\begin{displaymath}I(x,y,t)+\frac{d}{dt} I(x(t),y(t),t) \;\delta t + \epsilon =I(x,y,t) \end{displaymath}$

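where $\epsilon$ represents the second- and higher-order terms in $\delta t$ in the expansion. Cancelling I(x,y,t) on both sides and dividing by $\delta t$ leaves the total derivative plus $\epsilon/\delta t$. When $\delta t$ approaches 0, this remaining error term vanishes and the optic flow constraint equation is obtained: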

$\begin{displaymath}\frac{d}{dt}I(x(t),y(t),t)=\frac{\partial{I}}{\partial{x}} \frac{dx}{dt}+\frac{\partial{I}}{\partial{y}} \frac{dy}{dt}+\frac{\partial{I}}{\partial{t}} =I_x u + I_y v + I_t =\bigtriangledown I \cdot {\bf v}+I_t=0 \end{displaymath}$

where I_x, I_y, and I_t represent the partial derivatives of I(x,y,t) with respect to the variables x, y and t, respectively, ${\bf v}=(u,v)$ is the 2D velocity

$\begin{displaymath}{\bf v}=(\frac{dx}{dt}, \frac{dy}{dt}) \end{displaymath}$

and $\bigtriangledown I$ is the gradient (a vector) of I(x,y):

$\begin{displaymath}\bigtriangledown I=(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}) I =(\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y}) =(I_x,I_y) \end{displaymath}$

with $\bigtriangledown$ being the gradient operator:

$\begin{displaymath}\bigtriangledown \stackrel{\triangle}{=} (\frac{\partial}{\partial x}, \frac{\partial}{\partial y}) \end{displaymath}$

The above equation, called the optic flow constraint equation, can be rewritten as

$\begin{displaymath}I_t=-\bigtriangledown I \cdot {\bf v} \end{displaymath}$

which indicates that the rate of temporal change in the intensity of the scene is the negative of the dot product of its spatial gradient and its motion velocity. As this single equation has two unknowns, u and v, the problem is ill-posed in the sense that it does not have a unique solution. To obtain a unique solution, an additional condition needs to be imposed.
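The constraint and its non-uniqueness can be checked numerically. The following is a minimal sketch, assuming a hypothetical smooth brightness pattern that translates rigidly with a known velocity; the function names, the test pattern, and the step size h are illustrative choices, not part of the original text. It estimates I_x, I_y and I_t by central differences, confirms that the true velocity satisfies the constraint, and shows that adding any multiple of the direction perpendicular to the gradient gives a different velocity that satisfies it equally well.

```python
import math

U_TRUE, V_TRUE = 1.5, -0.5  # assumed true velocity of the test pattern


def brightness(x, y, t):
    """Hypothetical smooth pattern translating rigidly with (U_TRUE, V_TRUE)."""
    return math.sin(0.3 * (x - U_TRUE * t)) + math.cos(0.2 * (y - V_TRUE * t))


def derivatives(f, x, y, t, h=1e-4):
    """Central-difference estimates of I_x, I_y and I_t at (x, y, t)."""
    ix = (f(x + h, y, t) - f(x - h, y, t)) / (2 * h)
    iy = (f(x, y + h, t) - f(x, y - h, t)) / (2 * h)
    it = (f(x, y, t + h) - f(x, y, t - h)) / (2 * h)
    return ix, iy, it


ix, iy, it = derivatives(brightness, 2.0, 3.0, 1.0)

# The true velocity satisfies I_x u + I_y v + I_t = 0 (up to numerical error).
residual = ix * U_TRUE + iy * V_TRUE + it

# Adding a multiple of (-I_y, I_x), which is perpendicular to the gradient,
# leaves the residual unchanged: one equation cannot fix both unknowns.
u_shifted = U_TRUE + 2.0 * (-iy)
v_shifted = V_TRUE + 2.0 * ix
residual_shifted = ix * u_shifted + iy * v_shifted + it

print(residual, residual_shifted)
```

Both residuals are zero to within the finite-difference error, even though the two velocities differ.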

First, the aperture problem can be used to supply such a condition. The aperture problem is encountered by any visual system (artificial or biological) based on an array of sensors with a limited ``aperture'' (the receptive field, for neurons). Each sensor sees only a small local area (represented by the circles in the figure) and can therefore detect only the velocity component perpendicular to the most salient line feature inside the aperture, such as a piece of an edge or a boundary, rather than the true 2D motion. The normal direction of such an oriented feature is given by the brightness gradient $\bigtriangledown I(x,y)$ (the direction along which the brightness changes most quickly), so an additional equation can be added requiring the velocity ${\bf v}=(u,v)$ to be parallel to the gradient:

$\begin{displaymath}\frac{u}{I_x}=\frac{v}{I_y}\;\;\;\;\;i.e.,\;\;\;\;\;I_y u - I_x v =0 \end{displaymath}$

Now the velocity (u,v) satisfying both this constraining condition and the optic flow constraint equation obtained previously can be uniquely found to be

$\begin{displaymath}\left\{ \begin{array}{l} u=-I_t I_x/(I_x^2+I_y^2) \\ v=-I_t I_y/(I_x^2+I_y^2) \end{array} \right. \end{displaymath}$
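As a small worked example, the closed-form solution can be evaluated directly from the three derivatives at one location. This is only a sketch: the function name `normal_flow` and the threshold `eps` used to reject a vanishing gradient are assumptions made for illustration.

```python
import math


def normal_flow(ix, iy, it, eps=1e-12):
    """Velocity parallel to the gradient that satisfies I_x u + I_y v + I_t = 0.

    Returns None where the squared gradient magnitude is below eps (an
    assumed threshold), since the solution is undefined there.
    """
    g2 = ix * ix + iy * iy
    if g2 < eps:
        return None
    return (-it * ix / g2, -it * iy / g2)


# Example derivatives: gradient (3, 4), temporal derivative -10.
u, v = normal_flow(3.0, 4.0, -10.0)
speed = math.hypot(u, v)
print(u, v, speed)  # approximately 1.2, 1.6 and 2.0
```

With these values the constraint reads 3(1.2) + 4(1.6) - 10 = 0, and the velocity (1.2, 1.6) is parallel to the gradient (3, 4).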

The magnitude of this velocity, i.e., the speed of the optical flow along the brightness gradient, is

$\begin{displaymath}\left\vert {\bf v} \right\vert=\sqrt{u^2+v^2}=\frac{\left\vert I_t \right\vert}{\sqrt{I_x^2+I_y^2}} \end{displaymath}$

Alternatively, the ill-posed problem can be solved using a so-called regularization method that imposes a smoothness condition on the velocity. This method minimizes the error in the optic flow constraint equation over the entire visual field

$\begin{displaymath}\int \int (I_x u+I_y v +I_t)^2 dx dy \longrightarrow min. \end{displaymath}$

under the condition that the velocity (u,v) should be as smooth as possible, i.e., the magnitudes of the velocity gradients (representing local spatial changes of the velocity) should be minimized:

$\begin{displaymath}\int \int [ \left\vert \bigtriangledown u \right\vert^2 + \left\vert \bigtriangledown v \right\vert^2 ] dx\;dy =\int \int [ (u_x^2 + u_y^2 ) + (v_x^2 + v_y^2 ) ] dx\;dy \longrightarrow min. \end{displaymath}$

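This is a problem in the calculus of variations. Combining the two integrals into a single functional, with the constraint term weighted by a parameter $\lambda$ relative to the smoothness term, the associated Euler-Lagrange equations are

$\begin{displaymath}\left\{ \begin{array}{l} \bigtriangledown^2 u= \lambda (I_xu+I_yv+I_t) I_x \\ \bigtriangledown^2 v= \lambda (I_xu+I_yv+I_t) I_y \end{array} \right. \end{displaymath}$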

where

$\begin{displaymath}\bigtriangledown^2 \stackrel{\triangle}{=} \bigtriangledown \cdot \bigtriangledown =\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2} \end{displaymath}$

is the Laplacian operator. This pair of partial differential equations can be solved numerically using iterative methods.

This gradient method for detecting motion may seem mathematically involved. However, a more biologically plausible network implementation of the method was developed in [16].

In the discrete case, all spatial functions are represented by 2D arrays and all spatial partial derivatives are replaced by differences between neighboring elements. The smoothness condition then becomes the minimization of the following

$\begin{displaymath}s_{i,j} \stackrel{\triangle}{=} (u_x^2 + u_y^2) + (v_x^2 + v_y^2) =(u_{i,j}-u_{i-1,j})^2+(u_{i,j}-u_{i,j-1})^2+ (v_{i,j}-v_{i-1,j})^2+(v_{i,j}-v_{i,j-1})^2 \end{displaymath}$

and the optic flow constraint requires the minimization of the following

$\begin{displaymath}c_{i,j} \stackrel{\triangle}{=} (I_x u_{i,j}+I_y v_{i,j}+I_t)^2 \end{displaymath}$

Putting the smoothness condition and the optic flow constraint together, we want to find the values $u_{k,l}$ and $v_{k,l}$ that minimize

$\begin{displaymath}e \stackrel{\triangle}{=} \sum_i \sum_j (s_{i,j}+\lambda c_{i,j}) \end{displaymath}$

where $\lambda$ is a parameter for adjusting the relative importance of the two terms. To do so, we set the partial derivatives of e with respect to $u_{k,l}$ and $v_{k,l}$ to zero:

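$\begin{displaymath}\frac{\partial e}{\partial u_{k,l}}= 2[4u_{k,l}-(u_{k-1,l}+u_{k+1,l}+u_{k,l-1}+u_{k,l+1})]+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_x =8(u_{k,l}-\overline{u_{k,l}})+2\lambda(I_xu_{k,l}+I_yv_{k,l}+I_t)I_x=0 \end{displaymath}$

where

$\begin{displaymath}\overline{u_{k,l}}\stackrel{\triangle}{=}(u_{k-1,l}+u_{k+1,l}+u_{k,l-1}+u_{k,l+1})/4 \end{displaymath}$

can be considered as the local average of $u_{k,l}$ over its four nearest neighbors. All four neighbors appear because $u_{k,l}$ contributes not only to $s_{k,l}$ but also to $s_{k+1,l}$ and $s_{k,l+1}$. Note that the constant coefficients 2 and 8 can be absorbed into the arbitrary $\lambda$ , and the derivative with respect to $v_{k,l}$ can be treated in exactly the same way. Now we can get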

$\begin{displaymath}(1+\lambda I_x^2)u_{k,l}+\lambda I_xI_yv_{k,l}=\overline{u_{k,l}}-\lambda I_xI_t \end{displaymath}$

$\begin{displaymath}\lambda I_xI_yu_{k,l}+(1+\lambda I_y^2)v_{k,l}=\overline{v_{k,l}}-\lambda I_yI_t \end{displaymath}$

This linear equation system of two equations and two unknowns can be solved to get

$\begin{displaymath}u_{k,l}=\overline{u_{k,l}}-I_x \frac{\lambda I_x\overline{u_{k,l}}+\lambda I_y\overline{v_{k,l}}+\lambda I_t} {(1+\lambda(I_x^2+I_y^2))} \end{displaymath}$

$\begin{displaymath}v_{k,l}=\overline{v_{k,l}}-I_y \frac{\lambda I_x\overline{u_{k,l}}+\lambda I_y\overline{v_{k,l}}+\lambda I_t} {(1+\lambda(I_x^2+I_y^2))} \end{displaymath}$
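The closed-form solution can be checked against a direct solve of the two-by-two system. The sketch below is an illustration only: the function names and the sample numbers are arbitrary, and the direct solve uses Cramer's rule with diagonal coefficients 1 + lam*I_x^2 and 1 + lam*I_y^2.

```python
def solve_direct(ix, iy, it, u_bar, v_bar, lam):
    """Solve the 2x2 linear system for (u, v) by Cramer's rule."""
    a11 = 1.0 + lam * ix * ix
    a12 = lam * ix * iy
    a22 = 1.0 + lam * iy * iy
    b1 = u_bar - lam * ix * it
    b2 = v_bar - lam * iy * it
    # The determinant simplifies to 1 + lam*(ix^2 + iy^2), which is positive
    # for lam >= 0, so the system always has a unique solution.
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def solve_closed_form(ix, iy, it, u_bar, v_bar, lam):
    """Evaluate the closed-form expressions for (u, v)."""
    d = lam * (ix * u_bar + iy * v_bar + it) / (1.0 + lam * (ix * ix + iy * iy))
    return (u_bar - ix * d, v_bar - iy * d)


# Arbitrary sample values for one array element.
args = (0.8, -0.3, 0.5, 1.2, -0.7, 2.5)
u_direct, v_direct = solve_direct(*args)
u_closed, v_closed = solve_closed_form(*args)
print(abs(u_direct - u_closed), abs(v_direct - v_closed))
```

The two printed differences are at the level of floating-point rounding.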

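This can be readily implemented by an iterative algorithm (the Gauss-Seidel method if updated values are used as soon as they become available, or the Jacobi method if every element is updated from the previous iterate, as written here):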

$\begin{displaymath}u_{k,l}^{n+1}=\overline{u_{k,l}^n}-I_x D \end{displaymath}$

$\begin{displaymath}v_{k,l}^{n+1}=\overline{v_{k,l}^n}-I_y D \end{displaymath}$

where

$\begin{displaymath}D \stackrel{\triangle}{=} \frac{\lambda I_x\overline{u_{k,l}^n}+\lambda I_y\overline{v_{k,l}^n}+\lambda I_t} {(1+\lambda(I_x^2+I_y^2))} \end{displaymath}$
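The update rule can be sketched in a few lines of plain Python. Several choices here are assumptions made for illustration rather than details given in the text: the local average is taken over the four nearest neighbors, border values are replicated, the velocity starts from zero, and every element is updated from the previous iterate (updating the arrays in place during the sweep would give the Gauss-Seidel variant). The names `local_average` and `iterate_flow` are hypothetical.

```python
import math


def local_average(f, k, l):
    """Mean of the four nearest neighbors, replicating values at the border."""
    rows, cols = len(f), len(f[0])
    return (f[max(k - 1, 0)][l] + f[min(k + 1, rows - 1)][l]
            + f[k][max(l - 1, 0)] + f[k][min(l + 1, cols - 1)]) / 4.0


def iterate_flow(ix, iy, it, lam=1.0, n_iter=100):
    """Iterate u^{n+1} = avg(u^n) - I_x D and v^{n+1} = avg(v^n) - I_y D."""
    rows, cols = len(ix), len(ix[0])
    u = [[0.0] * cols for _ in range(rows)]
    v = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_iter):
        u_new = [[0.0] * cols for _ in range(rows)]
        v_new = [[0.0] * cols for _ in range(rows)]
        for k in range(rows):
            for l in range(cols):
                u_bar = local_average(u, k, l)
                v_bar = local_average(v, k, l)
                gx, gy, gt = ix[k][l], iy[k][l], it[k][l]
                d = lam * (gx * u_bar + gy * v_bar + gt) / (1.0 + lam * (gx * gx + gy * gy))
                u_new[k][l] = u_bar - gx * d
                v_new[k][l] = v_bar - gy * d
        u, v = u_new, v_new
    return u, v


# Synthetic test data: unit gradients whose direction varies from element to
# element, with I_t chosen so that the uniform velocity (1.0, 0.5) satisfies
# the optic flow constraint exactly everywhere.
N = 5
theta = [[k + 2.0 * l for l in range(N)] for k in range(N)]
ix = [[math.cos(a) for a in row] for row in theta]
iy = [[math.sin(a) for a in row] for row in theta]
it = [[-(ix[k][l] * 1.0 + iy[k][l] * 0.5) for l in range(N)] for k in range(N)]

u, v = iterate_flow(ix, iy, it, lam=10.0, n_iter=200)
print(u[2][2], v[2][2])  # close to 1.0 and 0.5
```

Because the gradient direction differs between neighboring elements in this test, the smoothness term propagates each element's single constraint to its neighbors and the full 2D velocity is recovered, which no element could determine on its own.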

There are two major problems associated with this gradient based method. First, the method relies on the image gradient $\bigtriangledown I(x,y)=(I_x, I_y)$, which is assumed to be available at every location of the scene. In fact, the gradient carries no information (it is zero) in all homogeneous regions of the image. A possible solution is to interpolate from the gradients of neighboring regions. Second, the smoothness condition assumes that neighboring local velocities are similar to each other, which causes inaccuracy at all motion discontinuities (the boundaries of moving objects). One solution is to smooth the velocities only along the tangents of the image discontinuities (boundaries) but not across them (along the gradient).


Ruye Wang
2000-04-25