Image size can be easily reduced by subsampling, e.g., getting
rid of every other pixel in each row and column:
or

$$\left[ \begin{array}{cc} 99 & 8 \\ 2 & 4 \end{array} \right] \tag{9}$$

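As a minimal sketch of the four subsampling cases, the snippet below keeps every other pixel of a small image stored as a list of rows. The function name `subsample` and its offset parameters are illustrative choices, not part of the original notes. The image `F` is the $4 \times 4$ example of this section; its last two rows are taken here as `1 4 3 2` and `3 2 1 4`, an assumption consistent with the averaged values quoted later in the section.

```python
# Example 4x4 image. The bottom two rows are an assumption, chosen to be
# consistent with the averaged results quoted in this section.
F = [
    [1, 2, 4, 3],
    [2, 99, 3, 8],
    [1, 4, 3, 2],
    [3, 2, 1, 4],
]


def subsample(image, row_offset=0, col_offset=0, step=2):
    """Keep every `step`-th pixel, starting at the given row/column offsets."""
    return [row[col_offset::step] for row in image[row_offset::step]]


# The four possible phases of 2:1 subsampling in each direction.
for r in (0, 1):
    for c in (0, 1):
        print((r, c), subsample(F, r, c))
```

With offsets `(1, 1)` the sketch keeps the pixels 99, 8, 2 and 4, which is the case shown in (9). Each case keeps only 4 of the 16 pixels.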
In each of the four possible subsampling cases, three fourths of
the information contained in the original image is lost. A better
approach is to use the average of each $2 \times 2$ neighborhood as the
resulting pixel:

$$\left[ \begin{array}{cccc}
1 & 2 & 4 & 3 \\ 2 & 99 & 3 & 8 \\
1 & 4 & 3 & 2 \\ 3 & 2 & 1 & 4 \end{array} \right]
\Rightarrow \left[ \begin{array}{cc}
26 & 4.5 \\ 2.5 & 2.5 \end{array} \right] \tag{10}$$

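The block averaging in (10) can be sketched as follows. This is an illustrative pure-Python version, not a reference implementation: `average_pool` is a hypothetical helper name, and the last two rows of `F` are assumed to be `1 4 3 2` and `3 2 1 4`, which is consistent with the averages in (10).

```python
# Example 4x4 image; the bottom two rows are an assumption consistent
# with the averaged values shown in the text.
F = [
    [1, 2, 4, 3],
    [2, 99, 3, 8],
    [1, 4, 3, 2],
    [3, 2, 1, 4],
]


def average_pool(image, size=2):
    """Average each non-overlapping size x size block of a 2-D list."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for i in range(0, rows - size + 1, size):
        pooled_row = []
        for j in range(0, cols - size + 1, size):
            # Collect the pixels of the block whose top-left corner is (i, j).
            block = [image[i + di][j + dj]
                     for di in range(size) for dj in range(size)]
            pooled_row.append(sum(block) / len(block))
        pooled.append(pooled_row)
    return pooled


print(average_pool(F))
```

Unlike plain subsampling, every input pixel contributes to the output, so the large value 99 still influences the top-left result (26) instead of being either kept verbatim or dropped entirely.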
This is called average pooling in
convolutional neural networks (CNNs). Alternatively, the
maximum of each neighborhood can be used instead, which is called
maximum pooling (max pooling) in CNNs.
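For comparison, here is a small sketch of maximum pooling over the same $2 \times 2$ blocks. The helper `pool` and its `reducer` parameter are hypothetical names introduced for illustration, and the last two rows of the example image are assumed to be `1 4 3 2` and `3 2 1 4`.

```python
# Example 4x4 image; the bottom two rows are an assumption consistent
# with the averaged values shown in this section.
F = [
    [1, 2, 4, 3],
    [2, 99, 3, 8],
    [1, 4, 3, 2],
    [3, 2, 1, 4],
]


def pool(image, reducer=max, size=2):
    """Apply `reducer` to each non-overlapping size x size block."""
    rows, cols = len(image), len(image[0])
    return [
        [
            reducer([image[i + di][j + dj]
                     for di in range(size) for dj in range(size)])
            for j in range(0, cols - size + 1, size)
        ]
        for i in range(0, rows - size + 1, size)
    ]


print(pool(F, max))
```

Under the stated assumption about the image, the maximum of each block is 99, 8, 4 and 4, so the outlier 99 passes through unchanged, whereas average pooling spreads it into the value 26.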
Again, average pooling can be implemented as a two-step
process:
- Regional averaging by convolving with

  $$H_{2 \times 2}=\frac{1}{4}\left[ \begin{array}{cc}
  1 & 1 \\ 1 & 1 \end{array} \right] \tag{11}$$

  to get

  $$f_{4 \times 4} * H_{2 \times 2}=f'_{4 \times 4}
  =\left[ \begin{array}{cccc}
  26 & 27 & 4.5 & 2.75 \\ 26.5 & 27.25 & 4 & 2.5 \\
  2.5 & 2.5 & 2.5 & 1.5 \\ 1.25 & 0.75 & 1.25 & 1.0
  \end{array} \right] \tag{12}$$

- Subsampling $f'_{4 \times 4}$, keeping every other pixel in each row and column,
  to get

  $$\left[ \begin{array}{cc} 26 & 4.5 \\ 2.5 & 2.5 \end{array} \right] \tag{13}$$
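The two-step process above can be sketched as a box filter followed by subsampling. Several details here are assumptions rather than statements from the notes: the helper names `box_filter` and `subsample` are hypothetical, the $2 \times 2$ window is anchored at its top-left pixel, pixels outside the image are treated as zero (a choice that matches the border values 1.5 and 1.0 in the averaged result), and the last two rows of the image are taken as `1 4 3 2` and `3 2 1 4`. Because $H_{2 \times 2}$ is symmetric, flipping the kernel as in a strict convolution does not change the result.

```python
# Example 4x4 image; the bottom two rows are an assumption consistent
# with the averaged values shown in this section.
F = [
    [1, 2, 4, 3],
    [2, 99, 3, 8],
    [1, 4, 3, 2],
    [3, 2, 1, 4],
]


def box_filter(image, size=2):
    """Step 1: average each pixel with its right/lower neighbours.

    The window is anchored at its top-left pixel, and pixels that fall
    outside the image count as zero (zero padding).
    """
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            total = 0
            for di in range(size):
                for dj in range(size):
                    if i + di < rows and j + dj < cols:
                        total += image[i + di][j + dj]
            out_row.append(total / (size * size))
        out.append(out_row)
    return out


def subsample(image, step=2):
    """Step 2: keep every `step`-th pixel in each row and column."""
    return [row[::step] for row in image[::step]]


smoothed = box_filter(F)
for row in smoothed:
    print(row)
print(subsample(smoothed))
```

The subsampled output equals the result of direct block averaging, since the pixels kept in step 2 are exactly those whose windows coincide with the non-overlapping $2 \times 2$ blocks.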