Signal Processing - May 2017 - 83

interactions of pixels in the short range.
Since the patch size is small, the diversity
of patches is limited. One can use six filters to provide a
good approximation to the 5 × 5 source
patches, and all source patches share the
same six filters regardless of their spatial
location. After subsampling, the second
convolutional layer examines interaction
of pixels in the midrange. After another
subsampling, the whole spatial domain
shrinks to a size of 5 × 5 so that it can take
global interaction into account using full
connection. Typically, the interaction
contains not only spatial but also spectral
elements (e.g., the RGB three channels
and multiple filter responses at the same
spatial location) and all interactions are
modeled by computational neurons as
given in (1).
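The shrinking of the spatial domain described above can be traced with a few lines of arithmetic. The following is a minimal sketch, assuming 5 × 5 filters applied without padding at stride 1 and non-overlapping 2 × 2 subsampling; these settings are assumptions chosen to match the sizes quoted for LeNet-5, and the function names are illustrative only.

```python
def conv_size(size, kernel=5):
    """Side length after an unpadded, stride-1 convolution (assumed setting)."""
    return size - kernel + 1


def subsample_size(size, factor=2):
    """Side length after non-overlapping subsampling by `factor` (assumed setting)."""
    return size // factor


def lenet5_sizes(input_size=32):
    """Trace the spatial side length from the input through C1, S2, C3, S4, and C5."""
    sizes = {"Input": input_size}
    sizes["C1"] = conv_size(sizes["Input"])   # short-range interaction
    sizes["S2"] = subsample_size(sizes["C1"])
    sizes["C3"] = conv_size(sizes["S2"])      # midrange interaction
    sizes["S4"] = subsample_size(sizes["C3"])
    sizes["C5"] = conv_size(sizes["S4"])      # 5 x 5 domain collapses to one point
    return sizes


if __name__ == "__main__":
    for name, side in lenet5_sizes().items():
        print(f"{name}: {side} x {side}")
```

Under these assumptions, the 5 × 5 domain at S4 is covered entirely by one 5 × 5 filter, which is why the next stage behaves as a full connection.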
It is typical to decompose a CNN into
two subnetworks: the feature extraction
(FE) subnet and the decision-making
(DM) subnet. The FE subnet consists
of multiple convolutional layers while
the DM subnet is composed of a couple
of fully connected layers. Roughly speaking, the FE subnet conducts clustering
aiming at a new representation through
a sequence of RECOS transforms. The
DM subnet links data representations to
decision labels, which is similar to the
classification role of MLPs. The exact
boundary between the FE subnet and
the DM subnet is actually blurred in
LeNet-5. It can be either S4 or C5. If we
view S4 as the boundary, then C5 and F6
are two hidden layers of the DM subnet.
On the other hand, if we choose C5 as the
boundary, then there is only one hidden
layer (i.e., F6) in the DM subnet. Actually,
since these two subnets are connected
side by side, the transition from the representation to the classification happens
gradually and smoothly.
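The two boundary choices can be made concrete with a small bookkeeping sketch. The layer names follow LeNet-5; the list `LENET5_LAYERS` and the helper `split_subnets` are hypothetical names introduced only for illustration.

```python
# Layer order of LeNet-5, from the first convolutional layer to the output.
LENET5_LAYERS = ["C1", "S2", "C3", "S4", "C5", "F6", "Output"]


def split_subnets(boundary):
    """Split the layers at `boundary`.

    Returns the FE subnet (up to and including the boundary layer) and the
    hidden layers of the DM subnet (after the boundary, excluding the output).
    """
    idx = LENET5_LAYERS.index(boundary)
    fe_layers = LENET5_LAYERS[: idx + 1]
    dm_hidden = LENET5_LAYERS[idx + 1 : -1]
    return fe_layers, dm_hidden


if __name__ == "__main__":
    for boundary in ("S4", "C5"):
        fe, dm_hidden = split_subnets(boundary)
        print(f"boundary {boundary}: FE = {fe}, DM hidden layers = {dm_hidden}")
```

Choosing S4 leaves C5 and F6 as hidden layers of the DM subnet, while choosing C5 leaves only F6.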
One main advantage of CNNs over
the support vector machine and the random forest classifiers is that the FE task is
automatically done through the BP from
the last layer to the first layer. Generally
speaking, discriminant features are difficult to find for traditional classifiers such
as the MLP, support vector machine, and
random forest. Such tasks, called feature
engineering, demand domain knowledge. Furthermore, it is difficult to argue
that ad hoc features found empirically
are optimal in any sense. This explains
why the traditional computer vision
field is fragmented by different applications. After the emergence of CNNs,
domain knowledge is no longer important
in FE, yet it plays a critical role in data
labeling (known as label engineering).
To give an example, anacondas, vipers,
titanoboas, cobras, and rattlesnakes are
finer classifications of snakes. It requires
expert knowledge to collect and label
their images. The CNN provides a powerful tool in data-driven supervised learning, where the emphasis is shifted from
"extracting features from the source data"
to "constructing data sets by pairing carefully selected data and their labels."

Single-layer RECOS transform

Our discussion applies to x and a if
n = 0 or to the augmented vectors x′ and a′ if
n ≠ 0. For convenience, we only consider the case with n = 0. The generalization to n ≠ 0 is straightforward.

Clustering on sphere's surface
There is a modern interpretation of the
function of a single perceptron layer
based on the clustering notion. Since
data clustering is a well-understood discipline, one can understand the operation
of CNNs better if a connection between
the operation of a perceptron layer and
data clustering can be established. This
link was built in [1]. It will be repeated
below. Let
$$ S = \{ x \mid \|x\| = 1 \} $$

be an N-dimensional unit hypersphere (or
simply sphere). We consider clustering of
points in S using the geodesic distance.
For an arbitrary vector x to be a member
in S, we need to normalize it by its magnitude $g = \|x\|$. If x is an image patch, the
magnitude normalization after its mean
removal has a physical meaning: contrast
adjustment. When g is smaller than a
threshold, the patch is nearly flat. A flat
patch carries little visual information yet
its normalization does amplify noise. In
this case, it is better to treat it as a zero
vector. When g is larger than the threshold, vector x does represent a visual pattern, and humans perceive little difference
between the original and normalized
patches since the contrast has little effect
on visual patterns. Although this normalization procedure is not implemented in
today's CNNs, assuming it significantly simplifies the following mathematical analysis while still
capturing the essence of CNNs.
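The normalization procedure described above can be sketched in plain Python. This is a minimal illustration, not the author's implementation: the threshold value `1e-3` and the function names are assumptions. The helper `geodesic_distance` relies on the standard fact that the geodesic distance between two points on a unit sphere equals the angle between them.

```python
import math


def normalize_patch(patch, threshold=1e-3):
    """Map a patch (flat list of pixel values) onto the unit sphere S.

    The mean is removed first; the magnitude g of the result acts as the
    contrast. A nearly flat patch (g below the assumed threshold) is treated
    as a zero vector so that normalization does not amplify noise.
    Returns the normalized vector and g.
    """
    mean = sum(patch) / len(patch)
    centered = [p - mean for p in patch]
    g = math.sqrt(sum(c * c for c in centered))
    if g < threshold:
        return [0.0] * len(patch), g
    return [c / g for c in centered], g


def geodesic_distance(x, y):
    """Angle (in radians) between two unit vectors, i.e., arc length on S."""
    dot = sum(a * b for a, b in zip(x, y))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, dot)))


if __name__ == "__main__":
    low_contrast, g1 = normalize_patch([1.0, 2.0, 3.0, 4.0])
    high_contrast, g2 = normalize_patch([2.0, 4.0, 6.0, 8.0])
    print(g1, g2)                                      # different contrast
    print(geodesic_distance(low_contrast, high_contrast))  # same pattern
```

In the usage example, the two patches differ only in contrast, so they map to the same point on the sphere, while a constant patch maps to the zero vector.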
The geodesic distance of two points,
$x_i$ and $x_j$ in S, is proportional to the

[Figure 2 diagram: Input 32 × 32 → (convolutions) C1: 6 feature maps at 28 × 28 → (subsampling) S2: 6 feature maps at 14 × 14 → (convolutions) C3: 16 feature maps at 10 × 10 → (subsampling) S4: 16 feature maps at 5 × 5 → (full connection) C5: layer, 120 → (full connection) F6: layer, 84 → (Gaussian connections) Output: 10]

FIGURE 2. The LeNet-5 architecture [7] as an exemplary CNN.
IEEE Signal Processing Magazine | May 2017 | 83


