where $f(\cdot)$ denotes a nonlinear activation function, and $b$ is the intermediate result between the two operations.
The function, $f(\cdot)$, was chosen to be a delayed step function of the form $f(b) = u(b - \zeta)$ in [2], where $f(b) = 1$ if $b \ge \zeta$ and $f(b) = 0$ if $b < \zeta$. A neuron is on (in the one state) if the stimulus, $b$, is larger than the threshold $\zeta$. Otherwise, it is off (in the zero state). Multiple neurons can be flexibly connected into logic networks as models in theoretical neurophysiology.
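As a minimal sketch in Python, such a thresholded neuron can be written as follows; the affine form $b = a^T x + a_0$ is taken from the surrounding discussion, and all variable names are illustrative:

```python
import numpy as np

def mp_neuron(x, a, a0, zeta):
    """Thresholded neuron: y = u(b - zeta) with b = a^T x + a0.

    The neuron is on (returns 1) if the stimulus b reaches the
    threshold zeta, and off (returns 0) otherwise.
    """
    b = np.dot(a, x) + a0            # intermediate result b
    return 1 if b >= zeta else 0     # delayed step function u(b - zeta)

# Neurons can be wired into small logic networks; with weights (1, 1),
# zero bias, and threshold 2, a single neuron realizes logical AND.
print(mp_neuron(np.array([1.0, 1.0]), np.array([1.0, 1.0]), 0.0, 2.0))  # 1
print(mp_neuron(np.array([1.0, 0.0]), np.array([1.0, 1.0]), 0.0, 2.0))  # 0
```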
For the vision problem, input $x$ denotes an image (or image patch). The neuron should not generate a response for a flat patch, since such a patch does not carry any visual pattern information. Thus, we set $b = 0$ if all of the elements of $x$ are equal to a nonzero constant $\mu$. It is then straightforward to derive $\mu \sum_{n=1}^{N} a_n + a_0 = 0$, or $a_0 = -\mu \sum_{n=1}^{N} a_n$; that is, $a_0$ is a dependent variable. We can form augmented vectors $x' = (\mu, x_1, \ldots, x_N)^T \in \mathbb{R}^{N+1}$ and $a' = (a_0, a_1, \ldots, a_N)^T \in \mathbb{R}^{N+1}$ for $x$ and $a$, respectively. Without loss of generality, we assume $\mu = 0$ in the following discussion. If $\mu \ne 0$, we can either consider the augmented vector space of $x'$ or normalize input $x$ to be a zero-mean vector before the processing and add the mean back after the processing.
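The dependent-bias constraint is easy to verify numerically. The sketch below (illustrative names, random weights) checks that $a_0 = -\mu \sum_n a_n$ yields $b = 0$ on a flat patch and shows the equivalent zero-mean route:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
a = rng.normal(size=N)          # weights a_1, ..., a_N
mu = 0.5                        # nonzero flat-patch value

# Dependent bias a_0 = -mu * sum_n a_n suppresses flat patches.
a0 = -mu * a.sum()
flat_patch = np.full(N, mu)
b = a @ flat_patch + a0
print(np.isclose(b, 0.0))       # True: no response to a flat patch

# Equivalent route: subtract the mean first (so mu = 0 and a_0 = 0),
# process, and add the mean back afterward.
x = rng.normal(size=N)
b_centered = (x - x.mean()) @ a
```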
Multilayer perceptrons
The perceptron was introduced by Rosenblatt in [3]. One can stack multiple perceptrons side by side to form a perceptron layer and cascade multiple perceptron layers into one network. The result is called the MLP or the feedforward neural network.
An exemplary MLP is shown in Figure 1. In general, it consists of a layer of input nodes (the input layer), several layers of intermediate nodes (the hidden layers), and a layer of output nodes (the output layer). These layers are indexed from $l = 0$ to $L$, where the input and output layers are indexed with $0$ and $L$, and the hidden layers are indexed from $l = 1, \ldots, L-1$, respectively. Suppose that there are $N_l$ nodes at the $l$th layer. Each node at the $l$th layer takes all nodes in the $(l-1)$th layer as its input. For this reason, it is called the fully connected layer. Clearly, the MLP is end-to-end fully connected. A modern CNN often contains an MLP as its building module.

FIGURE 1. An exemplary MLP with one input layer, two hidden layers, and one output layer.
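As a sketch of this fully connected structure, a forward pass reduces to a chain of dense matrix-vector products followed by a nonlinearity; the layer sizes and the ReLU activation below are illustrative choices, not prescribed by the text:

```python
import numpy as np

def relu(b):
    return np.maximum(b, 0.0)

def mlp_forward(x, weights, biases):
    """Forward pass through fully connected layers l = 1, ..., L.

    Every node at layer l sees all N_{l-1} nodes of the previous
    layer, so each layer is a dense matrix-vector product.
    """
    h = x
    for W, b in zip(weights, biases):
        h = relu(W @ h + b)
    return h

# Layer sizes N_0 = 4 (input), N_1 = N_2 = 8 (hidden), N_3 = 3 (output).
sizes = [4, 8, 8, 3]
rng = np.random.default_rng(1)
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = mlp_forward(rng.normal(size=4), weights, biases)
print(y.shape)  # (3,)
```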
MLPs were studied intensively in the
1980s and 1990s as decision networks
for pattern recognition applications. The
input and output nodes represent selected
features and classification types, respectively. There are two major advances
from simple neuron-based logic networks
to MLPs. First, there was no training mechanism in the former, since they were not designed for machine learning. The BP technique was introduced in MLPs as a training mechanism for supervised learning. Since differentiation is needed in BP yet the step function is not differentiable, other nonlinear activation functions are adopted in MLPs. Examples include the sigmoid function, the rectified linear unit (ReLU), and the parameterized ReLU (PReLU).
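A sketch of these activations and the derivatives that BP needs is given below; the PReLU slope `alpha` is its learnable parameter, and treating the ReLU kink at $b = 0$ with a zero subgradient is a common convention assumed here:

```python
import numpy as np

def sigmoid(b):
    return 1.0 / (1.0 + np.exp(-b))

def sigmoid_grad(b):
    s = sigmoid(b)
    return s * (1.0 - s)

def relu(b):
    return np.maximum(b, 0.0)

def relu_grad(b):
    return (b > 0).astype(float)     # subgradient 0 taken at b = 0

def prelu(b, alpha=0.25):
    return np.where(b >= 0, b, alpha * b)

def prelu_grad(b, alpha=0.25):
    return np.where(b >= 0, 1.0, alpha)
```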
Second, MLPs have a modularized structure (i.e., perceptron layers) suitable for parallel processing.
As compared with traditional pattern recognition techniques based on simple linear analysis (e.g., linear discriminant analysis and principal component analysis), MLPs provide a more flexible mapping from the feature space to the decision space, where the distribution of feature points of one class can be nonconvex and irregular. This flexibility is built upon a solid theoretical foundation proved by Cybenko [4] and Hornik et al. [5]: a network with only one hidden layer can be a universal approximator if there are "enough" neurons.
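For concreteness, Cybenko's result [4] can be paraphrased as follows, with $\sigma$ a sigmoidal activation (the notation here is ours, not the article's):

```latex
% Universal approximation with one hidden layer (after Cybenko [4]):
% for any continuous f on [0,1]^N and any epsilon > 0 there exist a
% width M and parameters alpha_j, w_j, theta_j such that
G(x) = \sum_{j=1}^{M} \alpha_j \, \sigma\!\left( w_j^{T} x + \theta_j \right),
\qquad
\sup_{x \in [0,1]^{N}} \bigl| G(x) - f(x) \bigr| < \varepsilon .
```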
Convolutional neural networks
Fukushima's neocognitron [6] can be
viewed as an early form of a CNN. The
architecture introduced by LeCun et al. in
[7] serves as the basis of modern CNNs.
The main difference between MLPs and CNNs lies in their input spaces: the inputs of the former are features, while those of the latter are source data such as images, video, and speech. This is not a trivial difference. Let us
use the LeNet-5 shown in Figure 2 as an
example, whose input is an image of size $32 \times 32$. Each pixel is an input node. It would be very challenging for an MLP to handle this input, since the dimension of the input vector is $32 \times 32 = 1{,}024$.
The diversity of possible visual patterns is
huge. As explained later, the nodes in the first hidden layer should provide a good representation for the input signal. This implies a large number of nodes in the hidden layers. The number of links (or filter weights) between the input layer and the first hidden layer is $N_0 \times N_1$ due to full connection. This number can easily reach the order of millions. If the image dimension is itself on the order of millions, as with images captured by today's smartphones, the solution is clearly unrealistic.
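The arithmetic behind this claim is easy to reproduce; in the sketch below, the hidden-layer width of 1,024 is an illustrative assumption rather than an actual network design:

```python
# Fully connected links between the input and first hidden layers
# for a 32 x 32 image; the hidden width N1 = 1,024 is an assumption.
N0 = 32 * 32                 # 1,024 input nodes, one per pixel
N1 = 1024                    # assumed first-hidden-layer width
print(N0 * N1)               # 1,048,576 weights: about a million

# A 12-megapixel smartphone image under the same full connection:
N0_phone = 4000 * 3000
print(N0_phone * N1)         # ~1.2e10 weights for one layer alone
```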
Instead of considering interactions of
all pixels in one step as done in the MLP,
the CNN decomposes an input image
into smaller patches, known as receptive fields, for nodes at certain layers. It
gradually enlarges the receptive field to
cover a larger portion of the image. For
example, the filter size of the first two
convolutional layers of LeNet-5 is 5 × 5.
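A sketch of how stacked 5 × 5 filters grow the receptive field, using plain valid-mode convolution (LeNet-5's subsampling layers are omitted here for brevity, and the filter values are random placeholders):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(2)
image = rng.normal(size=(32, 32))        # LeNet-5-sized input
k1 = rng.normal(size=(5, 5))             # first 5 x 5 filter
k2 = rng.normal(size=(5, 5))             # second 5 x 5 filter

h1 = convolve2d(image, k1, mode="valid") # 28 x 28; each unit sees 5 x 5 pixels
h2 = convolve2d(h1, k2, mode="valid")    # 24 x 24; each unit now sees 9 x 9
print(h1.shape, h2.shape)                # (28, 28) (24, 24)
```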
The first convolutional layer considers