Signal Processing - May 2017 - 86
FIGURE 5. The visualization of the anchor-position vector a_n [1].
thoroughly studied in [1], and the main result is summarized below. A representative 2-D input and its corresponding anchor vectors are shown in Figure 5. Let a_n be a K-dimensional vector formed by the same position (or element) of all anchor vectors a_k. It is called the anchor-position vector since it captures the position information of the anchor vectors. Although the anchor vectors a_k capture global representative patterns of x, they are weak in capturing position-sensitive information. This shortcoming can be compensated for by modulating outputs with elements of the anchor-position vector a_n in the next layer.
Let us use layers S4, C5, and F6 in
LeNet-5 as an example. There are 120
anchor vectors of dimension 400 from
S4 to C5. We collect 400 anchor-position
vectors of dimension 120, multiply the
output at C5 by them to form a set of
modulated outputs, and then compute 84
anchor vectors of dimension 120 from C5
to F6. Note that the output at C5 contains primarily the spectral information but not the position information. If a position in the input vectors carries less consistent information, the variance of its associated anchor-position vector will be larger and the modulated output will be more random. As a result, its impact on the formation of the 84 anchor vectors is reduced. For more details, we refer to the discussion in [1].
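The dimension bookkeeping in the LeNet-5 example above can be sketched in NumPy. This is a hypothetical illustration: the anchor-vector values below are random placeholders rather than trained weights, and only the shapes follow the text.

```python
import numpy as np

# Dimensions taken from the LeNet-5 example in the text (S4 -> C5).
N, K = 400, 120                    # input dimension, number of C5 anchor vectors
rng = np.random.default_rng(0)

A = rng.standard_normal((K, N))    # K anchor vectors a_k of dimension N
x = rng.standard_normal(N)         # a sample input vector at S4

y = np.maximum(A @ x, 0.0)         # rectified output at C5, dimension K

# The N anchor-position vectors a_n are the columns of A: each collects
# the n-th element of every anchor vector a_k.
anchor_position = A.T              # shape (N, K); row n is a_n

# Modulated outputs: elementwise product of the C5 output with each a_n.
modulated = anchor_position * y    # shape (N, K); row n is a_n * y (elementwise)
```

This yields 400 modulated outputs of dimension 120, from which the 84 anchor vectors from C5 to F6 would then be computed.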
New clustering representation
In traditional clustering schemes, there is a one-to-one association between a data sample and its cluster. However, this is not the case in the RECOS transform. MLPs and CNNs adopt a new clustering representation: for an input vector x, the RECOS transform generates a set of K nonnegative correlation values as the output vector of dimension K.
K. This representation enables repetitive clustering layer by layer as given in
(4). For an input, one can determine the
significance of clusters according to the
magnitude of the rectified output value. If
its magnitude for a cluster is zero, x is not
associated with that cluster. A cluster is
called a relevant or irrelevant one depending on whether it has an association with
x. Among all relevant ones, we call cluster i the primary cluster for input x if

i = arg max_k a_k^T x.
The remaining relevant ones are auxiliary clusters.
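As a concrete sketch, the relevant/primary/auxiliary split can be computed directly from the anchor-vector matrix. The helper name `cluster_roles` is our own, not from [1]; it assumes the rectification is a plain ReLU.

```python
import numpy as np

def cluster_roles(A, x):
    # A: (K, N) matrix whose rows are the anchor vectors a_k; x: input of dim N.
    corr = A @ x                               # a_k^T x for every cluster
    rectified = np.maximum(corr, 0.0)          # K nonnegative output values
    relevant = np.flatnonzero(rectified > 0)   # clusters associated with x
    primary = int(np.argmax(corr))             # i = arg max_k a_k^T x
    auxiliary = relevant[relevant != primary]  # the remaining relevant clusters
    return rectified, primary, auxiliary
```

For example, with anchor vectors (1, 0), (0, 1), and (-1, 0) and input x = (2, 1), the first cluster is primary, the second is auxiliary, and the third is irrelevant.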
The FE subnet uses anchor vectors to
capture local, midrange, and long-range
spatial patterns. It is difficult to predict
the clustering structure since new information is introduced at a new layer. The
DM subnet attempts to reduce the dimension of intermediate representations until
it reaches the dimension of the decision
space. We observe that the clustering structure becomes more obvious as the layers of the DM subnet go deeper. That is, the output value from the primary cluster gets closer to unity, the number of auxiliary clusters decreases, and their output values become smaller. When this happens, an anchor vector provides a good approximation to the centroid of the corresponding cluster.
The choice of the number of anchor vectors, K_l, at the lth layer is an important problem in network design. If the input data x_(l-1) has a clear clustering structure (say, with h clusters), we can set K_l = h. However, this is often not the case. If K_l is set too small, we cannot capture the clustering structure of x_(l-1) well, and more layers will be needed to split the clusters. If K_l is set too large, there are more anchor vectors than needed, and a stronger overlap between rectified output vectors will be observed. As a result, we still need more layers to separate them.
Another way to control the clustering process is the choice of the threshold value, z, of the TReLU. A higher threshold value can reduce the negative impact of a larger K_l value. The tradeoff between z and K_l is an interesting future research topic.
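A minimal sketch of the TReLU nonlinearity mentioned above, assuming it simply zeroes correlations at or below the threshold z (the exact form used in [1] may differ):

```python
import numpy as np

def trelu(v, z=0.0):
    # Truncated ReLU: pass a value only when it exceeds the threshold z.
    # With z = 0 this reduces to the ordinary ReLU.
    return np.where(v > z, v, 0.0)
```

Raising z suppresses weakly correlated clusters, which is why a higher threshold can offset an overly large K_l.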
Network initialization and guided
anchor vector update
Data clustering plays a critical role in the
understanding of the underlying structure
of data. The k-means algorithm, probably the best-known clustering method, has been widely used in pattern
recognition and supervised/unsupervised
learning. As discussed previously, each
CNN layer conducts data clustering on
the surface of a high-dimensional sphere
based on a rectified geodesic distance.
Here, we would like to understand the
effect of multiple layers in cascade from
the input data source to the output decision label. For unsupervised learning such
as image segmentation, several challenges
exist in data clustering [8]. Questions such
as "What is a cluster?" "How many clusters are present in the data?" and "Are the
discovered clusters and partition valid?"
remain open. These questions expose the limits of unsupervised data clustering methods.
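The per-layer clustering described above can be mimicked with a small k-means variant on the unit sphere. This is a sketch only: cosine similarity stands in for the rectified geodesic distance of [1], and the initialization is naive.

```python
import numpy as np

def spherical_kmeans(X, k, iters=20, seed=0):
    # k-means on the unit sphere: assign by largest cosine similarity,
    # then renormalize each updated centroid back onto the sphere.
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # project data to sphere
    C = X[rng.choice(len(X), k, replace=False)]       # naive centroid init
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmax(X @ C.T, axis=1)           # nearest by cosine sim.
        for j in range(k):
            members = X[labels == j]
            if len(members):                          # keep centroid if empty
                c = members.sum(axis=0)
                C[j] = c / np.linalg.norm(c)          # renormalize centroid
    return C, labels
```

On two well-separated direction bundles this recovers the two clusters; in a CNN layer, the anchor vectors play the role of the (normalized) centroids.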
In the context of supervised learning,
traditional feature-based methods extract
features from data, conduct clustering
in the feature space, and, finally, build a
connection between clusters and decision
labels. Although it is relatively easy to
build a connection between the data and
labels through features, it is challenging to
find effective features. In this setting, the
dimension of the feature space is usually
significantly smaller than that of the data
space. As a consequence, it is unavoidable to sacrifice the rich diversity of the input data.
Furthermore, the feature selection process is guided by humans based on their domain knowledge (i.e., the most discriminant properties of different objects). This process is heuristic and can easily overfit. Human effort is needed in both data labeling and feature design.
CNNs offer an effective supervised
learning solution, where supervision is
conducted by a training process using
data labels. This supervision closes the
semantic gap between low-level representations (e.g., the pixel representation)
and high-level semantics. Furthermore,
the CNN self-organization capability was well discussed in the 1980s and
1990s, e.g., [6]. By self-organization, the
network can learn with little supervision.
To put the above two together, we expect