Signal Processing - July 2017 - 124

Sparse coding, also termed dictionary learning, attempts to
by factors such as different speaker characteristics, noisy envifind succinct representations (i.e., atoms or elements of the dicronments, and poor recording channels. For example, Deng et
tionary) of the input data such that the input data can be repreal. [122] treated different corpora as different tasks for emosented as a linear combination of these sparse representations
tion recognition; Giris et al. [123] regarded noise type as an
[127]. Compared to the aforementioned feature transformaauxiliary task for speech recognition; and Seltzer and Droppo
tion methods, sparse coding has been demonstrated to be able
[128] treated phone label, phone text, and state context as difto produce a more robust signal representation in speech reconferent tasks when performing phoneme recognition. Recently,
struction and denoising tasks [118].
a universum autoencoder was proposed [129]. This technique
Conventional feature transformation approaches are typiuses a small amount of labeled data from the target domain and
cally executed at a shallow level. Recently, deep-learning
unlabeled data from a source domain to jointly minimize the
approaches for feature-based TL have begun
reconstruction error and the universum leanto attract a lot of of research attention.
ing loss. Motivated by these achievements
Researchers have started
Deep learning is regarded as a natural TL
of learning representations among multiple
to investigate the learning related tasks, researchers have started to
paradigm; it provides a powerful capability
of learning high-level abstracts or repreinvestigate the learning of robust represenof robust representations
sentations that are more robust against the
tations over multiple modalities (e.g., audio
over multiple modalities
variation of conventional speech features
and video) [130]. This topic, however, is
(e.g., audio and video).
(i.e., log Mel-filter banks and MFCCs) over
beyond the scope of this overview.
different domains [50] (see the "Unsupervised Representation Learning" section). These represenModel-based transfer learning
tative features can then be used as normal features to train
Model-based TL, also known as parameter-based TL, aims to
conventional discriminative or generative models, such as
learn a new model from an existing model that has been well
NNs, HMMs, and SVMs. Thanks to the invariant property
trained on rich source data. Unlike feature-based TL
of these representations, they can potentially deliver remarkapproaches, which usually transform the feature spaces,
able performance improvements for almost all ASA tasks [7],
model-based TL approaches modify the pretrained model
[50], [51], [58].
parameters (i) to account for the differences that may exist
In addition to the basic representation learning approaches
between the domains. This can be formulated as
mentioned previously, more advanced topologies have begun
P (X S, YS; i S) " P (X T , YT ; i T )
to emerge, which explicitly involve several related tasks in a
(17)
multitask learning paradigm. Multitask learning is the process
of learning multiple tasks at the same time to learn a shared
for a generative model or
representation among different tasks. Mathematically, when
P ^YS X S; i S h " P ^YT X T ; i T h
training the model with multiple tasks, we aim to minimize
(18)
the objective function as follows:
for a discriminative model.
K
Early-stage model-based TL approaches in the speech comm
2
i ,
(16)
J (i 0) = / / L (x ki, y ki; i k) +
2 0
munity included maximum a posteriori (MAP) estimation and
k=1 i
maximum likelihood linear regression (MLLR), which are
where K is the number of tasks, L (ยท) denotes the loss funcdesigned for generative models (e.g., GMM-HMM). These
tion, and i 0 stands for the general model parameters.
techniques have been applied successively to speaker adapWhen performing deep multitask learning for multilintation [131], where the speech from each specific speaker is
gual or cross lingual speech recognition, it is typical to share
supposed to be in a different domain with the initial training
the hidden layers across all languages [119], [120]. If learned
data. They have also been shown to be useful in computational
appropriately, the hidden layers serve as increasingly comparalinguistics tasks, such as depression detection [132].
plex feature transformations, sharing common hidden facSpecifically, MAP uses the speaker-independent models
tors across the acoustic data from different languages. The
(i.e., universal background models) as a prior probability
final softmax layers, however, are not shared. Instead, each
distribution over the model parameters, and then performs
language has its own softmax layer to estimate the postemaximum likelihood estimates by considering the model
rior probabilities specific to that language, using the most
parameters obtained on the speaker-dependent data. Alterabstract representation from the topmost hidden layer. The
natively, MLLR calculates a set of linear regression transstrong result gained using this topology [119], [120] indicates
formations to shift both the means and the covariances in
its potential; it opens up the possibility for quickly building
an initial Gaussian mixture HMM system so that each state
a high-performance recognition system for a new language
in the system is more likely to have generated the speaker
using an existing multilingual DNN.
data the model is being adapted to [131]. Compared with
Many other deep multitask learning derivatives have been
MAP, MLLR requires fewer adaptive data. Aside from
investigated to overcome the feature variation problems caused
speaker adaptation, these methods have been applied to
124

IEEE SIGNAL PROCESSING MAGAZINE

|

July 2017

|



Table of Contents for the Digital Edition of Signal Processing - July 2017

Signal Processing - July 2017 - Cover1
Signal Processing - July 2017 - Cover2
Signal Processing - July 2017 - 1
Signal Processing - July 2017 - 2
Signal Processing - July 2017 - 3
Signal Processing - July 2017 - 4
Signal Processing - July 2017 - 5
Signal Processing - July 2017 - 6
Signal Processing - July 2017 - 7
Signal Processing - July 2017 - 8
Signal Processing - July 2017 - 9
Signal Processing - July 2017 - 10
Signal Processing - July 2017 - 11
Signal Processing - July 2017 - 12
Signal Processing - July 2017 - 13
Signal Processing - July 2017 - 14
Signal Processing - July 2017 - 15
Signal Processing - July 2017 - 16
Signal Processing - July 2017 - 17
Signal Processing - July 2017 - 18
Signal Processing - July 2017 - 19
Signal Processing - July 2017 - 20
Signal Processing - July 2017 - 21
Signal Processing - July 2017 - 22
Signal Processing - July 2017 - 23
Signal Processing - July 2017 - 24
Signal Processing - July 2017 - 25
Signal Processing - July 2017 - 26
Signal Processing - July 2017 - 27
Signal Processing - July 2017 - 28
Signal Processing - July 2017 - 29
Signal Processing - July 2017 - 30
Signal Processing - July 2017 - 31
Signal Processing - July 2017 - 32
Signal Processing - July 2017 - 33
Signal Processing - July 2017 - 34
Signal Processing - July 2017 - 35
Signal Processing - July 2017 - 36
Signal Processing - July 2017 - 37
Signal Processing - July 2017 - 38
Signal Processing - July 2017 - 39
Signal Processing - July 2017 - 40
Signal Processing - July 2017 - 41
Signal Processing - July 2017 - 42
Signal Processing - July 2017 - 43
Signal Processing - July 2017 - 44
Signal Processing - July 2017 - 45
Signal Processing - July 2017 - 46
Signal Processing - July 2017 - 47
Signal Processing - July 2017 - 48
Signal Processing - July 2017 - 49
Signal Processing - July 2017 - 50
Signal Processing - July 2017 - 51
Signal Processing - July 2017 - 52
Signal Processing - July 2017 - 53
Signal Processing - July 2017 - 54
Signal Processing - July 2017 - 55
Signal Processing - July 2017 - 56
Signal Processing - July 2017 - 57
Signal Processing - July 2017 - 58
Signal Processing - July 2017 - 59
Signal Processing - July 2017 - 60
Signal Processing - July 2017 - 61
Signal Processing - July 2017 - 62
Signal Processing - July 2017 - 63
Signal Processing - July 2017 - 64
Signal Processing - July 2017 - 65
Signal Processing - July 2017 - 66
Signal Processing - July 2017 - 67
Signal Processing - July 2017 - 68
Signal Processing - July 2017 - 69
Signal Processing - July 2017 - 70
Signal Processing - July 2017 - 71
Signal Processing - July 2017 - 72
Signal Processing - July 2017 - 73
Signal Processing - July 2017 - 74
Signal Processing - July 2017 - 75
Signal Processing - July 2017 - 76
Signal Processing - July 2017 - 77
Signal Processing - July 2017 - 78
Signal Processing - July 2017 - 79
Signal Processing - July 2017 - 80
Signal Processing - July 2017 - 81
Signal Processing - July 2017 - 82
Signal Processing - July 2017 - 83
Signal Processing - July 2017 - 84
Signal Processing - July 2017 - 85
Signal Processing - July 2017 - 86
Signal Processing - July 2017 - 87
Signal Processing - July 2017 - 88
Signal Processing - July 2017 - 89
Signal Processing - July 2017 - 90
Signal Processing - July 2017 - 91
Signal Processing - July 2017 - 92
Signal Processing - July 2017 - 93
Signal Processing - July 2017 - 94
Signal Processing - July 2017 - 95
Signal Processing - July 2017 - 96
Signal Processing - July 2017 - 97
Signal Processing - July 2017 - 98
Signal Processing - July 2017 - 99
Signal Processing - July 2017 - 100
Signal Processing - July 2017 - 101
Signal Processing - July 2017 - 102
Signal Processing - July 2017 - 103
Signal Processing - July 2017 - 104
Signal Processing - July 2017 - 105
Signal Processing - July 2017 - 106
Signal Processing - July 2017 - 107
Signal Processing - July 2017 - 108
Signal Processing - July 2017 - 109
Signal Processing - July 2017 - 110
Signal Processing - July 2017 - 111
Signal Processing - July 2017 - 112
Signal Processing - July 2017 - 113
Signal Processing - July 2017 - 114
Signal Processing - July 2017 - 115
Signal Processing - July 2017 - 116
Signal Processing - July 2017 - 117
Signal Processing - July 2017 - 118
Signal Processing - July 2017 - 119
Signal Processing - July 2017 - 120
Signal Processing - July 2017 - 121
Signal Processing - July 2017 - 122
Signal Processing - July 2017 - 123
Signal Processing - July 2017 - 124
Signal Processing - July 2017 - 125
Signal Processing - July 2017 - 126
Signal Processing - July 2017 - 127
Signal Processing - July 2017 - 128
Signal Processing - July 2017 - 129
Signal Processing - July 2017 - 130
Signal Processing - July 2017 - 131
Signal Processing - July 2017 - 132
Signal Processing - July 2017 - 133
Signal Processing - July 2017 - 134
Signal Processing - July 2017 - 135
Signal Processing - July 2017 - 136
Signal Processing - July 2017 - 137
Signal Processing - July 2017 - 138
Signal Processing - July 2017 - 139
Signal Processing - July 2017 - 140
Signal Processing - July 2017 - 141
Signal Processing - July 2017 - 142
Signal Processing - July 2017 - 143
Signal Processing - July 2017 - 144
Signal Processing - July 2017 - 145
Signal Processing - July 2017 - 146
Signal Processing - July 2017 - 147
Signal Processing - July 2017 - 148
Signal Processing - July 2017 - 149
Signal Processing - July 2017 - 150
Signal Processing - July 2017 - 151
Signal Processing - July 2017 - 152
Signal Processing - July 2017 - 153
Signal Processing - July 2017 - 154
Signal Processing - July 2017 - 155
Signal Processing - July 2017 - 156
Signal Processing - July 2017 - 157
Signal Processing - July 2017 - 158
Signal Processing - July 2017 - 159
Signal Processing - July 2017 - 160
Signal Processing - July 2017 - 161
Signal Processing - July 2017 - 162
Signal Processing - July 2017 - 163
Signal Processing - July 2017 - 164
Signal Processing - July 2017 - 165
Signal Processing - July 2017 - 166
Signal Processing - July 2017 - 167
Signal Processing - July 2017 - 168
Signal Processing - July 2017 - 169
Signal Processing - July 2017 - 170
Signal Processing - July 2017 - 171
Signal Processing - July 2017 - 172
Signal Processing - July 2017 - 173
Signal Processing - July 2017 - 174
Signal Processing - July 2017 - 175
Signal Processing - July 2017 - 176
Signal Processing - July 2017 - 177
Signal Processing - July 2017 - 178
Signal Processing - July 2017 - 179
Signal Processing - July 2017 - 180
Signal Processing - July 2017 - 181
Signal Processing - July 2017 - 182
Signal Processing - July 2017 - 183
Signal Processing - July 2017 - 184
Signal Processing - July 2017 - 185
Signal Processing - July 2017 - 186
Signal Processing - July 2017 - 187
Signal Processing - July 2017 - 188
Signal Processing - July 2017 - 189
Signal Processing - July 2017 - 190
Signal Processing - July 2017 - 191
Signal Processing - July 2017 - 192
Signal Processing - July 2017 - 193
Signal Processing - July 2017 - 194
Signal Processing - July 2017 - 195
Signal Processing - July 2017 - 196
Signal Processing - July 2017 - Cover3
Signal Processing - July 2017 - Cover4
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201809
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201807
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201805
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201803
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201801
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0917
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0717
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0517
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0317
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0916
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0716
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0516
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0316
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0915
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0715
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0515
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0315
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0914
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0714
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0514
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0314
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0913
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0713
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0513
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0313
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0912
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0712
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0512
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0312
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0911
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0711
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0511
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0311
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0910
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0710
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0510
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0310
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0909
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0709
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0509
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0309
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1108
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0908
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0708
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0508
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0308
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0108
https://www.nxtbookmedia.com