IEEE Signal Processing - May 2018 - 106

1) CSC model architecture: What type, size, and composition of NNs are suitable for CSC?
More than 77% of all coding distortion of NN codec comes
from the parametric vocoding used for speech resynthesis.
This challenge can, therefore, be inspired by recent advancements in end-to-end speech synthesis research. Modeling of
the long-term context in CSC can be investigated by adapting spiking or recurrent NNs.
2) CSC speech parameters: What is an efficient speech representation for CSC, and how is its sparse coding designed?
CSC speech parameters should be based on cortical representation introduced in the "Human Cognitive Speech
Processing" section. It includes sparse short-term (physiological) and long-term (linguistic) representations. The parameters
have to be efficiently estimated by the CSC model.
3)
CSC adaptability: How is computational sensory feedCurrent challenges in cognitive speech coding
back realized?
The following outlines current challenges in cognitive
The goal is to implement an adaptation of the cortical
speech coding.
speech representation based on sensory feedback. In general, it consists
of articulatory gestural feedback
Short-Term Analysis
Training
Classification
and auditory feedback. The latter is
broadly related to auditory targets
Z
used in the nonintrusive estimation
MFCC
of speech quality, such as the ITU-T
recommendation P.563 that defines
a perceptual model of speech. CSC
Backpropagation
Forwardintroduces a novel feedback for
Propagation
the speech code constructed from
Force-Alignment Labels
the auditory and articulatory targets,
(a)
currently not used by waveform/
LPC coding.
Training
Long-Term (Pitch) Analysis
Syllabification
An example of adaptive speech
E
activity
detection has been demonE
Energy
Pitch
strated
[44].
The system employs a
+
Parameters
2-D
Gabor
filter
bank whose paramMFCCs
I
I
eters are retuned offline to improve
the separability between the feature
Spike
InhibitoryTrains Update
representation of speech and nonSpike Bursts
Syllable Boundaries
speech sounds, and it attempts to
(b)
minimize the misclassification risk
of mismatched data, with respect to
Short-Term Analysis
Training
Regression
the original statistical models.
4) CSC robustness: How is attention
...
focused on CSC to increase intelligiSynthesis
bility or to decrease cognitive load?
CSC benefits from the biological cogAcoustic Targets
nitive aspects of speech communiBackpropagation
Forwardcation by trying to design speech
Propagation
Acoustic
codecs that are aware of the hidden
Parameters
structure of the transmitted speech sig(c)
nal (speech code), while the transmitted code remains linguistically
Figure 8. (a) The training and inference stages for an analysis DNN, (b) a spiking NN, and (c) a synrelevant. Linguistically relevant transthesis DNN. Training of the analysis DNN and the spiking NN requires phonetic and syllabic boundary
mission codes bring novel functionallabels, respectively, whereas training of the synthesis DNN does not require force-aligned labels. Pitch
ity to speech transmission systems,
is the fundamental frequency of the sound signal [90].
106

Anterior

Nasal

Decoding is realized as a synthesis DNN that learns the
highly complex mapping of the transmitted sequence, Z, to
the speech parameters, and consists of two computational
steps. The first step is a DNN forward pass that generates
the speech parameters, and the second step is the generation of the speech samples from the speech parameters
including a decoded pitch signal. The pitch signal is coded
with a codebook that contains the logarithm of the continuous pitch of all of the syllables of the training data, parametrized with the discrete Legendre orthogonal polynomials.
Intelligibility evaluation of this NN codec demonstrates
about 10% degradation, when comparing with LPC (Speex)
coding; however, it operates at a bit rate approximately six
times lower.

IEEE Signal Processing Magazine

|

May 2018

|



Table of Contents for the Digital Edition of IEEE Signal Processing - May 2018

Contents
IEEE Signal Processing - May 2018 - Cover1
IEEE Signal Processing - May 2018 - Cover2
IEEE Signal Processing - May 2018 - Contents
IEEE Signal Processing - May 2018 - 2
IEEE Signal Processing - May 2018 - 3
IEEE Signal Processing - May 2018 - 4
IEEE Signal Processing - May 2018 - 5
IEEE Signal Processing - May 2018 - 6
IEEE Signal Processing - May 2018 - 7
IEEE Signal Processing - May 2018 - 8
IEEE Signal Processing - May 2018 - 9
IEEE Signal Processing - May 2018 - 10
IEEE Signal Processing - May 2018 - 11
IEEE Signal Processing - May 2018 - 12
IEEE Signal Processing - May 2018 - 13
IEEE Signal Processing - May 2018 - 14
IEEE Signal Processing - May 2018 - 15
IEEE Signal Processing - May 2018 - 16
IEEE Signal Processing - May 2018 - 17
IEEE Signal Processing - May 2018 - 18
IEEE Signal Processing - May 2018 - 19
IEEE Signal Processing - May 2018 - 20
IEEE Signal Processing - May 2018 - 21
IEEE Signal Processing - May 2018 - 22
IEEE Signal Processing - May 2018 - 23
IEEE Signal Processing - May 2018 - 24
IEEE Signal Processing - May 2018 - 25
IEEE Signal Processing - May 2018 - 26
IEEE Signal Processing - May 2018 - 27
IEEE Signal Processing - May 2018 - 28
IEEE Signal Processing - May 2018 - 29
IEEE Signal Processing - May 2018 - 30
IEEE Signal Processing - May 2018 - 31
IEEE Signal Processing - May 2018 - 32
IEEE Signal Processing - May 2018 - 33
IEEE Signal Processing - May 2018 - 34
IEEE Signal Processing - May 2018 - 35
IEEE Signal Processing - May 2018 - 36
IEEE Signal Processing - May 2018 - 37
IEEE Signal Processing - May 2018 - 38
IEEE Signal Processing - May 2018 - 39
IEEE Signal Processing - May 2018 - 40
IEEE Signal Processing - May 2018 - 41
IEEE Signal Processing - May 2018 - 42
IEEE Signal Processing - May 2018 - 43
IEEE Signal Processing - May 2018 - 44
IEEE Signal Processing - May 2018 - 45
IEEE Signal Processing - May 2018 - 46
IEEE Signal Processing - May 2018 - 47
IEEE Signal Processing - May 2018 - 48
IEEE Signal Processing - May 2018 - 49
IEEE Signal Processing - May 2018 - 50
IEEE Signal Processing - May 2018 - 51
IEEE Signal Processing - May 2018 - 52
IEEE Signal Processing - May 2018 - 53
IEEE Signal Processing - May 2018 - 54
IEEE Signal Processing - May 2018 - 55
IEEE Signal Processing - May 2018 - 56
IEEE Signal Processing - May 2018 - 57
IEEE Signal Processing - May 2018 - 58
IEEE Signal Processing - May 2018 - 59
IEEE Signal Processing - May 2018 - 60
IEEE Signal Processing - May 2018 - 61
IEEE Signal Processing - May 2018 - 62
IEEE Signal Processing - May 2018 - 63
IEEE Signal Processing - May 2018 - 64
IEEE Signal Processing - May 2018 - 65
IEEE Signal Processing - May 2018 - 66
IEEE Signal Processing - May 2018 - 67
IEEE Signal Processing - May 2018 - 68
IEEE Signal Processing - May 2018 - 69
IEEE Signal Processing - May 2018 - 70
IEEE Signal Processing - May 2018 - 71
IEEE Signal Processing - May 2018 - 72
IEEE Signal Processing - May 2018 - 73
IEEE Signal Processing - May 2018 - 74
IEEE Signal Processing - May 2018 - 75
IEEE Signal Processing - May 2018 - 76
IEEE Signal Processing - May 2018 - 77
IEEE Signal Processing - May 2018 - 78
IEEE Signal Processing - May 2018 - 79
IEEE Signal Processing - May 2018 - 80
IEEE Signal Processing - May 2018 - 81
IEEE Signal Processing - May 2018 - 82
IEEE Signal Processing - May 2018 - 83
IEEE Signal Processing - May 2018 - 84
IEEE Signal Processing - May 2018 - 85
IEEE Signal Processing - May 2018 - 86
IEEE Signal Processing - May 2018 - 87
IEEE Signal Processing - May 2018 - 88
IEEE Signal Processing - May 2018 - 89
IEEE Signal Processing - May 2018 - 90
IEEE Signal Processing - May 2018 - 91
IEEE Signal Processing - May 2018 - 92
IEEE Signal Processing - May 2018 - 93
IEEE Signal Processing - May 2018 - 94
IEEE Signal Processing - May 2018 - 95
IEEE Signal Processing - May 2018 - 96
IEEE Signal Processing - May 2018 - 97
IEEE Signal Processing - May 2018 - 98
IEEE Signal Processing - May 2018 - 99
IEEE Signal Processing - May 2018 - 100
IEEE Signal Processing - May 2018 - 101
IEEE Signal Processing - May 2018 - 102
IEEE Signal Processing - May 2018 - 103
IEEE Signal Processing - May 2018 - 104
IEEE Signal Processing - May 2018 - 105
IEEE Signal Processing - May 2018 - 106
IEEE Signal Processing - May 2018 - 107
IEEE Signal Processing - May 2018 - 108
IEEE Signal Processing - May 2018 - 109
IEEE Signal Processing - May 2018 - 110
IEEE Signal Processing - May 2018 - 111
IEEE Signal Processing - May 2018 - 112
IEEE Signal Processing - May 2018 - 113
IEEE Signal Processing - May 2018 - 114
IEEE Signal Processing - May 2018 - 115
IEEE Signal Processing - May 2018 - 116
IEEE Signal Processing - May 2018 - 117
IEEE Signal Processing - May 2018 - 118
IEEE Signal Processing - May 2018 - 119
IEEE Signal Processing - May 2018 - 120
IEEE Signal Processing - May 2018 - 121
IEEE Signal Processing - May 2018 - 122
IEEE Signal Processing - May 2018 - 123
IEEE Signal Processing - May 2018 - 124
IEEE Signal Processing - May 2018 - 125
IEEE Signal Processing - May 2018 - 126
IEEE Signal Processing - May 2018 - 127
IEEE Signal Processing - May 2018 - 128
IEEE Signal Processing - May 2018 - 129
IEEE Signal Processing - May 2018 - 130
IEEE Signal Processing - May 2018 - 131
IEEE Signal Processing - May 2018 - 132
IEEE Signal Processing - May 2018 - Cover3
IEEE Signal Processing - May 2018 - Cover4
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201809
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201807
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201805
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201803
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201801
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0917
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0717
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0517
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0317
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0916
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0716
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0516
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0316
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0915
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0715
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0515
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0315
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0914
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0714
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0514
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0314
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0913
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0713
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0513
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0313
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0912
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0712
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0512
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0312
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0911
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0711
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0511
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0311
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0910
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0710
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0510
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0310
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0909
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0709
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0509
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0309
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1108
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0908
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0708
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0508
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0308
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0108
https://www.nxtbookmedia.com