IEEE Signal Processing - May 2018 - 107

performing tasks such as automatic dialect correction and
pronunciation improvements for people with phonological and
articulatory disorders that might eventually lead to intelligibility
enhancements.
Error minimization can perform error correction as
defined by the DIVA and HSFC models. Consider the task
of dialect correction, for example. Error minimization might
be implemented as an automatic accentedness evaluation of
nonnative speech [91], [92] in a closed-loop fashion.
Cognitive load, related to the cost of cognitive processing resources and listening effort, is currently studied in the
field of cognitive hearing science and by manufacturers of
hearing aid devices. CSC can be inspired by this ongoing
research to apply their results into intelligibility enhancements of transmitted speech signals.
5) CSC language independence: How can NN-based CSC
language be made independent?
Language-independent speech and language technology
belong to the main research topics of the speech signal
processing community. One approach is in modeling all of
the International Phonetic Alphabet (IPA) symbols. There
are more than 100 IPA symbols that can be further extended with more narrow phonetic and prosody annotations.
On the contrary, all of the speech sounds share some attributes, known in linguistics as phonological features or
phone attributes in the speech community. In phonology,
one example could be the sound pattern of the English set
consisting of 13 features [93]. In the speech community,
one major effort resulted in the Automatic Speech
Attribute Transcription project [94], which proposes a
framework for speech understanding based on processing
of cognitive hypotheses created from the acoustic and
auditory cues of the phone attributes. Recent phonetic
DNN training analysis confirmed that the hidden layers
learn an effective representational basis-the phone attributes-for the formation of invariant phonemic categories
[95]. Exploring this phonetic invariance across the languages is, therefore, a promising direction for the multilingual
CSC technologies.
6) CSC speaker recognizability: How can NN-based CSC
speaker be made independent?
Another very relevant problem is speaker recognizability.
Early work on speaker-adaptive phonetic vocoding resulted
in variable bit-rate coding [96]. Promising results have
shown recent advancements in NN-based generative models of raw audio, such as WaveNet from Google Research
[62] and Deep Voice 2 from Baidu Research [97], capable
of learning and imitating hundreds of voices within a single generative model.
7) CSC complexity: How can the computational complexity
of CSC be minimized?
Complexity of compressive sampling is about O (K log (N/K))
for K-sparse (K % N ) N-dimensional speech representation.
Computational complexity of a DNN is about N w, where
w is the number of weights of the DNN. The previously described compositional NN-based codec consisted of

12 analysis DNN, each trained with 2.46 million parameters, and one synthesis DNN trained with 3.31 million
parameters. Decoding with generative models of raw
audio with based on the advanced NN architectures, composed of deep convolutional and recurrent NNs, might have
significantly higher computational demand, not feasible for
current devices.

Summary and conclusions
This article described the impact of cognitive speech processing on speech compression. The basic concepts of
human cognitive speech processing and how CSC has
impacted current audio and speech coding systems have also
been discussed. Properties of CSC were then presented and
compared to LPC.
CSC has great potential to further impact current speech
coding standards. Deep learning and neural basis of speech
and language processing have tremendously advanced recently, and many of these findings can be applied to the field of
speech coding. This article concludes by outlining current
challenges in CSC, relying on incorporating phonological posteriors and temporal models, and employing speech
production models and cognitive feedback for enhanced
speech compression.

Authors
Milos Cernak (milos.cernak@ieee.org) received his B.S.,
M.S., and Ph.D. degrees in electrical engineering from the
Slovak University of Technology, Bratislava, in 1999, 2001,
and 2005, respectively. He is a lead speech-processing engineer
at Logitech, CTO office, in EPFL Innovation Park, Lausanne,
Switzerland. He received awards for his work in 2010 from the
Slovak Academy of Sciences for Best Research Team of the
Year, and in 2012 from the Slovak Ministry of Education for
research and development. His research interests include
speech signal processing, machine learning for phonetics and
phonology, neural basis of speech production and perception,
and pathological speech processing. He is a Senior Member of
the IEEE.
Afsaneh Asaei (asaeiaf@gmail.com) received her B.Sc. and
M.Sc. degrees in electrical and computer engineering from
Amirkabir (Tehran Polytechnique) and Sharif University of
Technology, Tehran, Iran, in 2002 and 2006, respectively, and
her Ph.D. degree in artificial intelligence from the École
Polytechnique Fédérale de Lausanne, Switzerland, in 2013.
Since 2002, she has been a research engineer on several projects
involving applications of machine learning and signal processing
for pattern recognition, voice detection, localization, separation,
enhancement, recognition, authentication, coding, quality
assessment, and evaluation for neurodegenerative pathology
with a focus on robustness and generalization. She is currently
the head of artificial intelligence at the Center for Innovation and
Business Creation at TU München, Germany. Her research
interests include artificial intelligence and health informatics.
Alexandre Hyafil (alexandre.hyafil@gmail.com) received
his B.Sc. and M.Sc. degrees from École Polytechnique,

IEEE Signal Processing Magazine

|

May 2018

|

107



Table of Contents for the Digital Edition of IEEE Signal Processing - May 2018

Contents
IEEE Signal Processing - May 2018 - Cover1
IEEE Signal Processing - May 2018 - Cover2
IEEE Signal Processing - May 2018 - Contents
IEEE Signal Processing - May 2018 - 2
IEEE Signal Processing - May 2018 - 3
IEEE Signal Processing - May 2018 - 4
IEEE Signal Processing - May 2018 - 5
IEEE Signal Processing - May 2018 - 6
IEEE Signal Processing - May 2018 - 7
IEEE Signal Processing - May 2018 - 8
IEEE Signal Processing - May 2018 - 9
IEEE Signal Processing - May 2018 - 10
IEEE Signal Processing - May 2018 - 11
IEEE Signal Processing - May 2018 - 12
IEEE Signal Processing - May 2018 - 13
IEEE Signal Processing - May 2018 - 14
IEEE Signal Processing - May 2018 - 15
IEEE Signal Processing - May 2018 - 16
IEEE Signal Processing - May 2018 - 17
IEEE Signal Processing - May 2018 - 18
IEEE Signal Processing - May 2018 - 19
IEEE Signal Processing - May 2018 - 20
IEEE Signal Processing - May 2018 - 21
IEEE Signal Processing - May 2018 - 22
IEEE Signal Processing - May 2018 - 23
IEEE Signal Processing - May 2018 - 24
IEEE Signal Processing - May 2018 - 25
IEEE Signal Processing - May 2018 - 26
IEEE Signal Processing - May 2018 - 27
IEEE Signal Processing - May 2018 - 28
IEEE Signal Processing - May 2018 - 29
IEEE Signal Processing - May 2018 - 30
IEEE Signal Processing - May 2018 - 31
IEEE Signal Processing - May 2018 - 32
IEEE Signal Processing - May 2018 - 33
IEEE Signal Processing - May 2018 - 34
IEEE Signal Processing - May 2018 - 35
IEEE Signal Processing - May 2018 - 36
IEEE Signal Processing - May 2018 - 37
IEEE Signal Processing - May 2018 - 38
IEEE Signal Processing - May 2018 - 39
IEEE Signal Processing - May 2018 - 40
IEEE Signal Processing - May 2018 - 41
IEEE Signal Processing - May 2018 - 42
IEEE Signal Processing - May 2018 - 43
IEEE Signal Processing - May 2018 - 44
IEEE Signal Processing - May 2018 - 45
IEEE Signal Processing - May 2018 - 46
IEEE Signal Processing - May 2018 - 47
IEEE Signal Processing - May 2018 - 48
IEEE Signal Processing - May 2018 - 49
IEEE Signal Processing - May 2018 - 50
IEEE Signal Processing - May 2018 - 51
IEEE Signal Processing - May 2018 - 52
IEEE Signal Processing - May 2018 - 53
IEEE Signal Processing - May 2018 - 54
IEEE Signal Processing - May 2018 - 55
IEEE Signal Processing - May 2018 - 56
IEEE Signal Processing - May 2018 - 57
IEEE Signal Processing - May 2018 - 58
IEEE Signal Processing - May 2018 - 59
IEEE Signal Processing - May 2018 - 60
IEEE Signal Processing - May 2018 - 61
IEEE Signal Processing - May 2018 - 62
IEEE Signal Processing - May 2018 - 63
IEEE Signal Processing - May 2018 - 64
IEEE Signal Processing - May 2018 - 65
IEEE Signal Processing - May 2018 - 66
IEEE Signal Processing - May 2018 - 67
IEEE Signal Processing - May 2018 - 68
IEEE Signal Processing - May 2018 - 69
IEEE Signal Processing - May 2018 - 70
IEEE Signal Processing - May 2018 - 71
IEEE Signal Processing - May 2018 - 72
IEEE Signal Processing - May 2018 - 73
IEEE Signal Processing - May 2018 - 74
IEEE Signal Processing - May 2018 - 75
IEEE Signal Processing - May 2018 - 76
IEEE Signal Processing - May 2018 - 77
IEEE Signal Processing - May 2018 - 78
IEEE Signal Processing - May 2018 - 79
IEEE Signal Processing - May 2018 - 80
IEEE Signal Processing - May 2018 - 81
IEEE Signal Processing - May 2018 - 82
IEEE Signal Processing - May 2018 - 83
IEEE Signal Processing - May 2018 - 84
IEEE Signal Processing - May 2018 - 85
IEEE Signal Processing - May 2018 - 86
IEEE Signal Processing - May 2018 - 87
IEEE Signal Processing - May 2018 - 88
IEEE Signal Processing - May 2018 - 89
IEEE Signal Processing - May 2018 - 90
IEEE Signal Processing - May 2018 - 91
IEEE Signal Processing - May 2018 - 92
IEEE Signal Processing - May 2018 - 93
IEEE Signal Processing - May 2018 - 94
IEEE Signal Processing - May 2018 - 95
IEEE Signal Processing - May 2018 - 96
IEEE Signal Processing - May 2018 - 97
IEEE Signal Processing - May 2018 - 98
IEEE Signal Processing - May 2018 - 99
IEEE Signal Processing - May 2018 - 100
IEEE Signal Processing - May 2018 - 101
IEEE Signal Processing - May 2018 - 102
IEEE Signal Processing - May 2018 - 103
IEEE Signal Processing - May 2018 - 104
IEEE Signal Processing - May 2018 - 105
IEEE Signal Processing - May 2018 - 106
IEEE Signal Processing - May 2018 - 107
IEEE Signal Processing - May 2018 - 108
IEEE Signal Processing - May 2018 - 109
IEEE Signal Processing - May 2018 - 110
IEEE Signal Processing - May 2018 - 111
IEEE Signal Processing - May 2018 - 112
IEEE Signal Processing - May 2018 - 113
IEEE Signal Processing - May 2018 - 114
IEEE Signal Processing - May 2018 - 115
IEEE Signal Processing - May 2018 - 116
IEEE Signal Processing - May 2018 - 117
IEEE Signal Processing - May 2018 - 118
IEEE Signal Processing - May 2018 - 119
IEEE Signal Processing - May 2018 - 120
IEEE Signal Processing - May 2018 - 121
IEEE Signal Processing - May 2018 - 122
IEEE Signal Processing - May 2018 - 123
IEEE Signal Processing - May 2018 - 124
IEEE Signal Processing - May 2018 - 125
IEEE Signal Processing - May 2018 - 126
IEEE Signal Processing - May 2018 - 127
IEEE Signal Processing - May 2018 - 128
IEEE Signal Processing - May 2018 - 129
IEEE Signal Processing - May 2018 - 130
IEEE Signal Processing - May 2018 - 131
IEEE Signal Processing - May 2018 - 132
IEEE Signal Processing - May 2018 - Cover3
IEEE Signal Processing - May 2018 - Cover4
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201809
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201807
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201805
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201803
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201801
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0917
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0717
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0517
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0317
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0916
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0716
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0516
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0316
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0915
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0715
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0515
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0315
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0914
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0714
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0514
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0314
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0913
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0713
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0513
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0313
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0912
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0712
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0512
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0312
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0911
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0711
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0511
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0311
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0910
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0710
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0510
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0310
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0909
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0709
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0509
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0309
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1108
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0908
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0708
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0508
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0308
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0108
https://www.nxtbookmedia.com