IEEE Circuits and Systems Magazine - Q4 2019 - 27

have been adopted to optimize the RNN training process. One approach is to initialize the weights of RNNs
in a proper manner to mitigate the vanishing gradient issue.
Alternatively, common activation functions (e.g., sigmoid or tanh) can be replaced with rectified linear units (ReLU) [56], [57].
Deploying improved RNN structures, such as the Long
Short Term Memory (LSTM) network [55], [58], [59] or
the Gated Recurrent Unit (GRU) [60]-[62], has become
the preferred solution.
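The effect of the activation function on gradient flow can be illustrated with a scalar toy recurrence. The sketch below is only an illustration under stated assumptions: the recurrence h_t = f(w * h_{t-1}), the function name `gradient_through_time`, and the values w = 1.0, h0 = 0.5 and 50 steps are hypothetical choices, not taken from [56] or [57].

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_time(activation, w, h0, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = f(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        a = w * h
        if activation == "sigmoid":
            h = sigmoid(a)
            local = h * (1.0 - h)           # sigmoid derivative, at most 0.25
        elif activation == "relu":
            h = max(0.0, a)
            local = 1.0 if a > 0 else 0.0   # ReLU derivative is 0 or 1
        else:
            raise ValueError("unknown activation: " + activation)
        grad *= local * w                   # chain rule across one time step
    return grad


# Hypothetical settings chosen only for illustration.
sig_grad = gradient_through_time("sigmoid", w=1.0, h0=0.5, steps=50)
relu_grad = gradient_through_time("relu", w=1.0, h0=0.5, steps=50)
print(sig_grad, relu_grad)
```

Because the sigmoid derivative never exceeds 0.25, the product over 50 steps shrinks towards zero, whereas the ReLU path with a unit recurrent weight keeps the gradient at 1.0 in this toy setting. Gated structures such as LSTM and GRU address the same problem with a learned, additive memory path rather than a fixed one.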
In the following, selected applications of deep discriminative algorithms in speech and audio signal generation are reviewed.
B. Applications
1) Speech Generation
Methods based on deep discriminative models are actively
being investigated for acoustic modeling and adaptation,
feature learning, and waveform generation in speech
generation. For acoustic modeling, DNNs are used to
replace the Gaussian mixture models (GMMs) in
evaluating how well frames of acoustic observations
fit HMM states [63]. In [64], a mixture density network
(MDN) was used as an acoustic model in statistical
parametric speech synthesis. MDNs give full probability density functions over real-valued output features
conditioned on the corresponding input features. This
approach addresses two limitations of statistical parametric speech
synthesis: the inability to predict variances and the unimodal nature of the
objective functions. DNNs can also be used to replace the decision trees in HMM-based statistical parametric speech
synthesis applications [47]. This alternative scheme helps overcome the limitations of the conventional decision tree-clustered context-dependent HMM-based
approach, such as the inefficient expression of complex
context dependencies and the fragmentation of training data. In
[65], DNN-based expressive speech synthesis was comprehensively investigated, especially for representing multiple emotions. In [66], a proof-of-concept system
for speech texture synthesis and voice conversion was
introduced. In this system, a cost function was optimized with respect to the input
waveform samples. Realistic speech babble and utterances in a different voice can
be reconstructed.
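To make the mixture density idea discussed above more concrete, the following sketch evaluates the density and negative log-likelihood of a one-dimensional Gaussian mixture whose parameters would, in an MDN, be produced by the network for a given input. It is a minimal illustration under stated assumptions: the function names, the scalar output, and the two-component parameter values are hypothetical and do not reproduce the model in [64].

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mdn_density(y, logits, means, log_sigmas):
    """Density of y under a 1-D Gaussian mixture described by raw outputs."""
    weights = softmax(logits)               # mixture weights sum to one
    density = 0.0
    for w, mu, ls in zip(weights, means, log_sigmas):
        sigma = math.exp(ls)                # exponentiation keeps sigma positive
        z = (y - mu) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return density


def mdn_nll(y, logits, means, log_sigmas):
    """Negative log-likelihood, the usual training loss for such a model."""
    return -math.log(mdn_density(y, logits, means, log_sigmas))


# Hypothetical outputs for one input frame: two equally weighted modes.
LOGITS = [0.0, 0.0]
MEANS = [-1.0, 1.0]
LOG_SIGMAS = [math.log(0.5), math.log(0.5)]

for y in (-1.0, 0.0, 1.0):
    print(y, mdn_density(y, LOGITS, MEANS, LOG_SIGMAS))
```

With these assumed parameters the density has two modes, at -1 and 1, and a dip at 0. A single Gaussian trained with a squared-error objective would place its mean near 0, which is the unimodal restriction mentioned above; the explicit sigma outputs correspond to the predicted variances.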
LSTM-based networks have also demonstrated
surprisingly good performance in statistical parametric
speech synthesis systems. In [67], a simplified LSTM architecture was proposed that achieves performance similar to the vanilla LSTM in speech synthesis with fewer parameters.
Another LSTM RNN was introduced
in [68], which utilized data from multiple languages and
speakers. Experimental results showed that this multilingual statistical parametric speech synthesis system
can generate speech in multiple languages from a single
model with satisfactory naturalness. SampleRNN,
as a state-of-the-art RNN based model, was proposed
in [69] for unconditional audio generation. This model
is able to capture underlying sources of variations in
the temporal sequences over very long time spans. The
samples generated by the SampleRNN are preferred by
human raters. In [70], RNNs with bi-directional LSTM
(BLSTM) cells were adopted to capture the correlations between any two instants in a speech utterance. This hybrid system of DNN and BLSTM-RNN can outperform
both the traditional HMM-based and the more recent
DNN-based statistical parametric speech synthesis
systems with a smoother speech trajectory. The use
of the deep BLSTM-RNN was also investigated in [71]
for voice conversion. A sequence-based conversion
method was proposed to improve the naturalness and
the continuity of the converted speech. Deep BLSTM-RNNs were used to model both the frame-wise relationship between source/target voice and the long-range
context-dependencies in the acoustic trajectory. The
resultant speech achieved a higher mean opinion score (MOS)
than DNN-based methods.
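The way a bidirectional LSTM relates any two instants of a sequence can be sketched with a scalar cell run in both directions. This is a simplified illustration under stated assumptions: the scalar state, the function names, and the gate parameters in `PARAMS` are hypothetical, and the sketch does not reproduce the architectures of [70] or [71].

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h + b

    i = sigmoid(pre("input"))       # input gate
    f = sigmoid(pre("forget"))      # forget gate
    o = sigmoid(pre("output"))      # output gate
    g = math.tanh(pre("cand"))      # candidate cell update
    c_new = f * c + i * g           # additive memory path
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def run_lstm(xs, p):
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hs.append(h)
    return hs


def run_blstm(xs, p_fwd, p_bwd):
    """Pair each frame's forward state with its backward state."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # process reversed, then realign
    return list(zip(fwd, bwd))


# Hypothetical gate parameters, chosen only for illustration.
PARAMS = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "cand": (0.8, 0.3, 0.0),
}

a = run_blstm([0.1, 0.2, 0.3, 0.4], PARAMS, PARAMS)
b = run_blstm([0.1, 0.2, 0.3, -0.9], PARAMS, PARAMS)
print(a[0], b[0])
```

Changing only the last input leaves the forward state of the first frame untouched but alters its backward state, so the paired output at every frame carries information from both the past and the future of the sequence.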
2) Other Types Of Audio Signal
Recently, deep neural networks have been applied in
music generation to meet the demands of music composition on various application platforms. In [72],
an end-to-end melody and arrangement generation
framework was proposed. This framework can generate melody tracks with several accompanying tracks
played by diverse types of instruments via applying
a multi-instrument co-arrangement model. In [73],
a set of parallel, tied-weight recurrent networks was
trained to model the polyphonic music. Two modified
architectures were proposed and combined for the
music generation and prediction task. Experimental
results (see footnote 5) demonstrated that the proposed models can
reproduce complex rhythms, melodies and counterpoints in some cases. LSTMs were adopted in [74] for
the generation of polyphonic music. In this work, a
chord LSTM was used to predict a chord progression
based on a chord embedding. Then another LSTM was
used to generate polyphonic music from the predicted
chord progression. The generated music had a clear
long-term structure and sounded as harmonic as
music played by a musician.
In [75], a gated PixelCNN [76] was applied in a singing
synthesizer. The harmonic spectral envelope was modeled by the network. In [77], the relationship between
5 https://www.cs.hmc.edu/~ddjohnson/tied-parallel/