Signal Processing - November 2016 - 53

Table 1. Major technical challenges for the in-vehicle dialog system.
Dialog System/Components

Features

Challenges

ASR

Noise robustness

Robust ASR under different driving conditions with different types of noise environments,
e.g., road, wind, or traffic noises.

Cognitive state
sensitivity

Learning the driver cognitive states from speech for better recognition: what his/her
emotional state is, whether he/she is busy with maneuvering the vehicle, drowsy,
listening to radio or music, or thinking about something else.

Addressee detection

Discriminate between system-directed from human-directed (or ambient) speech
without using push-to-talk button. Tracking user conversation with the system for
multiple dialog turns (also DM).

Spoken language
understanding

Make sense from broken human speech with high level of hesitation, revision,
or wrong word order or high speech recognition error rate.

Multimodal understanding

Integration of input from other natural modes (e.g., gesture on surface or in air,
eye gaze) and accurate understanding of user intent.

IU

DM

Decision about what system states need to be communicated to the drivers and at what
conditions in coordination of the requests from the user in the dynamic changing context.
Careful management of driver emotion, and selected content to the driver for the condition.

Knowledge management

Domain knowledge to be selected based on dynamic context and personal preference
via user modeling and used to facilitate the IU as well as recommendation and decision
making.

Natural language generation

Context-dependent content generation and prioritization. Discourse-aware flexible
expression generation under different driving conditions and requested item
availability as the template-based approach will no longer be sufficient.

Application manager

System level

Fast multidomain integration

Multiple nomadic devices may be brought into the vehicles. To specify devices and
services so to voice-enable them, integrate devices dynamically (plug and play) or
perform service composition.

Multiagent integration

Coordination, integration, and enrichment of the output from multiple assistant agents
brought in primarily through various smart mobile devices, such as Siri, Google Now,
Cortana, or embedded systems.

Testing

Systematic testing of whole-system behavior with respect to system latency, response
appropriateness, and task-completion rate.

Specification/validation

Develop a detailed specification with many different possible combinations of scenarios,
and validate the resulting system according to system specification based on standards,
e.g., ISO 26262.

most well known in-vehicle dialog system products include those
limited in-vehicle domains [27]. These gains will likely carry
from Ford Sync (2007-2012). The GM CUE system took a major
over to other modules in the future.
step in 2012 by incorporating natural speech beyond restricted
While these general IPAs benefit from leveraging huge comgrammars into its embedded dialog system, resulting in a noticeputing resources in the cloud, they are highly dependent on Internet
ably improved user experience [28]. In the
connectivity. However, many remote locameantime, cloud-based intelligent assistance
tions, where drivers need navigation systems
While many forwardsystems have popularized in-vehicle speech
the most, do not have Internet connectivity.
looking building blocks
use, due to their relatively high languageThis crucial limitation provides a strong jusin dialog systems have
understanding accuracy and rich applications
tification for the role of an embedded global
been investigated in
not available in embedded in-vehicle systems.
positioning system and associated embedded
research projects, the
However, the impact of cloud-based speech
speech interface.
conversion rate into
technologies is mostly limited to the ASR
In summary, menu-structured multimodal
accuracy and text-to-speech (TTS) naturalinteraction is the current state-of-the-art techin-vehicle embedded
ness for a dialog system development. Despite
nology for operating in-vehicle infotainment
products has been slow.
all the technology and product limitation, it is
products. Embedded voice interface soluclear that a general direction for the in-vehicle interaction is contions are less popular, due to limited recognition accuracy and
versational and intelligent dialog (CID) systems.
lack of natural language capability. Cloud-based IPAs provide a
good alternative for in-vehicle uses but with only a shallow integration. While many forward-looking building blocks in dialog
Challenges
systems-such as disfluency detection and content manageIn many ways, the technical challenges for in-vehicle intelligent
ment-have been investigated in research projects, the conversion
dialog systems bear similarity to those intended for general-purrate into in-vehicle embedded products has been slow. Among the
pose dialog systems, especially in the context of hands-busy and
IEEE SIgnal ProcESSIng MagazInE

|

November 2016

|

53



Table of Contents for the Digital Edition of Signal Processing - November 2016

Signal Processing - November 2016 - Cover1
Signal Processing - November 2016 - Cover2
Signal Processing - November 2016 - 1
Signal Processing - November 2016 - 2
Signal Processing - November 2016 - 3
Signal Processing - November 2016 - 4
Signal Processing - November 2016 - 5
Signal Processing - November 2016 - 6
Signal Processing - November 2016 - 7
Signal Processing - November 2016 - 8
Signal Processing - November 2016 - 9
Signal Processing - November 2016 - 10
Signal Processing - November 2016 - 11
Signal Processing - November 2016 - 12
Signal Processing - November 2016 - 13
Signal Processing - November 2016 - 14
Signal Processing - November 2016 - 15
Signal Processing - November 2016 - 16
Signal Processing - November 2016 - 17
Signal Processing - November 2016 - 18
Signal Processing - November 2016 - 19
Signal Processing - November 2016 - 20
Signal Processing - November 2016 - 21
Signal Processing - November 2016 - 22
Signal Processing - November 2016 - 23
Signal Processing - November 2016 - 24
Signal Processing - November 2016 - 25
Signal Processing - November 2016 - 26
Signal Processing - November 2016 - 27
Signal Processing - November 2016 - 28
Signal Processing - November 2016 - 29
Signal Processing - November 2016 - 30
Signal Processing - November 2016 - 31
Signal Processing - November 2016 - 32
Signal Processing - November 2016 - 33
Signal Processing - November 2016 - 34
Signal Processing - November 2016 - 35
Signal Processing - November 2016 - 36
Signal Processing - November 2016 - 37
Signal Processing - November 2016 - 38
Signal Processing - November 2016 - 39
Signal Processing - November 2016 - 40
Signal Processing - November 2016 - 41
Signal Processing - November 2016 - 42
Signal Processing - November 2016 - 43
Signal Processing - November 2016 - 44
Signal Processing - November 2016 - 45
Signal Processing - November 2016 - 46
Signal Processing - November 2016 - 47
Signal Processing - November 2016 - 48
Signal Processing - November 2016 - 49
Signal Processing - November 2016 - 50
Signal Processing - November 2016 - 51
Signal Processing - November 2016 - 52
Signal Processing - November 2016 - 53
Signal Processing - November 2016 - 54
Signal Processing - November 2016 - 55
Signal Processing - November 2016 - 56
Signal Processing - November 2016 - 57
Signal Processing - November 2016 - 58
Signal Processing - November 2016 - 59
Signal Processing - November 2016 - 60
Signal Processing - November 2016 - 61
Signal Processing - November 2016 - 62
Signal Processing - November 2016 - 63
Signal Processing - November 2016 - 64
Signal Processing - November 2016 - 65
Signal Processing - November 2016 - 66
Signal Processing - November 2016 - 67
Signal Processing - November 2016 - 68
Signal Processing - November 2016 - 69
Signal Processing - November 2016 - 70
Signal Processing - November 2016 - 71
Signal Processing - November 2016 - 72
Signal Processing - November 2016 - 73
Signal Processing - November 2016 - 74
Signal Processing - November 2016 - 75
Signal Processing - November 2016 - 76
Signal Processing - November 2016 - 77
Signal Processing - November 2016 - 78
Signal Processing - November 2016 - 79
Signal Processing - November 2016 - 80
Signal Processing - November 2016 - 81
Signal Processing - November 2016 - 82
Signal Processing - November 2016 - 83
Signal Processing - November 2016 - 84
Signal Processing - November 2016 - 85
Signal Processing - November 2016 - 86
Signal Processing - November 2016 - 87
Signal Processing - November 2016 - 88
Signal Processing - November 2016 - 89
Signal Processing - November 2016 - 90
Signal Processing - November 2016 - 91
Signal Processing - November 2016 - 92
Signal Processing - November 2016 - 93
Signal Processing - November 2016 - 94
Signal Processing - November 2016 - 95
Signal Processing - November 2016 - 96
Signal Processing - November 2016 - 97
Signal Processing - November 2016 - 98
Signal Processing - November 2016 - 99
Signal Processing - November 2016 - 100
Signal Processing - November 2016 - 101
Signal Processing - November 2016 - 102
Signal Processing - November 2016 - 103
Signal Processing - November 2016 - 104
Signal Processing - November 2016 - 105
Signal Processing - November 2016 - 106
Signal Processing - November 2016 - 107
Signal Processing - November 2016 - 108
Signal Processing - November 2016 - 109
Signal Processing - November 2016 - 110
Signal Processing - November 2016 - 111
Signal Processing - November 2016 - 112
Signal Processing - November 2016 - 113
Signal Processing - November 2016 - 114
Signal Processing - November 2016 - 115
Signal Processing - November 2016 - 116
Signal Processing - November 2016 - 117
Signal Processing - November 2016 - 118
Signal Processing - November 2016 - 119
Signal Processing - November 2016 - 120
Signal Processing - November 2016 - 121
Signal Processing - November 2016 - 122
Signal Processing - November 2016 - 123
Signal Processing - November 2016 - 124
Signal Processing - November 2016 - 125
Signal Processing - November 2016 - 126
Signal Processing - November 2016 - 127
Signal Processing - November 2016 - 128
Signal Processing - November 2016 - 129
Signal Processing - November 2016 - 130
Signal Processing - November 2016 - 131
Signal Processing - November 2016 - 132
Signal Processing - November 2016 - 133
Signal Processing - November 2016 - 134
Signal Processing - November 2016 - 135
Signal Processing - November 2016 - 136
Signal Processing - November 2016 - 137
Signal Processing - November 2016 - 138
Signal Processing - November 2016 - 139
Signal Processing - November 2016 - 140
Signal Processing - November 2016 - 141
Signal Processing - November 2016 - 142
Signal Processing - November 2016 - 143
Signal Processing - November 2016 - 144
Signal Processing - November 2016 - 145
Signal Processing - November 2016 - 146
Signal Processing - November 2016 - 147
Signal Processing - November 2016 - 148
Signal Processing - November 2016 - Cover3
Signal Processing - November 2016 - Cover4
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201809
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201807
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201805
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201803
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_201801
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0917
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0717
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0517
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0317
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0117
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0916
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0716
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0516
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0316
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0116
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0915
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0715
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0515
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0315
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0115
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0914
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0714
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0514
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0314
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0114
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0913
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0713
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0513
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0313
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0113
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0912
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0712
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0512
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0312
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0112
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0911
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0711
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0511
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0311
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0111
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0910
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0710
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0510
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0310
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0110
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0909
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0709
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0509
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0309
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0109
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_1108
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0908
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0708
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0508
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0308
https://www.nxtbook.com/nxtbooks/ieee/signalprocessing_0108
https://www.nxtbookmedia.com