Signal Processing - November 2016 - 54

eyes-busy scenarios. Because of the special nature of in-vehicle use, however, additional challenges are apparent from in-vehicle data collection efforts and in-vehicle dialog system development activities mentioned in the section "Review of Past Major Activities." Some important challenges are summarized in Table 1.

One may look at these challenges as arising from three interdependent factors: 1) the driver, 2) the environment, and 3) the automotive industry.

■ The driver. Drivers typically have short attention spans during interaction with in-vehicle dialog systems, as driving is their primary task. From 2020 to 2030, drivers will begin to be exposed to autonomous driving technologies, freeing them from constant attention to vehicle control. However, according to projections by McKinsey [39], the adoption of self-driving vehicles will likely be less than 15% by 2030. HMI while driving will continue to be an important challenge. As a result, CID systems have to handle challenges on many different levels [9]. Drivers' speech may be fragmented, disfluent, and repetitive. Drivers may need to hear shorter or simpler responses; they may expect the system to provide straightforward recommendations instead of a potentially overwhelming number of choices. When systems make recommendations, they should anticipate the driver's contextual needs to the point that the driver has the feeling that she or he doesn't need to state the obvious, so driver behaviors and personal preferences need to be taken into consideration to avoid excessive conversational turns. Additionally, in one vehicle, there may be multiple passengers, each requiring a separate preference profile. A sophisticated dialog system should be able to coordinate requests from different users, know when not to interrupt human-human conversations, and come up with an optimized response for the group.

■ The environment. The in-car environment is generally more dynamic and has higher stakes than the contexts in which other dialog systems are deployed. Inside the vehicle, information from sensors reflects vehicle status changes, some of which need immediate attention (an engine breaking down), while others can be handled at a later time (an oil change warning). Differences in design, interior materials, and mechanics create different acoustic environments and background noises inside vehicles. Outside the vehicle, the physical environment is also diverse and dynamic. A vehicle may be on a highway or a city road, accelerating or decelerating, on gravel or asphalt, creating significantly different background noises. Traffic conditions can vary significantly: the driver may be stuck in stop-and-go traffic, or may be traveling unimpeded at high speed. Harsh weather, such as wind, rain, hail, or thunder, typically requires additional attention from the driver and alters the in-vehicle acoustics. In such cases, the dialog system may need to use different dialog strategies with respect to taking the initiative to engage the driver. Available services (e.g., gas station or parking) delivered via TTS from users' mobile devices add another dimension of complexity and compete for drivers' attention. Managing multiple assistance systems from different devices will become an important development consideration.

■ The automotive industry. Two key issues are critical to automotive companies' decisions about whether or not to adopt new technologies. One is safety. Improper implementation of certain technologies could lead to fatal accidents. The other is reliability. When a component is installed in a vehicle, it must work properly for a long time. Vehicle parts and features installed are typically required to function properly for at least ten years. For these safety and reliability reasons, this industry is highly regulated by the government. Relevant regulations and guidelines include ISO 26262 [29], ISO 15504 (Automotive SPICE), and IEC 61508 [30]. Due to such requirements, the fast product development cycles common in the consumer electronics or Internet world may not be directly applicable in this industry. In the automotive industry, a rigorous process covering design specification, development, testing, and validation is followed to ensure the resulting product quality meets requirements. For example, to develop a CID system, one needs to specify the system coverage and performance by listing many phrasal and interactive variations under different noise conditions by speakers from different dialect regions, leading to a huge number of combined testing cases. Complete testing of these cases against various requirements becomes more and more challenging, especially with increasing coverage needs from the users.

[Pull quote: The automotive industry has taken two different approaches to address the arrival of cloud-based, voice-enabled assistance systems from major IT companies.]

The combination of these three factors causes many technical challenges. We next highlight some critical ones that have long-term impact in the areas of speech recognition and understanding, multiple speaker conversation coordination, the effect of driver behaviors and states for safety, as well as the integration of general intelligent assistance systems.

Challenges in speech recognition and understanding

From a historical perspective, speech recognition in the car started with small vocabulary systems primarily for command and control, along with optimization of either microphone placement or multimicrophone array processing to suppress the diverse noise sources present for in-vehicle scenarios [31]. More recent efforts have focused on expanding speech recognition coverage to additional in-vehicle domains.

The in-vehicle acoustic environment is complex and dynamic. Factors in this acoustic environment include noise from air conditioning units, wiper blades, the engine, external traffic, the road surface, wind, open windows, and inclement weather. The level of background noise while windows were
54 | IEEE Signal Processing Magazine | November 2016

