Signal Processing - May 2017 - 38

process referred to as audio source culling. This article is concerned with spatial audio systems and methodologies that substantially rely on psychoacoustics.

Spatial auditory perception
The psychophysics of spatial hearing has been an active
research area for the past century. Most of the information
given in this section has been thoroughly reviewed by Blauert
in [13]. Interested readers are referred to this excellent volume
for more information and an extensive set of further references.
The primary mechanism humans use to localize sound
sources in the horizontal plane is based on the differences
between the signals received by the two ears. Due to the spatial separation between the ears, the sound wave generated by
a sound source reaches the two ears with a different delay,
called the interaural time difference (ITD). Moreover, the
sound wave is scattered by the head, causing the level of the
signal at the ear farther away from the source, the contralateral
ear, to be reduced in comparison with the level of the signal at
the ear closer to the source, the ipsilateral ear. This level difference is called the interaural level difference (ILD).
The ITD for a typical human head lies within about ±750 µs in the acoustic free field. Humans can detect
ITDs as low as 10-20 µs for sources near the front, corresponding to
about 1° in the horizontal plane. The ILD is frequency
dependent and can be as high as 21 dB at 10 kHz. Sensitivity to
changes in the ILD is also frequency dependent. For instance,
for pure tones, it varies between 0.5 and 2.5 dB. In contrast with
the ITD, which is the primary localization cue at low frequencies, ILD cues are more important in sound source localization
at higher frequencies. This is because the head scatters little sound at
low frequencies, where the wavelength is comparable to or larger than
the size of the head. ITD and ILD cues also change with the
distance of a sound source and the size of the head.
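To make the orders of magnitude above concrete, the following minimal sketch evaluates the classic Woodworth spherical-head approximation, ITD = (a/c)(θ + sin θ), for a far-field source. This formula, the head radius of 8.75 cm, and the speed of sound of 343 m/s are assumptions introduced here for illustration; they are not taken from the article. The second helper, `head_size_frequency`, is a hypothetical name for the frequency at which the wavelength equals the head diameter, a rough marker for where head scattering (and hence the ILD) starts to become significant.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 degrees C)
HEAD_RADIUS = 0.0875    # m, assumed radius for a spherical-head model


def woodworth_itd(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD in seconds for a far-field source and a rigid spherical head.

    azimuth_deg is measured from straight ahead; the approximation is
    intended for azimuths between -90 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (radius / c) * (theta + math.sin(theta))


def head_size_frequency(radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Frequency in Hz whose wavelength equals the head diameter."""
    return c / (2.0 * radius)


if __name__ == "__main__":
    for az in (0, 1, 30, 90):
        itd_us = woodworth_itd(az) * 1e6
        print(f"azimuth {az:3d} deg -> ITD {itd_us:6.1f} us")
    print(f"wavelength equals head diameter near {head_size_frequency():.0f} Hz")
```

Under these assumptions, a 1° shift away from the front changes the ITD by a little under 10 µs, and a fully lateral source gives an ITD of several hundred microseconds, the same order as the detection threshold and maximum quoted above.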
Note that ITD and ILD pairs do not uniquely specify the
source direction. For the purpose of illustration, if we assume a
spherical head, binaural cues will be identical for sound sources placed on cone-shaped surfaces at each side of the head.
These surfaces are called cones of confusion. In the horizontal plane, sources lying on the intersection
of that plane with a cone of confusion exhibit
front-back ambiguity. Humans can typically resolve this ambiguity by small head movements.
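The ambiguity can be checked numerically with an even simpler model than the spherical head: two point ears on the interaural axis with no head in between. This is an assumption made purely for illustration, as are the function name `path_difference`, the ear offset, and the 2 m source distance. The sketch computes the left-minus-right path length for a source direction and shows that a frontal source, its mirror image behind the listener, and an elevated source on the same cone all give the same value.

```python
import math

EAR_OFFSET = 0.0875  # m, assumed half-distance between the two ears


def path_difference(azimuth_deg, elevation_deg, distance=2.0, a=EAR_OFFSET):
    """Left-minus-right path length in meters for two point ears, no head.

    The ears sit at (-a, 0, 0) and (+a, 0, 0); x points right, y points
    forward, z points up. Azimuth is measured from the front toward the right.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.cos(el) * math.cos(az)
    z = distance * math.sin(el)
    d_left = math.sqrt((x + a) ** 2 + y ** 2 + z ** 2)
    d_right = math.sqrt((x - a) ** 2 + y ** 2 + z ** 2)
    return d_left - d_right


if __name__ == "__main__":
    front = path_difference(30.0, 0.0)   # 30 degrees right, in front
    back = path_difference(150.0, 0.0)   # mirror image behind the listener
    above = path_difference(90.0, 60.0)  # elevated, same lateral angle
    print(f"front {front:.6f} m, back {back:.6f} m, above {above:.6f} m")
```

In this model the path difference depends only on the source distance and its lateral coordinate, so every direction with the same lateral angle is indistinguishable from the interaural difference alone.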
The elevation of a sound source is perceived based primarily on the spectral shaping of its signal that occurs as a result of
the scattering of the sound around the head. This spectral shaping depends on the elevation in a manner determined by the
sizes and shapes of the pinnae, head, and torso. Consequently,
the frequency content of the sound itself also affects the perception of the elevation of its source.
Subjective localization of sound sources involves a significant
level of uncertainty. Localization blur is the smallest change in
the direction of a source that will result in a change in its perceived direction. For sources in the horizontal plane, localization
blur is generally below about 10°. For sources in the median
plane, localization blur on the order of 20° can be observed.
A related concept is locatedness, which refers to the perception
of the spatial extent of a sound source. This is an important attribute because the center of mass of a sound source can be localized accurately, yet the source can still be diffusely located. Two
other measures of spatial resolution of hearing are the minimum
audible angle (MAA) and minimum audible movement angle
(MAMA). The MAA corresponds to the minimum change in the
direction of a static source for a listener to discriminate it as being
to the left or to the right of the original direction. The MAMA,
on the other hand, is a measure of spatial resolution for moving
sources; it quantifies the smallest arc that a moving sound source
must travel to be discriminable from a stationary source [14].
The perception of the distance of a sound source is both
less reliable and less well understood than the perception of the
direction of a sound source. Several cues affect the perception
of distance. Among these, intensity is the only cue inherently
related to the sound source and is also the only absolute cue.
The other distance cues are related either to the environment
(the direct-to-reverberant energy ratio and lateral reflections),
the physical properties of the listener (e.g., auditory parallax),
or cognitive aspects (e.g., familiarity) [15].
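As a rough illustration of how the intensity and direct-to-reverberant cues vary with distance, the sketch below uses two textbook idealizations that are assumptions here rather than statements from the article: a point source in the free field, whose direct-sound level falls by about 6 dB per doubling of distance, and a diffuse reverberant field of constant level, so that the direct-to-reverberant ratio is 0 dB at the critical distance. The function names and the 2 m critical distance are hypothetical.

```python
import math


def level_change_db(r_ref, r):
    """Change in direct-sound level (dB) when moving from r_ref to r.

    Assumes a point source in the free field (inverse-square law).
    """
    return -20.0 * math.log10(r / r_ref)


def direct_to_reverberant_db(r, critical_distance):
    """Direct-to-reverberant energy ratio (dB) at distance r.

    Assumes an ideal diffuse reverberant field whose level does not
    depend on r; the ratio is 0 dB at the critical distance.
    """
    return 20.0 * math.log10(critical_distance / r)


if __name__ == "__main__":
    for r in (1.0, 2.0, 4.0, 8.0):
        print(
            f"r = {r:3.0f} m: direct level {level_change_db(1.0, r):6.1f} dB re 1 m, "
            f"D/R {direct_to_reverberant_db(r, 2.0):6.1f} dB"
        )
```

Note that the level change is only informative relative to a reference, whereas the direct-to-reverberant ratio depends on the room through the critical distance.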
An interesting property of distance perception is the overestimation and underestimation of distance at different ranges
and for different sounds. Apparent distances of sources far
away from a listener are underestimated, and those closer than
around 1-2 m are overestimated [15]. Familiarity, which is a
cognitive cue related to prior exposure to and knowledge of the
characteristics of the sound source, also has a similar effect.
For example, the distance of whispered speech is underestimated, while that of shouted speech is overestimated.
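One simple way to picture this pattern of overestimation at short range and underestimation at long range is a compressive power law, r' = k r^a with a < 1 and k > 1. The sketch below is only an illustration of that shape: the functional form is an assumption, and the parameter values k = 1.32 and a = 0.54 are placeholders chosen so that the crossover falls in the 1-2 m range mentioned above, not values taken from the article.

```python
K = 1.32  # assumed scale factor (illustrative placeholder)
A = 0.54  # assumed compressive exponent (illustrative placeholder)


def perceived_distance(r, k=K, a=A):
    """Apparent distance (m) of a source at physical distance r (m)."""
    if r <= 0:
        raise ValueError("distance must be positive")
    return k * r ** a


def crossover_distance(k=K, a=A):
    """Distance at which apparent and physical distance coincide."""
    return k ** (1.0 / (1.0 - a))


if __name__ == "__main__":
    print(f"crossover near {crossover_distance():.2f} m")
    for r in (0.5, 1.0, 5.0, 20.0):
        print(f"physical {r:5.1f} m -> apparent {perceived_distance(r):5.2f} m")
```

With any a < 1 and k > 1, sources nearer than the crossover are reported as farther than they are, and sources beyond it as nearer.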
An important capability of the human auditory perception
mechanism lies in its ability to localize sources in reverberant
environments such as rooms and other enclosed spaces. This is
made possible by suppressing reflections that come immediately after the direct sound. When a broadband impulse and a
delayed copy of it are presented from different directions with
a short delay of less than 1 ms in between, a single auditory
event is perceived at a direction between the directions of the
two sources, gradually shifting toward the leading source as the
lag in the time of arrival increases. (An auditory event is defined
as an event perceived by a listener, typically but not necessarily in response to a sound event.) This effect is known as summing localization, and both sources contribute to the perceived
direction of the auditory event. When the delay is between 1 and
5 ms, a single fused auditory event close to the leading source
can be heard. Within this delay range, the presence of the lagging source is audible since it changes the timbre of the auditory
event, but its direction cannot be easily discriminated. Above
5 ms, the broadband click and its echo are perceived as distinct
auditory events. The time delay above which two distinct events are
heard is called the echo threshold. While the classic demonstration of these effects involves broadband click pairs, different signals will have different echo thresholds. For example, the echo
threshold can be as high as 20 ms for speech and music signals.
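The delay ranges described above can be summarized as a small lookup. The following sketch simply encodes the boundaries given in the text (1 ms, and an echo threshold of 5 ms for clicks); the function name `lead_lag_percept`, the label strings, and the use of the 20 ms upper figure as a single threshold for speech and music are assumptions made for illustration, since real thresholds vary with the signal and the listener.

```python
# Echo thresholds in milliseconds. The click value follows the text; the
# speech and music values use the upper figure quoted there (assumed here
# as a single representative threshold).
ECHO_THRESHOLD_MS = {"click": 5.0, "speech": 20.0, "music": 20.0}


def lead_lag_percept(delay_ms, signal="click"):
    """Return the expected percept for a lead-lag pair with the given delay."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    threshold = ECHO_THRESHOLD_MS[signal]
    if delay_ms < 1.0:
        # Both sources contribute to the perceived direction.
        return "summing localization"
    if delay_ms <= threshold:
        # One fused event near the lead; the lag mainly alters timbre.
        return "fused event near the leading source"
    return "two distinct events (echo)"


if __name__ == "__main__":
    for delay in (0.5, 3.0, 10.0):
        print(delay, "ms click ->", lead_lag_percept(delay))
    print(10.0, "ms speech ->", lead_lag_percept(10.0, "speech"))
```

For example, a 10 ms lag is classified as an echo for clicks but as a fused event for speech under these assumed thresholds.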
The effect that the direction of the auditory event depends
predominantly on the leading source is known as localization

IEEE Signal Processing Magazine | May 2017
