Signal Processing - May 2017 - 39
dominance, whereas the perception of a single auditory event
when there are two sound events is called fusion. The
suppression of the ability to discriminate the direction of the lagging sound
source is termed lag discrimination suppression.
These three effects are collectively known as the precedence effect [16].
Another important binaural cue is interaural coherence (IC),
which is a measure of the coherence of signals received by the
two ears. IC is high for sounds coming directly from the source,
where the two ear signals are highly correlated, and low in the
diffuse sound field, where the correlation is low. Therefore, IC
provides information about the level of reverberation and thus
about the spaciousness of the environment.
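As a rough illustration of how IC can be quantified, the sketch below uses one common definition that is assumed here, since the text gives no formula: the peak magnitude of the normalized cross-correlation between the two ear signals, searched over lags of about ±1 ms. The function name `interaural_coherence`, the lag range, and the noise test signals are illustrative choices, not taken from the article.

```python
import numpy as np

def interaural_coherence(left, right, fs, max_lag_ms=1.0):
    """Peak of the normalized interaural cross-correlation (between 0 and 1).

    Assumed definition: max over lags within +/- max_lag_ms of
    |sum_n left[n + lag] * right[n]| / sqrt(energy(left) * energy(right)).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    if norm == 0.0:
        return 0.0
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    max_lag = min(max_lag, len(left) - 1, len(right) - 1)
    # In "full" mode, index len(right) - 1 corresponds to zero lag.
    xcorr = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    window = xcorr[zero - max_lag: zero + max_lag + 1]
    return float(np.max(np.abs(window)) / norm)

# Toy signals (not measurements): a delayed copy stands in for direct sound,
# two independent noise signals stand in for a diffuse field.
rng = np.random.default_rng(0)
fs = 44100
direct = rng.standard_normal(4096)
ic_direct = interaural_coherence(direct, np.roll(direct, 10), fs)
ic_diffuse = interaural_coherence(rng.standard_normal(4096),
                                  rng.standard_normal(4096), fs)
```

With these toy inputs, `ic_direct` comes out close to 1 and `ic_diffuse` close to 0, matching the qualitative behavior described above. Practical estimators usually work per frequency band and over short time frames, which this sketch omits.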
The history of perceptually motivated spatial audio
Binaural audio and multichannel stereophony are two of the
most common spatial audio technologies, predating more
recent technologies such as WFS by more than half a century.
Binaural audio has found extensive use and received renewed
interest, especially for virtual reality (VR) applications, while
multichannel systems have been the de facto standard for
home entertainment and automotive audio systems.
Simultaneously, due to the popularity and market dominance
of two-channel audio formats, stereophony using two loudspeakers is still commonly employed.
Binaural audio
Binaural audio is based on a simple assumption: if the signals
that would be received at the ears of a listener as a result of an
acoustic event are provided to the listener with sufficient accuracy, the person will perceive an auditory event corresponding
to the original acoustic event. These ear signals can either be
recorded with microphones placed in the ear canals of an
artificial human head, such as the Knowles Electronics Manikin
for Acoustic Research (KEMAR) or the Neumann KU-100, or synthesized
using signal processing methods. In both cases, the signals are
usually presented over a pair of headphones.
The microphones used for recording binaural audio are
also termed dummy head microphones and are manufactured
to resemble a typical human head. The external ears of these
microphones are typically molded in silicone and are modeled
after the external ears of humans who have exceptional spatial
hearing acuity. The recorded signals need to be played back
using headphones equalized appropriately with free-field or
diffuse-field equalization, depending on the environment in
which the recording was made [1].
Binaural synthesis is based on the knowledge of the acoustic
transfer paths between the source and the two ears. These paths
are characterized by their impulse responses, referred to as
head-related impulse responses (HRIRs), or equivalently by the head-related transfer
functions (HRTFs) in the frequency domain. For each source position, there
are two HRIRs, one for the left ear and one for the right.
When HRIRs are convolved with dry source signals, the resulting
signals will incorporate the necessary binaural cues for the given
source position. In the case of a sound field created by P sources
in the far field, the right and left ear signals can be synthesized as
$$x_L(n) = \sum_{p=1}^{P} x_p(n) * h_{L,\theta_p,\phi_p}(n), \tag{1}$$
$$x_R(n) = \sum_{p=1}^{P} x_p(n) * h_{R,\theta_p,\phi_p}(n), \tag{2}$$
where $x_p(n)$ is the pressure signal due to source $p$; $h_{L,\theta_p,\phi_p}(n)$
and $h_{R,\theta_p,\phi_p}(n)$ represent the HRIRs for the left and the right
ears for a source at a direction $(\theta, \phi)$, where $\theta$ and $\phi$ are the
azimuth and elevation angles, respectively; and $*$ denotes convolution. This approach assumes that the acoustical system
consisting of these sources and the listener is linear and time
invariant and that the resulting left and right ear signals
provide the necessary spatial hearing cues pertaining to the
acoustic field that would be generated by these P sources.
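The summations in (1) and (2) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the per-direction HRIRs have already been selected for each source; the function name `synthesize_binaural` and the two-tap "HRIRs" in the usage lines are toy placeholders, not measured data.

```python
import numpy as np

def synthesize_binaural(sources, hrirs_left, hrirs_right):
    """Sum the per-source convolutions for each ear, as in (1) and (2).

    sources:     list of P dry source signals x_p(n)
    hrirs_left:  list of P left-ear HRIRs, one per source direction
    hrirs_right: list of P right-ear HRIRs, one per source direction
    """
    # Output must hold the longest convolution result of any source.
    n_out = max(
        len(x) + max(len(hl), len(hr)) - 1
        for x, hl, hr in zip(sources, hrirs_left, hrirs_right)
    )
    x_left = np.zeros(n_out)
    x_right = np.zeros(n_out)
    for x, hl, hr in zip(sources, hrirs_left, hrirs_right):
        yl = np.convolve(x, hl)  # x_p(n) * h_L(n)
        yr = np.convolve(x, hr)  # x_p(n) * h_R(n)
        x_left[:len(yl)] += yl
        x_right[:len(yr)] += yr
    return x_left, x_right

# Toy example with P = 2 sources; the "HRIRs" are simple gains and delays.
sources = [np.array([1.0, 2.0, 3.0]), np.array([1.0, 1.0, 1.0])]
hrirs_left = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
hrirs_right = [np.array([0.0, 0.5]), np.array([1.0, 0.0])]
x_left, x_right = synthesize_binaural(sources, hrirs_left, hrirs_right)
```

Because the system is assumed linear and time invariant, the per-source contributions simply add at each ear; the sketch relies on exactly that assumption.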
In free field, HRIRs can be considered finite and are typically up to 12-ms long, corresponding to approximately 512
samples at the 44.1-kHz sampling rate. This does not present
a significant computational cost for a single component. However, as the number of components increases, such as when a
source and its reflections in a room are being rendered, the
computational cost of convolution becomes an important
bottleneck. To overcome this limitation, different filter design
approaches have been proposed, e.g., [17]. These filters are
designed to capture salient binaural cues while significantly
reducing the computational cost.
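The scaling of the cost can be made concrete with a back-of-the-envelope count of multiply-accumulate operations (MACs) for direct time-domain convolution. The sketch assumes one FIR filter per ear per component and no optimization such as FFT-based convolution; the figure of 50 components (a source plus 49 reflections) is a hypothetical example, not a number from the article.

```python
def hrir_length_samples(duration_s, fs):
    """Number of samples spanned by an HRIR of the given duration."""
    return round(duration_s * fs)

def direct_convolution_macs_per_second(n_taps, fs, n_components, n_ears=2):
    """MACs per second for direct FIR filtering of every component at both ears."""
    return n_taps * fs * n_components * n_ears

fs = 44100
n_taps = 512  # approximate HRIR length used in the text

length_12ms = hrir_length_samples(0.012, fs)  # 12 ms at 44.1 kHz
single = direct_convolution_macs_per_second(n_taps, fs, n_components=1)
# Hypothetical room rendering: one source plus 49 reflections.
many = direct_convolution_macs_per_second(n_taps, fs, n_components=50)
```

Under these assumptions a single component needs about 45 million MACs per second, while 50 components need about 2.3 billion, which is the linear growth that motivates cheaper filter designs.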
Two essential requirements of binaural synthesis are 1) the
availability of a set of HRIR measurements densely sampled on a
spherical shell and 2) the match between these HRIRs and the actual
HRIRs of the listener. Regarding the first requirement, interpolation methods such as kernel regression [18] can be used to increase
the granularity of the available directions. The second requirement
necessitates the measurement of individualized HRIRs, which is
both time consuming and costly. For that reason, many existing
research-grade and commercial solutions use generic HRIRs. This,
however, is not an ideal solution, since there are significant differences
between the spectra of the generic HRIRs and the individual HRIRs of
the listener, and these spectral cues are essential for elevation perception [13].
Practical setups that allow quick measurement of HRIRs around a
geodesic sphere surrounding the listener's head have recently been
developed [19]. There also exist commercial products that allow
tailoring a stored set of HRIRs based on head size (https://www.ossic.com).
However, head size alone can improve only the ITD and
ILD cues provided by the system, not the spectral cues used in the
perception of source elevation.
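To illustrate why head size bears directly on ITD, the sketch below uses the classical spherical-head (Woodworth) approximation, ITD = (a / c)(θ + sin θ), for a far-field source at azimuth θ. This model is an assumption brought in for illustration and is not described in the article; the function name and the head radii are hypothetical.

```python
import math

def woodworth_itd(azimuth_rad, head_radius_m, c=343.0):
    """ITD in seconds for a rigid spherical head and a far-field source.

    Assumed model (Woodworth): valid for 0 <= azimuth <= pi/2, measured
    from the median plane; c is the speed of sound in m/s.
    """
    if not 0.0 <= azimuth_rad <= math.pi / 2:
        raise ValueError("azimuth must lie in [0, pi/2] for this approximation")
    return head_radius_m / c * (azimuth_rad + math.sin(azimuth_rad))

# Hypothetical head radii: the maximum ITD grows in proportion to the radius.
itd_small = woodworth_itd(math.pi / 2, 0.080)
itd_average = woodworth_itd(math.pi / 2, 0.0875)
itd_large = woodworth_itd(math.pi / 2, 0.095)
```

The model contains no pinna or torso, so scaling its radius changes only timing (and, in fuller sphere models, level) differences. That is consistent with the point above that head size alone cannot supply the spectral cues needed for elevation.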
Binaural synthesis also allows interactivity if the position
and orientation of the listener's head can be tracked [20]. High-precision and high-accuracy magnetic trackers had been the
de facto method for tracking a listener's head. Recent developments have made it possible to track a user's head with inexpensive
devices (http://www.3dsoundlabs.com). These developments
make binaural synthesis an excellent solution for VR applications. For binaural synthesis, a side effect of system errors,
such as a pair of improperly equalized headphones, an HRIR
set that does not closely match the HRIRs of the user, or inaccurate head tracking, is inside-the-head localization [21]. This
IEEE Signal Processing Magazine | May 2017 | 39