Signal Processing - May 2017 - 43

Perceptually motivated multichannel
recording and reproduction
There has been recent work on developing systematic frameworks for the design of multichannel stereo systems, most notably vector-base amplitude panning (VBAP), directional audio coding (DirAC), and perceptual sound-field reconstruction (PSR).

Vector-base amplitude panning
It was shown as early as 1973 that tangent panning provides a stereophonic image that is more robust to head rotations than sine
panning for the standard stereophonic loudspeaker setup [30].
Pulkki showed that tangent panning can be expressed using an
equivalent, vector-based formulation in the horizontal plane and
also proposed a three-dimensional (3-D) extension to two-channel intensity panning that allows rendering elevated virtual sources over flexible loudspeaker rigs [41]. This method is VBAP.
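As a concrete illustration (the function below and the ±30° base angle of the standard stereo setup are this sketch's assumptions, not code from [30] or [41]), tangent panning for a two-channel setup can be written as:

```python
import math

def tangent_pan(source_deg, base_deg=30.0):
    """Stereo gains (gL, gR) from the tangent law
    tan(theta_s) / tan(theta_0) = (gL - gR) / (gL + gR),
    power-normalized so that gL^2 + gR^2 = 1."""
    t = math.tan(math.radians(source_deg)) / math.tan(math.radians(base_deg))
    gL, gR = (1.0 + t) / 2.0, (1.0 - t) / 2.0   # any solution, up to scale
    n = math.hypot(gL, gR)
    return gL / n, gR / n

# A centered source receives equal gains; a source at the loudspeaker
# direction (30 degrees) is fed to that loudspeaker only.
gL, gR = tangent_pan(0.0)
```
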
Originally, VBAP was designed for a loudspeaker array with elements placed on the vertices of a geodesic dome, situated in the acoustic far field of the listener. Figure 3 shows
a section of such a sphere with three loudspeakers, with a
listener positioned at the center of the array. The directions
of the three loudspeakers are indicated as v1, v2, and v3, and the corresponding gains as g1, g2, and g3. A virtual source in a direction vs between the loudspeakers can be generated by selecting the gains that satisfy vs = Vg, where V is a matrix whose columns are the directions of the loudspeakers and g = [g1 g2 g3]^T. In addition, the calculated loudspeaker gains
are normalized to keep the total power constant.
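As a numerical sketch of this gain computation (the loudspeaker directions below are illustrative assumptions, not a setup from the article):

```python
import numpy as np

def unit_vector(az_deg, el_deg):
    """Unit direction vector for azimuth/elevation in degrees."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

def vbap_gains(v_s, V):
    """Solve v_s = V g for the triplet gains and normalize them so that
    g1^2 + g2^2 + g3^2 = 1 (constant total power)."""
    g = np.linalg.solve(V, v_s)   # V: 3x3, loudspeaker directions as columns
    return g / np.linalg.norm(g)

# Loudspeakers at azimuth +/-30 deg (ear level) and at 45 deg elevation:
V = np.column_stack([unit_vector(-30, 0), unit_vector(30, 0), unit_vector(0, 45)])
g = vbap_gains(unit_vector(0, 20), V)   # virtual source inside the triplet
# All three gains come out non-negative and the total power equals 1.
```
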
On the full geodesic sphere, active regions are selected based on the closest three grid points, and only those loudspeakers are used to render the source. This is in contrast
with physically based approaches such as Ambisonics, where
even for a single source from a single direction, all loudspeakers are potentially active. A major assumption behind VBAP in
three dimensions is that summing localization would occur not
only with two sources but also with three. This assumption was tested subjectively for different setups and virtual source directions and was shown to yield good localization accuracy for elevated virtual sources [42], [43].
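The selection of the active triplet can be sketched by testing candidate triplets for all-non-negative gains; the four-loudspeaker grid and its triplets below are hypothetical (a real system would use the triangulation of the geodesic grid):

```python
import numpy as np

def unit_vector(az_deg, el_deg):
    """Unit direction vector for azimuth/elevation in degrees."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

def select_triplet(v_s, directions, triplets):
    """Return the first candidate triplet (index triple into the columns of
    `directions`) whose VBAP gains for v_s are all non-negative, i.e., whose
    spherical triangle contains v_s, together with the normalized gains."""
    for idx in triplets:
        V = directions[:, idx]              # 3x3 matrix for this triplet
        g = np.linalg.solve(V, v_s)
        if np.all(g >= -1e-9):              # tolerate tiny numerical negatives
            return idx, g / np.linalg.norm(g)
    raise ValueError("direction not covered by any candidate triplet")

# A toy grid of four loudspeakers and its candidate triplets (illustrative):
dirs = np.column_stack([unit_vector(-30, 0), unit_vector(30, 0),
                        unit_vector(0, 45), unit_vector(180, 45)])
triplets = [(0, 1, 2), (1, 3, 2), (0, 2, 3)]
idx, g = select_triplet(unit_vector(0, 20), dirs, triplets)  # frontal triplet
```
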
An issue resulting from the utilization of intensity panning in VBAP is the nonuniformity of the spatial spread of the
panned source. More specifically, sources panned closer to the
actual loudspeakers in the reproduction rig have a smaller spatial spread, while virtual sources panned to directions between
loudspeakers have a larger spatial spread. The main cause of this
issue is the use of a single loudspeaker when the virtual source
direction coincides with the direction of that loudspeaker.
This issue was addressed by panning the virtual source to
multiple directions by using three loudspeakers (instead of two)
for all source directions in the horizontal plane, or four loudspeakers (instead of three) in the 3-D case. This approach was
called multiple-direction amplitude panning (MDAP) [44]. In
a study comparing VBAP with MDAP, it was shown that both
provide good subjective localization accuracy, with MDAP
being more accurate than VBAP [45]. In another, more recent evaluation, carried out within the context of the MPEG-H standard, VBAP resulted in very good subjective localization accuracy, including not only the source azimuth but also its distance [46]. In yet another study, VBAP was shown to provide good localization performance also for sources in the median plane [47]. Note that VBAP is a technology for sound-field synthesis; in the context of sound-field recording and reproduction, it is used at the reproduction end of schemes such as DirAC.

FIGURE 3. An arrangement of three loudspeakers and a phantom image panned using VBAP. The vectors used in the formulation of VBAP are also shown.
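The spreading idea behind MDAP can be sketched in a two-channel horizontal case as follows; the spread angle, the choice of three spread directions, and the clamping to the loudspeaker base are all assumptions of this sketch, not the MDAP specification in [44]:

```python
import math

def mdap_gains(source_deg, spread_deg=10.0, base_deg=30.0):
    """Pan the source to several directions around the target (the target
    plus/minus a spread angle), sum the per-direction tangent-law gains,
    and re-normalize the total power. The extra panning directions keep
    more than the minimal number of loudspeakers active for every angle."""
    def tan_pan(deg):
        t = math.tan(math.radians(deg)) / math.tan(math.radians(base_deg))
        return (1.0 + t) / 2.0, (1.0 - t) / 2.0
    # Clamp spread directions to the loudspeaker base (an assumption here).
    dirs = [max(-base_deg, min(base_deg, source_deg + d))
            for d in (-spread_deg, 0.0, spread_deg)]
    gL = sum(tan_pan(d)[0] for d in dirs)
    gR = sum(tan_pan(d)[1] for d in dirs)
    n = math.hypot(gL, gR)
    return gL / n, gR / n

# Unlike plain panning, a source panned exactly to a loudspeaker direction
# still excites the other channel slightly, giving a more uniform spread.
```
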

Spatial encoding methods
A class of multichannel audio methods involves dividing
recorded signals into time or time-frequency bins and estimating certain spatial attributes within each bin. One of these methods is the spatial impulse response rendering (SIRR) method
[48], [49]. At the recording stage, SIRR records the impulse
response of a room using a B-format microphone, i.e., a microphone that provides the omnidirectional sound pressure component as well as the three axial pressure-gradient components of
the sound field [28]. The impulse response is first transformed
into a time-frequency representation and is then processed to
obtain estimates of the acoustic intensity vectors at each time-
frequency bin. It is assumed that each time-frequency bin corresponds to a single plane wave and thus that the direction of
the acoustic intensity vector also represents the direction of that
plane wave. A diffuseness estimate is obtained for each time-
frequency bin using the ratio of the real part of the acoustic
intensity to the total energy. These parameters, along with the
sound pressure component obtained from the B-format recording, form the basis of the reproduction stage.
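The analysis step for a single time-frequency bin can be sketched as follows, with all acoustic constants set to 1 and under the convention that [X, Y, Z] is proportional to the source direction times the pressure; sign and scaling conventions vary between B-format definitions, so both are assumptions of this sketch:

```python
import numpy as np

def sirr_analysis(W, X, Y, Z):
    """Direction and diffuseness estimates from the complex B-format STFT
    values of one time-frequency bin. The intensity-like vector
    Re{conj(W) [X, Y, Z]} points toward the source under the convention
    above; diffuseness is one minus the ratio of its magnitude to the
    total energy density."""
    I = np.real(np.conj(W) * np.array([X, Y, Z]))
    E = 0.5 * (abs(W)**2 + abs(X)**2 + abs(Y)**2 + abs(Z)**2)
    norm_I = np.linalg.norm(I)
    azimuth = np.degrees(np.arctan2(I[1], I[0]))
    elevation = np.degrees(np.arcsin(I[2] / max(norm_I, 1e-12)))
    diffuseness = 1.0 - norm_I / max(E, 1e-12)
    return azimuth, elevation, diffuseness

# A single plane wave from straight ahead: fully directional, so the
# diffuseness estimate is zero.
az, el, psi = sirr_analysis(1.0 + 0j, 1.0 + 0j, 0j, 0j)
```
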
At the reproduction stage, direct and diffuse parts of the
signal are treated differently. For the direct part, the azimuth and elevation estimates in each time-frequency bin are used to pan portions of the B-format omnidirectional component accordingly, using VBAP. The diffuse part is reproduced by
generating multiple decorrelated copies of the recorded sound
played back from all the loudspeakers. The resulting channel impulse responses are then convolved with the desired
anechoic sound sample. A similar method, called the spatial
decomposition method (SDM), was recently proposed in [50].
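The direct/diffuse split at reproduction can be sketched per bin as follows; the square-root energy split and the random-phase decorrelation are common choices used here as assumptions, not the specific processing of [48], [49]:

```python
import numpy as np

def sirr_reproduce_bin(W, psi, pan_gains, rng):
    """Split one omni-component bin W into a direct part, weighted by the
    VBAP-style panning gains, and a diffuse part sent to all loudspeakers
    through random-phase decorrelators. Total energy is preserved when the
    panning gains are power-normalized."""
    n = len(pan_gains)
    direct = np.sqrt(1.0 - psi) * W * np.asarray(pan_gains)
    phases = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))
    diffuse = np.sqrt(psi / n) * W * phases   # equal diffuse power per channel
    return direct + diffuse

# With zero diffuseness the bin is panned exactly by the gains:
out = sirr_reproduce_bin(1.0 + 0j, 0.0, [0.7, 0.7, 0.1],
                         np.random.default_rng(0))
```
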



