activity when the CA module feeds a synchronized intermodality cue. The SA module does not show any performance gain on unsynchronized inputs because inconsistencies between the audio and visual streams make it difficult for the module to focus on the important features of the sequence. However, in terms of recall, the SA module improves the model's ability to distinguish when the robot is addressed: because humans tend to speak to the robot in a loud voice, the SA module can focus on the features that indicate the robot is being addressed.
◆ Without SA: Without the SA module, the CA module still helped the model improve prediction performance in terms of precision, recall, TNR, Fm, and Bal.Accu. The CA module helps the framework synchronize intermodality cues, which leads to the performance gain. The experiment also reveals that the synchronized intermodality features produced by the CA module contribute more to model performance than the SA module does.
◆ With versus without CA and SA: Without the two attention modules, ADNet attains a 65% Bal.Accu; with both modules, it attains a 71% Bal.Accu. The experiment indicates that the effectiveness of CA in capturing intermodality cues and of SA in finding long-term cues gives the model a significant performance gain. This supports our hypothesis that synchronized audiovisual features generated by a CA mechanism that captures intermodality cues, together with an SA module that leverages these features to find the important features and long-term speaking evidence, improve model prediction performance (a minimal sketch of this arrangement follows below).
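Although the article does not give ADNet's layer-level details, the following minimal PyTorch sketch illustrates the general CA-then-SA arrangement described above: cross-attention aligns per-frame audio and visual features into a synchronized audiovisual sequence, and self-attention then looks across that sequence for long-term speaking evidence. Every name and dimension here (CrossThenSelfAttention, d_model = 128, four heads, mean pooling) is an assumption made for illustration, not the authors' implementation.

# Illustrative sketch only -- not ADNet's actual code.
import torch
import torch.nn as nn

class CrossThenSelfAttention(nn.Module):
    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        # CA: visual frames query the audio sequence to build a
        # synchronized intermodality feature.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # SA: attends over the synchronized sequence for long-term cues.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, 2)  # StoR versus StoS

    def forward(self, visual: torch.Tensor, audio: torch.Tensor) -> torch.Tensor:
        # visual, audio: (batch, time, d_model) frame-level embeddings.
        synced, _ = self.cross_attn(query=visual, key=audio, value=audio)
        long_term, _ = self.self_attn(synced, synced, synced)
        # Pool over time and classify the addressee.
        return self.classifier(long_term.mean(dim=1))

if __name__ == "__main__":
    model = CrossThenSelfAttention()
    vis = torch.randn(2, 25, 128)  # e.g., 25 video frames per clip
    aud = torch.randn(2, 25, 128)  # audio features resampled to 25 steps
    print(model(vis, aud).shape)   # torch.Size([2, 2])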
Comparison With Previous Works
As shown in Table 1, we compared our proposed model with previous works from different perspectives, such as the employed modalities, environment settings, proposed approach, dataset, and performance.
The deep learning approach proposed by Minh et al. [2] has two main limitations compared to our work. First, the dataset proposed by Minh et al. [2] was artificially generated from a static image.
Table 5. BLF versus concatenation.

Evaluation Metrics    ADNet              ADNet_concat
                      StoR      StoS     StoR      StoS
Precision             86.36     50.94    84.58     60.46
Recall/TPR            79.56     62.82    88.45     52.27
TNR                   62.82     79.56    52.27     88.45
Fm                    82.82     56.26    86.47     56.07
Bal.Accu              71                 70

StoR: speaking to robot; StoS: speaking to another subject.
However, we cannot tell whether a person is addressing another person simply by extracting features from a single image. Second, audio data, one of the most crucial input modalities for predicting the addressee, were not used. In short, their approach is not suitable for real-world applications. Ours overcomes these two challenges by proposing a multimodal dataset recorded in realistic scenarios and a framework that leverages crucial audio and visual communication cues.
Regarding the rule-based [11], [12], [13] and statistics-based [12] approaches, their reported accuracies indicate that they perform well on a given dataset or in specific settings, but they fail to distinguish the addressee in new settings because of poor generalization. In contrast, the proposed approach utilizes deep learning to extract features and is tested on data unseen during training. Furthermore, all of these works were conducted in human-to-human settings, except [30], and did not utilize multimodal inputs.
The work of Baba et al. [30], proposed in a mixed setting, used a very small dataset with 20 features (six
Table 6. Ablation experiments of the CA and SA modules on the E-MuMMER test set.

Evaluation Metrics    Without Both       Wo_SA              Wo_CA
                      StoR      StoS     StoR      StoS     StoR      StoS
Precision             83.72     39.69    83.33     53.5     81.49     58.87
Recall/TPR            69.05     60.28    85.5      49.4     90.8      38.99
TNR                   60.28     69.05    49.4      85.5     38.99     90.8
Fm                    75.68     47.87    84.4      51.37    85.89     46.9
Bal.Accu              65                 67                 65

Wo_SA: without the SA module; Wo_CA: without the CA module; StoR: speaking to robot; StoS: speaking to another subject.
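As a reference for reading Tables 5 and 6, the sketch below shows how the reported metrics relate to a binary confusion matrix when StoR is treated as the positive class. It is a generic illustration in plain Python, not the authors' evaluation code; the example counts are made up, and only the closing comment reuses values from Table 5.

# Generic metric definitions, with "speaking to robot" (StoR) as the
# positive class. Standalone sketch, not the authors' evaluation script.
def addressee_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    precision = tp / (tp + fp)  # of clips predicted StoR, how many were StoR
    recall = tp / (tp + fn)     # TPR: StoR clips correctly detected
    tnr = tn / (tn + fp)        # StoS clips correctly rejected
    fm = 2 * precision * recall / (precision + recall)  # F-measure (Fm)
    bal_accu = (recall + tnr) / 2                        # balanced accuracy
    return {"Precision": precision, "Recall/TPR": recall,
            "TNR": tnr, "Fm": fm, "Bal.Accu": bal_accu}

# Example with made-up counts (not from the paper). Note that Bal.Accu is
# the mean of TPR and TNR, which is why each model gets a single value in
# Table 5: for ADNet, (79.56 + 62.82) / 2 = 71.19, reported as 71.
print(addressee_metrics(tp=80, fp=15, tn=60, fn=20))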