IEEE Computational Intelligence Magazine - November 2021 - 61
$$Q'(s,a) = \frac{Q(s,a)}{\sum_{a=1}^{|A|} Q(s,a)} \tag{13}$$
Intuitively, if the Q-values for all actions in s are similar, it
does not matter which one is chosen, and s is unimportant.
However, if certain actions in s have significantly higher Q-values
than others, s becomes more important as it is necessary to
choose the most appropriate action in s to perform. Typically,
the larger the Imp(s) value, the less important and more uncertain
the current state. To decide whether a state is important or
uncertain, the threshold T is introduced to control the degree
of the state importance. As $\mathrm{Imp}(s) \in [0, 1]$, $T \in [0, 1]$.
For simplicity, $T = 0.5$ is set in the present study.
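The normalisation in Equation (13) and the threshold test can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation. It assumes that Imp(s) takes the form that appears on the left-hand side of Equations (14) and (15), and that Q-values are non-negative with a positive sum. The function names `normalize_q` and `importance` are hypothetical.

```python
from typing import List, Sequence


def normalize_q(q_values: Sequence[float]) -> List[float]:
    """Equation (13): divide each Q-value by the sum over all actions.

    Assumption of this sketch: Q-values are non-negative with a
    positive sum, so the result behaves like a distribution over actions.
    """
    total = sum(q_values)
    if total <= 0 or any(q < 0 for q in q_values):
        raise ValueError("sketch assumes non-negative Q-values with a positive sum")
    return [q / total for q in q_values]


def importance(q_values: Sequence[float]) -> float:
    """Assumed form of Imp(s): sum over a of Q'(s,a) * (1 - Q'(s,a))."""
    return sum(p * (1.0 - p) for p in normalize_q(q_values))


T = 0.5  # threshold value used in the study

flat = importance([1.0, 1.0, 1.0, 1.0])     # all actions look alike
peaked = importance([17.0, 1.0, 1.0, 1.0])  # one action clearly dominates

print("flat state uncertain:", flat > T)
print("peaked state important:", peaked < T)
```

With four equally valued actions the score is 0.75, which exceeds T = 0.5, so the state counts as uncertain. When one action dominates, the score drops to about 0.27, so the state counts as important.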
In theory, there are three ways in which the imitation process
occurs, namely forward (student-initiated), backward
(teacher-initiated), and bidirectional (jointly-initiated) imitation.
The following three subsections describe each of these
imitation strategies in turn.
1) Forward Imitation
Considering that agents learn from scratch and have insufficient
awareness of the state importance, especially at the early learning
stage, it usually takes a large number of learning trials before a
student agent can initiate imitation based on the importance of
the state. Therefore, a new approach is developed in which a
student agent decides whether to initiate an imitation process
based on its estimate of the uncertainty of a given state.
Specifically, the given state s is considered to be uncertain
when Imp(s) is larger than the given threshold T. In this case,
there is little difference in return for the actions the agent takes
in this state. The student initiates the imitation process when
the following condition is satisfied.
$$\sum_{a=1}^{|A|} Q'_{\mathrm{student}}(s,a) \times \bigl(1 - Q'_{\mathrm{student}}(s,a)\bigr) > T \tag{14}$$
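As a rough illustration of Equation (14), consider a hypothetical student with four actions whose normalised Q-values are all equal to 0.25. These numbers are chosen for illustration and are not taken from the article. With T = 0.5, the condition evaluates as:

```latex
\sum_{a=1}^{4} 0.25 \times (1 - 0.25) = 4 \times 0.1875 = 0.75 > T = 0.5
```

The condition holds, so under these assumed values the student would treat the state as uncertain and initiate imitation.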
2) Backward Imitation
Unlike forward imitation, in which a student agent initiates
the imitation based on the uncertainty of a given state,
backward imitation relies on a teacher agent that has learned
effective strategies for accomplishing a task and can therefore
reasonably judge the importance of the state and provide
guidance for conducting the imitation process.
When a teacher agent considers a state to be important, the estimated
reward for performing certain actions from the action space
is markedly higher than the rewards for other actions. Therefore,
the teacher initiates imitation so that the student can
learn by imitating more appropriate actions. This reduces the likelihood
that the student agent will take the wrong action in an
important state, which would lead to significant reductions in future
rewards. The teacher, being aware of the state importance,
initiates imitation only when the following condition is satisfied.
$$\sum_{a=1}^{|A|} Q'_{\mathrm{teacher}}(s,a) \times \bigl(1 - Q'_{\mathrm{teacher}}(s,a)\bigr) < T \tag{15}$$
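As a rough illustration of Equation (15), consider a hypothetical teacher whose normalised Q-values over four actions are (0.85, 0.05, 0.05, 0.05). These numbers are chosen for illustration and are not taken from the article. With T = 0.5, the condition evaluates as:

```latex
0.85 \times 0.15 + 3 \times (0.05 \times 0.95) = 0.1275 + 0.1425 = 0.27 < T = 0.5
```

The condition holds, so under these assumed values the teacher would regard the state as important and initiate imitation.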
3) Bidirectional Imitation
Intuitively, an imitation interaction involves both student and
teacher agents. Hence, this process can be initiated not only by
the student agent but also by the teacher agent. Note that forward
imitation is likely to suffer from the inaccurate Q-value
estimation of the student agent, especially at the early learning
stage when students tend to explore the environment with
high randomness. In backward imitation, the teacher agent is
expected to constantly monitor students' states, which
incurs a large communication cost and is impractical
in reality.
A bidirectional imitation method is introduced where the
teacher and student agents can jointly decide whether to
initiate an imitation interaction (see Algorithm 3 for a general
illustration). First, the student agent decides whether to
make an imitation request based on the uncertainty of the
given state (see line 6 in Algorithm 3). Upon receiving the
request, the teacher agent then estimates the importance
of the state and decides whether to make a response and
guide the learning process (see line 10 in Algorithm 3).
By doing this, the teacher agent is not required to pay
constant attention to the students' states, and the impact
of the detrimental or unnecessary imitation incurred by the
noisy Q-estimation is reduced significantly. Specifically, bidirectional
imitation occurs only when a given state is considered
to be uncertain by the student agent and important by
the teacher agent, as defined below.
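$$\text{imitation} = \begin{cases} \text{yes}, & \mathrm{Imp}_{\mathrm{student}}(s) > T \wedge \mathrm{Imp}_{\mathrm{teacher}}(s) < T \\ \text{no}, & \text{otherwise} \end{cases} \tag{16}$$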
Algorithm 3 Bidirectional Imitation.
 1  Initialization: N agents; threshold T.
 2  while stop conditions are not satisfied do
 3      for $\forall$ agent as a student do
 4          Given the current state s.
 5          Calculate $\mathrm{Imp}_{\mathrm{student}}(s)$    ▷ Equation 12.
 6          if $\mathrm{Imp}_{\mathrm{student}}(s) > T$ (uncertain) then
 7              Identify a teacher via meme selection.
 8              Pass state s to the teacher agent.
 9              Calculate $\mathrm{Imp}_{\mathrm{teacher}}(s)$    ▷ Equation 12.
10              if $\mathrm{Imp}_{\mathrm{teacher}}(s) < T$ (important) then
11                  Perform meme transmission.
12                  Pass socitype actions to the student.
13              else
14                  Perform meme internal evolution.
15              end
16          else
17              Perform meme internal evolution.
18          end
19      end
20  end
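The decision logic inside the loop of Algorithm 3 can be sketched in Python for a single student-teacher pair. This is a simplified sketch under several assumptions, not the authors' implementation. Imp(s) is assumed to be the sum of Q'(s,a) × (1 − Q'(s,a)) over normalised Q-values, and Q-values are assumed to be non-negative with a positive sum. Meme selection is skipped because the teacher's Q-values are passed in directly. The teacher's greedy action stands in for the socitype actions, and meme internal evolution is reduced to the student's own greedy choice. All function names are hypothetical.

```python
from typing import Sequence, Tuple


def importance(q_values: Sequence[float]) -> float:
    """Assumed Imp(s): sum of Q'(s,a) * (1 - Q'(s,a)) over normalised Q-values.

    Assumes non-negative Q-values with a positive sum.
    """
    total = sum(q_values)
    return sum((q / total) * (1.0 - q / total) for q in q_values)


def greedy(q_values: Sequence[float]) -> int:
    """Index of the highest-valued action (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def bidirectional_step(
    student_q: Sequence[float],
    teacher_q: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[str, int]:
    """One pass through lines 4-18 of Algorithm 3 for a single student."""
    if importance(student_q) > threshold:      # line 6: student is uncertain
        # Lines 7-9: the teacher is consulted only after the student's request.
        if importance(teacher_q) < threshold:  # line 10: teacher deems s important
            # Lines 11-12: stand-in for passing socitype actions (assumption).
            return "transmission", greedy(teacher_q)
    # Lines 14 and 17: no imitation; the student acts on its own estimate.
    return "internal evolution", greedy(student_q)


# Uncertain student, confident teacher: imitation takes place.
print(bidirectional_step([1, 1, 1, 1], [1, 17, 1, 1]))
# Student already prefers one action: the teacher is never consulted.
print(bidirectional_step([1, 1, 9, 1], [1, 17, 1, 1]))
```

The nested conditions mirror Equation (16): transmission occurs only when the student's score is above the threshold and the teacher's score is below it. Every other case falls through to internal evolution.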