IEEE Computational Intelligence Magazine - August 2018 - 53

pendulum within the assigned number
of trials. The structures of the goal, critic,
and action networks established for this
case follow Fig. 2, with the number of
neurons in each layer being 9-14-1 (i.e., the
neural network has 9 input nodes, 14
hidden nodes, and 1 output node),
10-16-1, and 8-14-1, respectively. Since
the optimal equilibrium of all states is
near zero, we define the mathematical
representation of the ultimate goal, in
this example, as U_c = 0.
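As a rough illustration of the three network sizes quoted above, the following sketch builds 8-14-1, 9-14-1, and 10-16-1 feed-forward networks and chains them for one time step. Only the layer sizes come from the text. The input assignment (action network sees the 8 states; goal network sees the states plus the action; critic network sees the states, the action, and the internal reward) is an assumption inferred from those sizes, and the tanh activations, the absence of bias terms, the random weights, and the names `make_mlp`, `forward`, and `agent_step` are hypothetical choices made for the example. No training is shown.

```python
import math
import random


def make_mlp(n_in, n_hidden, seed):
    """Return (w1, w2) for an n_in-n_hidden-1 network with random weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return w1, w2


def forward(net, inputs):
    """Forward pass with tanh hidden and output units (assumed activations)."""
    w1, w2 = net
    if len(inputs) != len(w1[0]):
        raise ValueError(f"expected {len(w1[0])} inputs, got {len(inputs)}")
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    return math.tanh(sum(w * h for w, h in zip(w2, hidden)))


N_STATES = 8  # cart position, three link angles, and the four matching velocities

action_net = make_mlp(8, 14, seed=0)    # 8-14-1: x(t) -> a(t)
goal_net = make_mlp(9, 14, seed=1)      # 9-14-1: x(t), a(t) -> s(t)  (assumed inputs)
critic_net = make_mlp(10, 16, seed=2)   # 10-16-1: x(t), a(t), s(t) -> J(t)  (assumed inputs)


def agent_step(x):
    """Chain the three networks for one time step; no external reward is used."""
    a = forward(action_net, x)
    s = forward(goal_net, x + [a])
    j = forward(critic_net, x + [a, s])
    return a, s, j


x0 = [0.01] * N_STATES  # hypothetical state close to the balance point
a, s, j = agent_step(x0)
```

The point of the sketch is the data flow: the only quantities entering the agent are the state and its own action, and the internal reward is produced inside the agent rather than supplied by the environment.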
Our simulation demonstrated that
92% of the runs resulted in a successful
balance. The average number of trials
to success was 1071.6. Table 1 shows
the comparative results of our method
with respect to those reported in [18]
and [23].
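The two summary statistics above can be computed from per-run records as in the sketch below. It assumes that the average number of trials is taken over successful runs only, which the text suggests ("trials to success") but does not state outright. The function name `summarize_runs` and the run data are hypothetical and are not the results of the experiment.

```python
def summarize_runs(trials_to_success):
    """Return (success_rate, average_trials_to_success).

    Each entry is the trial at which a run first balanced the pendulum,
    or None if the run failed within the assigned number of trials.
    """
    if not trials_to_success:
        raise ValueError("at least one run is required")
    successes = [t for t in trials_to_success if t is not None]
    success_rate = len(successes) / len(trials_to_success)
    # Average over successful runs only (an assumption, see the lead-in).
    avg_trials = sum(successes) / len(successes) if successes else float("nan")
    return success_rate, avg_trials


# Hypothetical outcomes of five runs; one run failed.
runs = [850, 1200, None, 990, 1310]
rate, avg = summarize_runs(runs)
print(f"success rate: {rate:.0%}, average trials to success: {avg:.1f}")
```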
Table 1 shows that the performance of the proposed self-learning
ADP method is not as good as the
performance of the existing ADP
methods with a pre-defined reward signal. This is because the agent in our
approach needs to learn what the
reward signal is. However, the key observation from this research is that the
learning process of the proposed method can be accomplished without an
explicit external reward directly from
the environment. Instead, this reward
is automatically and adaptively
learned and developed by the goal network according to the ultimate goal.
This is the key contribution of this article.
To further examine the performance
of the self-learning ADP method, a typical trajectory of each state variable for the task is shown in Fig. 5,
including (a) the position of the cart on
the track; (b)-(d) the vertical angles of
the 1st, 2nd, and 3rd links of the pendulum, respectively; (e) the velocity of the
cart; and (f)-(h) the angular velocities of
the 1st, 2nd, and 3rd links of the pendulum, respectively. We observe that the
cart position on the track and all the
joint angles of the links stay
within a small range of the balance
point. This indicates that the proposed
method can effectively control the system to achieve the desired performance
and that the controller can estimate the
effect of an action during the learning
process automatically.
V. Summary and Conclusion

We have designed a self-learning method that requires no explicit external reward
signal given directly by the environment. Compared with the traditional
RL/ADP methods in the literature, the
key contribution of our approach is
a new goal network that
automatically and adaptively develops an
internal reward signal based on the ultimate goal to facilitate the self-learning
process. Therefore, instead of receiving
an explicit reward signal directly from
the external environment, the agent
itself can learn an internal reward signal
s(t) through the goal network according to
the ultimate goal U_c and guide itself to
accomplish the task. This also means that, in
our approach, only two interaction elements, the state x(t) and the action a(t), are
required at each time step during the
learning process. From simulations, we
observe that the success rate of the
designed self-learning method is lower
than that of the traditional ADP methods. This is because the agent in the
proposed method needs to learn how to

Figure 4 Triple-link inverted pendulum case and its interaction with the agent. (Diagram labels: Environment, Cart, θ1, θ2, θ3, State x(t), Agent, Self-Learning, Internal Reward Signal s(t), Force a(t).)

Table 1 Comparison with existing ADP learning algorithms.

                         Self-Learning ADP   Traditional ADP [18]   Goal Representation ADP [23]
Success rate             92%                 97%                    99%
Number of trials         1071.6              1194                   571.4
Need external rewards    No                  Yes                    Yes
