IEEE Robotics & Automation Magazine - September 2023 - 49

such cannot predict information that is not
already present in the input data set. The
remaining option is then RL, which is based
on a control loop (as shown in Figure 3),
where an actor predicts an action from the
current state, and this action is then given to
an environment, which updates the state
based on the action [15]. The action is determined
through a guided search-based method
[15], which determines the action that
maximizes a reward over time. In our case,
RL is an appealing solution as it naturally fits
into classical control-loop systems, needing
only an environment to experiment with different
actions, and a reward signal to determine
the optimal action for a given state [16].
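The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: `ToyEnvironment`, `proportional_actor`, and `run_control_loop` are hypothetical names, and the scalar dynamics and reward are invented for the example.

```python
class ToyEnvironment:
    """Hypothetical 1-D environment: the state is a scalar error to drive to zero."""

    def __init__(self, initial_state=1.0):
        self.state = initial_state

    def step(self, action):
        # The environment updates the state based on the action...
        self.state = self.state + action
        # ...and emits a reward that is higher when the error is smaller.
        reward = -abs(self.state)
        return self.state, reward


def proportional_actor(state, gain=0.5):
    """Hypothetical actor: predicts an action from the current state."""
    return -gain * state


def run_control_loop(actor, env, steps=20):
    """Run the actor/environment loop and return the accumulated reward."""
    state = env.state
    total_reward = 0.0
    for _ in range(steps):
        action = actor(state)             # actor predicts an action from the state
        state, reward = env.step(action)  # environment updates the state
        total_reward += reward            # reward accumulated over time
    return total_reward
```

An RL method would adjust the actor (here, the `gain` argument) so that the accumulated reward returned by `run_control_loop` is as large as possible.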
At the time of this writing, the most studied class of RL schemes is model-free temporal
difference RL, which updates its actor at
each iteration of the control loop and does not
require an existing system model to converge. However, these
procedures cannot be used in this article, due to
■ interstate differences being negligible at high frequencies
(i.e., control-loop frequencies higher than 1 Hz in this case
study), causing observation issues with the Markov decision
process on which temporal difference RL is based.
■ action delays induced by imperfect controllers and inertia,
resulting in dissociations among states, rewards, and
actions (i.e., an action taken at time T might affect the system
at time T + n).
■ noise cancellation during action exploration at high frequencies,
causing the exploration to be nullified, preventing
the methods from converging.
This means that popular techniques such as the Deep
Deterministic Policy Gradient, Advantage Actor Critic, and
Proximal Policy Optimization cannot be used. Therefore, a
different class of RL approaches is applied in this article.
As such, the selected class of RL methods is model-free,
policy iteration RL [15], where the actor is updated only when
a task is completed or after a set number of iterations. These
schemes require an optimization algorithm to be applied to the
actor to converge toward an optimal solution. A few papers have
shown that these processes have similar performance when
compared to temporal difference RL methods [12]. More specifically,
this case study uses a neural network [11] as
the actor to predict gains because it has continuous inputs and
outputs, is a universal function approximator that can generalize
well to a given task, and can be computed quickly. The
CMA-ES approach was the optimizer chosen [17] because it
can handle high-dimensional search spaces (i.e., more
than 10 dimensions) as well as nonlinear, nonconvex, and highly multimodal search
spaces, such as the ones encountered using neural networks.
Furthermore, the task itself is highly nonlinear and complex
due to dynamics at the wheel-ground interface. These characteristics
make neural networks with a CMA-ES optimizer an
ideal combination for determining a real-time, neural-based,
gain-estimation technique. Unfortunately, due to
the nature of model-free, policy iteration RL
methods, they are suboptimal in convergence
time when compared with model-free, temporal
difference RL. This means that the
actor must be trained in simulation, as the time
to convergence of the neural network might be
on the order of years of real-time experiments,
even when using multiple robots in parallel.
As such, the procedure has two phases: The
first is a training phase in simulation using
CMA-ES to determine an optimal neural network.
The second phase is where the trained
neural network, by itself, is defined as the gain
estimator model and is then used in real time
to predict optimal gain from the input state.
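The second phase can be pictured as a plain forward pass through a network whose parameters are frozen after training. The sketch below is illustrative only: the single hidden layer, the `tanh` and softplus activations, the flat parameter layout, and the names `count_parameters` and `predict_gains` are assumptions made for the example, not details taken from the article.

```python
import math


def count_parameters(n_in, hidden, n_out):
    """Number of weights and biases in the assumed one-hidden-layer network."""
    return hidden * n_in + hidden + n_out * hidden + n_out


def predict_gains(params, state, hidden, n_out):
    """Evaluate the network from a flat parameter vector.

    Assumed layout: hidden-layer weights, hidden-layer biases,
    output-layer weights, output-layer biases.
    """
    n_in = len(state)
    if len(params) != count_parameters(n_in, hidden, n_out):
        raise ValueError("parameter vector does not match the network shape")

    idx = 0
    hidden_act = []
    for _ in range(hidden):
        weights = params[idx:idx + n_in]
        idx += n_in
        hidden_act.append(sum(w * s for w, s in zip(weights, state)))
    for h in range(hidden):
        hidden_act[h] = math.tanh(hidden_act[h] + params[idx])
        idx += 1

    outputs = []
    for _ in range(n_out):
        weights = params[idx:idx + hidden]
        idx += hidden
        outputs.append(sum(w * a for w, a in zip(weights, hidden_act)))
    for o in range(n_out):
        z = outputs[o] + params[idx]
        idx += 1
        # Numerically stable softplus keeps each gain positive (an assumption).
        outputs[o] = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return outputs
```

Keeping all weights and biases in one flat vector is convenient here because a black-box optimizer only needs to manipulate that vector, while the control loop only needs to call `predict_gains` at each iteration.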
TRAINING OF THE NEURAL NETWORK
The CMA-ES method, guided by an objective (obj) function,
optimizes the parameters of the neural network
(i.e., weights and biases) to determine a neural network
actor that can predict the optimal gains from a given input
state. The CMA-ES optimizer trains the actor by
testing multiple candidate neural networks in multiple
parallel environments to determine a gradient vector, in
the neural network parameter space, that minimizes the
obj function [17]. Using this vector, the CMA-ES method
moves the parameters of the neural network toward a local
optimum that minimizes the obj function, similar to gradient
descent techniques [18], so that the actor can determine
optimal gains from the given input state.
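The sample, evaluate, and move cycle described above can be illustrated with a deliberately simplified evolution strategy. Note that this is not CMA-ES: it samples from an isotropic Gaussian and omits the covariance matrix adaptation and step-size control that define CMA-ES. The function name, the default settings, and the naive step-size decay are all assumptions for the example, and candidates are evaluated sequentially here rather than in parallel environments.

```python
import random


def simple_evolution_strategy(objective, initial_params, sigma=0.3,
                              population=16, elite=4, generations=60, seed=0):
    """Minimize `objective` with a basic (mu, lambda)-style evolution strategy."""
    rng = random.Random(seed)
    mean = list(initial_params)
    best_params, best_value = list(mean), objective(mean)

    for _ in range(generations):
        # Sample candidate parameter vectors around the current mean.
        scored = []
        for _ in range(population):
            candidate = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            scored.append((objective(candidate), candidate))

        # Rank candidates by objective value (lower is better).
        scored.sort(key=lambda pair: pair[0])
        elites = [candidate for _, candidate in scored[:elite]]

        # Move the mean toward the best candidates.
        mean = [sum(values) / len(elites) for values in zip(*elites)]

        # Remember the best candidate seen so far.
        if scored[0][0] < best_value:
            best_value, best_params = scored[0][0], list(scored[0][1])

        sigma *= 0.97  # naive step-size decay, an assumption of this sketch

    return best_params, best_value
```

In a training setup like the one described, `objective` would run one simulated episode with the candidate parameters loaded into the actor and return the resulting obj value.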
The following obj function is used to train the neural network,
aiming at minimizing the lateral error and steering
angle along the trajectory to be followed:
\[
\mathrm{obj} = \frac{1}{T} \sum_{n=1}^{N} \left[\, |y(t)| + 0.5\, L\, |\delta_F(t)| \,\right] dt_n \quad [\mathrm{m}], \tag{10}
\]
where $T$ is the total time, $y$ is the lateral error, $\delta_F$ is the front
steering angle, $n$ is the current number of iterations, and $N$ is
the total number of iterations. This obj function is similar to
the one used in previous works [5]; however, it does not
include the angular deviation factor, as it prevented the CMA-ES
optimizer from reaching a better solution. Furthermore,
the removal of this angular deviation from the obj
function did not increase oscillations, contrary to what was previously expected.
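As a rough illustration, an objective of this kind can be computed from logged samples of one episode. The sketch below assumes the obj function is a time-weighted average of the absolute lateral error plus the absolute front steering angle weighted by 0.5 and scaled by a length; the function name `tracking_objective`, the `length_scale` argument, and its default of 1.0 are assumptions for the example rather than values from the article.

```python
def tracking_objective(lateral_errors, steering_angles, time_steps,
                       steering_weight=0.5, length_scale=1.0):
    """Time-weighted average of |y| + steering_weight * length_scale * |delta_F|.

    lateral_errors:  lateral error y at each iteration
    steering_angles: front steering angle delta_F at each iteration
    time_steps:      duration dt_n of each iteration
    """
    if not (len(lateral_errors) == len(steering_angles) == len(time_steps)):
        raise ValueError("all input sequences must have the same length")

    total_time = sum(time_steps)  # T, the total time of the episode
    if total_time <= 0.0:
        raise ValueError("total time must be positive")

    accumulated = 0.0
    for y, delta, dt in zip(lateral_errors, steering_angles, time_steps):
        accumulated += (abs(y) + steering_weight * length_scale * abs(delta)) * dt
    return accumulated / total_time
```

Dividing by the total time makes episodes of different durations comparable, which matters when candidate actors are scored against each other.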
FIGURE 3. The block diagram of a control loop using RL. (Blocks: Actor, Environment. Signals: action a_t, observation o_t, reward r_t.)
