IEEE Robotics & Automation Magazine - June 2023 - 90
of tumbling motion. With too low a rate, simulated joint
control was inconsistent at low velocities. The rate, the number
of simulation steps per episode step, and the number of steps per
episode were chosen empirically to yield
fast training times, a policy that converges,
and consistent simulator behavior.
Measurements of the physical prototype
were translated into simulation through
URDF models, similar to the process
described in [9]. We segmented the structure
of the robot into three parts: the body and
two legs. Inertial characteristics were
calculated assuming uniform
density, using the measured mass and
dimensions. A high lateral friction coefficient
of 0.9 was chosen empirically
to minimize behaviors, such as sliding, that
would be unlikely to transfer well [10]. As
described in the "Hardware" section, these
values were used as the means for randomly
generated URDF models. This randomization
affects the dynamics of the tumbling
robot's movement.
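
The uniform-density inertia calculation and the randomization around measured means can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each segment is approximated as a solid box, that the randomization is Gaussian, and that a single relative spread `rel_std` applies to every quantity. The class `BoxLink`, the helpers `jitter` and `sample_link`, and the example measurements are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class BoxLink:
    """One robot segment approximated as a solid box (an assumption)."""
    mass: float  # kg
    x: float     # m, extent along x
    y: float     # m, extent along y
    z: float     # m, extent along z

    def inertia_diagonal(self):
        """Principal moments (kg*m^2) of a uniform-density box about its center."""
        k = self.mass / 12.0
        return (
            k * (self.y ** 2 + self.z ** 2),  # Ixx
            k * (self.x ** 2 + self.z ** 2),  # Iyy
            k * (self.x ** 2 + self.y ** 2),  # Izz
        )


def jitter(mean, rel_std, rng):
    """Sample around a measured mean; the Gaussian spread is an assumption."""
    return max(1e-6, rng.gauss(mean, rel_std * mean))


def sample_link(nominal, rel_std, rng):
    """Return a randomized copy of a link, using the measurements as means."""
    return BoxLink(
        mass=jitter(nominal.mass, rel_std, rng),
        x=jitter(nominal.x, rel_std, rng),
        y=jitter(nominal.y, rel_std, rng),
        z=jitter(nominal.z, rel_std, rng),
    )


LATERAL_FRICTION_MEAN = 0.9  # value stated in the text

rng = random.Random(0)
body = BoxLink(mass=1.2, x=0.1, y=0.2, z=0.3)  # hypothetical measurements
randomized = sample_link(body, rel_std=0.05, rng=rng)
friction = jitter(LATERAL_FRICTION_MEAN, 0.05, rng)
print(body.inertia_diagonal(), randomized.inertia_diagonal(), friction)
```

The sampled mass and inertia values would then be written into the inertial elements of each generated URDF file, so that every training episode sees slightly different dynamics.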
The tumbling bot's servo controllers do not provide consistent
torque. At near-zero velocities, the torque is significantly
reduced. This behavior is approximated in simulation by setting
the motor's force to 0.1 N with Gaussian noise, where
the standard deviation is 0.1 N. At higher velocities, the servo
torque is set to 10 N. The servo velocity was also randomized,
with a standard deviation of 0.8 rad/s. This randomization is
significant relative to the commanded velocities, but it was
essential to having a robust policy.
"It is often the case that real-world environments include complex terrains composed of varying surfaces, rapid changes in elevation, and impassable obstacles."
RL
We used the Stable Baselines implementation of PPO with a
multilayer perceptron (MLP) LSTM policy. We set the
learning rate to 3e-4, the same value used in [7]. The nminibatches
parameter was set to one. For hyperparameter selection, we
used the values proposed in [12] because of their high performance
in that work's testing and evaluation. A selection of values is
available in Table 1.
TABLE 1. The PPO2 hyperparameters used in training.
PARAMETER            VALUE
Learning rate        3e-4
Episode length       20 steps
Batch size           128 steps
Discount (γ)         0.99
Clipping parameter   0.2
Optimizer            Adam
Table 2 shows initial results that support our choice of a
recurrent policy. Both policies performed similarly, although
the LSTM achieved slightly higher rewards at times. We
also tested different RL algorithms, including Advantage
Actor Critic (A2C) [19] and Trust Region Policy Optimization
(TRPO) [20]. These are the only other
algorithms within Stable Baselines that are
compatible with the MultiDiscrete action
space, so testing other algorithms would
require modifying the tumbling bot's action
space. In Table 2, we see that A2C achieved
a high asymptotic reward, similar to that of
PPO-LSTM. TRPO, on the other hand,
performed worse, and PPO-MLP achieved the
lowest reward. Each algorithm was able to
train a successful policy, which was expected
given the small action space, but
higher reward values indicate a policy that
can reach the target more quickly and
accurately. Since none of the other RL algorithms
we tested outperformed PPO-LSTM,
we chose to apply this method for
our task; in general, it is also the most stable
and likely to converge [12].
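
For reference, the selected PPO-LSTM setup can be written as a small configuration. This is a sketch under stated assumptions: the keyword names follow the Stable Baselines `PPO2` constructor, and mapping the "Batch size" row of Table 1 to `n_steps` is our reading of the table rather than something the text confirms. The names `PPO2_KWARGS`, `POLICY_NAME`, `EPISODE_LENGTH`, and `build_model` are hypothetical.

```python
# Hyperparameters taken from Table 1 and the text.
PPO2_KWARGS = {
    "learning_rate": 3e-4,  # Table 1
    "n_steps": 128,         # Table 1 "Batch size"; this mapping is an assumption
    "gamma": 0.99,          # Table 1 discount
    "cliprange": 0.2,       # Table 1 clipping parameter
    "nminibatches": 1,      # stated in the text
}

POLICY_NAME = "MlpLstmPolicy"  # Stable Baselines name for the MLP LSTM policy
EPISODE_LENGTH = 20            # steps; enforced by the environment, not by PPO2


def build_model(env):
    """Create the PPO-LSTM model; requires stable-baselines to be installed."""
    # Imported lazily so the configuration can be inspected without TensorFlow.
    from stable_baselines import PPO2
    return PPO2(POLICY_NAME, env, **PPO2_KWARGS)


print(POLICY_NAME, PPO2_KWARGS)
```

With a recurrent policy, Stable Baselines requires the number of parallel environments to be a multiple of `nminibatches`, which a value of one satisfies for any environment count.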
PPO supports both discrete and continuous action and observation
spaces. We used a discrete action space with three
possible outputs for each of the two legs:
velocities of -5.25 rad/s, 0 rad/s, and 5.25 rad/s. This small
action space allows for a simple approximation of the real-world
servos in simulation. Additionally, limiting the number
of actions greatly reduces the dimensionality of the policy
optimization. Adding the full range of servo velocities would
substantially increase the amount of exploration needed
during training and would not add significant physical
capability to the robot agent.
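
The discrete action decoding, together with the servo approximation described in the simulation discussion (0.1 N with Gaussian noise near zero velocity, 10 N otherwise, and velocity noise with a standard deviation of 0.8 rad/s), can be sketched as below. The numeric values come from the text. The threshold `NEAR_ZERO_THRESHOLD`, the use of the measured joint velocity to decide which regime applies, and the clamping of the sampled force at zero are assumptions made for illustration, and all function names are hypothetical.

```python
import random

LEG_VELOCITIES = (-5.25, 0.0, 5.25)  # rad/s, the three outputs per leg
NUM_LEGS = 2
VELOCITY_NOISE_STD = 0.8    # rad/s
LOW_SPEED_FORCE = 0.1       # N, mean force near zero velocity
LOW_SPEED_FORCE_STD = 0.1   # N
HIGH_SPEED_FORCE = 10.0     # N
NEAR_ZERO_THRESHOLD = 0.5   # rad/s; hypothetical, not given in the text


def decode_action(action):
    """Map a MultiDiscrete([3, 3]) action to per-leg target velocities."""
    return tuple(LEG_VELOCITIES[index] for index in action)


def servo_command(target_velocity, measured_velocity, rng):
    """Return (noisy target velocity, force limit) for one simulated servo."""
    noisy_velocity = rng.gauss(target_velocity, VELOCITY_NOISE_STD)
    if abs(measured_velocity) < NEAR_ZERO_THRESHOLD:
        # Reduced, noisy force near standstill; clamped at zero (assumption).
        force = max(0.0, rng.gauss(LOW_SPEED_FORCE, LOW_SPEED_FORCE_STD))
    else:
        force = HIGH_SPEED_FORCE
    return noisy_velocity, force


rng = random.Random(0)
targets = decode_action((0, 2))  # left leg backward, right leg forward
commands = [servo_command(v, measured_velocity=0.0, rng=rng) for v in targets]
print(len(LEG_VELOCITIES) ** NUM_LEGS, targets, commands)
```

With three outputs for each of two legs, the policy chooses among only nine joint actions per step. In a physics engine with velocity-controlled joints, each returned pair would typically be passed as the target velocity and maximum force of the corresponding joint motor.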
The observation space is continuous and includes the position
of the center of mass, the orientation of the robot torso, and
the intended output velocities. The center-of-mass velocity is
included in the reward calculation but not in the observation.
This way, the learned policy can still be loaded after transfer,
and the velocity does not need to be estimated online.
Decreasing the dimension of the observation space was found
to improve transfer success in [8].
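
The separation between what the policy observes and what the reward uses can be illustrated with a short sketch. It assumes the torso orientation is given as three Euler angles (the representation is not stated here) and leaves the reward as a caller-supplied function, since its form is not defined in this section. The names `build_observation`, `step_outputs`, and the `state` dictionary keys are hypothetical.

```python
def build_observation(com_position, torso_orientation, intended_velocities):
    """Concatenate the observed quantities; velocity is deliberately absent."""
    values = [*com_position, *torso_orientation, *intended_velocities]
    return [float(v) for v in values]


def step_outputs(state, reward_fn):
    """Return (observation, reward) for one environment step."""
    observation = build_observation(
        state["com_position"],
        state["torso_orientation"],
        state["intended_velocities"],
    )
    # The center-of-mass velocity is available in simulation only,
    # so it feeds the reward but never reaches the policy input.
    reward = reward_fn(state["com_position"], state["com_velocity"])
    return observation, reward


state = {
    "com_position": (0.0, 0.1, 0.05),
    "torso_orientation": (0.0, 0.2, 1.5),   # Euler angles, an assumption
    "intended_velocities": (5.25, -5.25),
    "com_velocity": (0.3, 0.0, 0.0),
}
obs, reward = step_outputs(state, reward_fn=lambda pos, vel: vel[0])
print(len(obs), reward)
```

On the physical robot, only `build_observation` would be needed, because the reward is not evaluated after training.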
Adding noise to the output velocities, as described in
the "Simulation Environment" section, prevents the policy
TABLE 2. The asymptotic reward of different RL methods.
ALGORITHM   REWARD
PPO-LSTM    9.342
PPO-MLP     7.758
A2C         9.333
TRPO        8.089
A2C: Advantage Actor Critic; TRPO: Trust Region Policy Optimization.