IEEE Robotics & Automation Magazine - June 2023 - 62

post-Transformer MLP, and output distribution
layer. The critic network contains
only fully connected layers. The actor
and critic networks share the same architecture
for their observation networks, but
their parameters are not shared. The output
layer uses the exponential linear unit
(ELU) activation function, and the other
layers use the rectified linear unit (ReLU)
activation function [15].
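As a rough illustration of the activation layout described above, the following minimal sketch passes a vector through ReLU hidden layers and an ELU output layer. The layer sizes, the toy weights, and the helper names (`relu`, `elu`, `dense`, `forward`) are assumptions made for this example, not the network used in the article; ELU is written with the common choice alpha = 1.

```python
import math

def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """Exponential linear unit: x if x > 0, else alpha * (exp(x) - 1)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), computed row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, hidden_layers, output_layer):
    """Apply ReLU on every hidden layer and ELU on the output layer."""
    for weights, biases in hidden_layers:
        x = dense(x, weights, biases, relu)
    weights, biases = output_layer
    return dense(x, weights, biases, elu)

# Hypothetical toy weights: 2 inputs -> 2 hidden units -> 1 output.
hidden = [([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0])]
output = ([[-1.0, 1.0]], [0.0])
y = forward([2.0, 1.0], hidden, output)
```

Unlike ReLU, ELU returns small negative values for negative inputs instead of clamping them to zero, so the output layer in this sketch can produce values slightly below zero.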
A Transformer that has achieved
breakthrough results in natural language
processing, image processing, and other fields can take full
advantage of deep learning, learning key features from the data
and performing inference in previously unseen environments [16]. In this
article, multihead self-attention modules are applied at every
time step. At this stage, we use the PPO algorithm, a state-of-the-art
policy gradient DRL algorithm, to train the agent in the
simulation environment. While interacting with the environment,
the agent continuously updates the network parameters
to improve performance via
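\[ \theta_{k+1} = \arg\max_{\theta} \; \mathbb{E}_{s,a \sim \pi_{\theta_k}} \left[ L(s, a, \theta_k, \theta) \right] \tag{1} \]
where $\theta_{k+1}$ and $\theta_k$ are the new and old network parameters,
respectively. The loss function $L^{\mathrm{CLIP}}(\theta)$ is shown in the following equation: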
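\[ L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t \left[ \min\left( r_t(\theta)\hat{A}_t,\ \operatorname{clip}\left(r_t(\theta), 1-\epsilon, 1+\epsilon\right)\hat{A}_t \right) \right] \tag{2} \]
where $\hat{\mathbb{E}}_t$ is the empirical expectation over the time steps,
$r_t(\theta)$ is the ratio of the action probabilities under the updated and
current policies, and $\hat{A}_t$ is the estimated advantage at time step $t$.
$\epsilon$ is a hyperparameter that limits the magnitude of each policy
update. To simplify computation and speed up convergence, in
practice the loss function is reduced to the simplified form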
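\[ L(s, a, \theta_k, \theta) = \min\left( \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_k}(a \mid s)} A^{\pi_{\theta_k}}(s, a),\ g\left(\epsilon, A^{\pi_{\theta_k}}(s, a)\right) \right) \tag{3} \]
where
\[ g(\epsilon, A) = \begin{cases} (1+\epsilon)A, & A \ge 0 \\ (1-\epsilon)A, & A < 0. \end{cases} \tag{4} \]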
Table 2 shows the RL settings, such as the PPO-clip hyperparameter
and task reward weights.
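The equivalence between the clipped objective in (2) and the simplified form in (3) and (4) can be checked numerically. The sketch below works on individual samples in plain Python; the function names, the sample ratios and advantages, and the value 0.2 for the clipping hyperparameter are assumptions chosen for illustration (the setting actually used is the one listed in Table 2).

```python
def clip(x, low, high):
    """Clamp x to the interval [low, high]."""
    return max(low, min(x, high))

def clipped_term(ratio, advantage, eps):
    """Per-sample term inside the expectation of (2)."""
    return min(ratio * advantage,
               clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)

def g(eps, advantage):
    """Piecewise bound from (4)."""
    if advantage >= 0:
        return (1.0 + eps) * advantage
    return (1.0 - eps) * advantage

def simplified_term(ratio, advantage, eps):
    """Per-sample loss from (3)."""
    return min(ratio * advantage, g(eps, advantage))

def empirical_objective(ratios, advantages, eps, term=clipped_term):
    """Average the per-sample terms over a batch of time steps."""
    terms = [term(r, a, eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)

# Hypothetical probability ratios and advantage estimates.
ratios = [0.5, 0.9, 1.0, 1.1, 1.5]
advantages = [2.0, -1.0, 0.5, -0.5, 2.0]
EPS = 0.2  # assumed clipping value, for illustration only

# Both forms agree sample by sample, so their batch averages agree too.
for r, a in zip(ratios, advantages):
    assert abs(clipped_term(r, a, EPS) - simplified_term(r, a, EPS)) < 1e-12
```

For a sample with a positive advantage and a ratio above 1 + EPS, both forms return (1 + EPS) times the advantage, so increasing the ratio further brings no additional gain. This is how the clipping limits the size of each policy update.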
SIM-TO-REAL ADAPTATION BY IL
Due to the discrepancy between simulated and real environments,
agents trained in simulated environments often perform
poorly in the real world even if the input data structure is the
same. Since the acquisition of experimental data in the real
world will cause wear and tear of the robotic arm and waste a
lot of time, sim-to-real transfer has become the mainstream
method. If the agent is trained directly in a real environment,
the neural network parameters must be randomly initialized.
However, the initial agent is so weak that it is difficult to obtain
an initial reward in the environment, especially in sparse-reward
and complex robot control tasks.
A supervised behavioral cloning method
that trains the agent on trajectories from the
simulated datasets was attempted, but it
suffers from covariate shift: the input
states are distributed differently
under the two datasets, which causes
failure. Therefore, a more efficient solution
is to use IL to obtain a pretrained neural network
that captures the regular properties
of the simulation and makes full use of
the exploration of the environment space.
In previous work, a policy is usually obtained from
expert-generated simulation data by IRL, in
which a cost function is first learned from the expert data,
and then, various DRL methods are used to learn the expert
strategy. However, IRL is usually inefficient because even if
the cost function is well learned, a DRL approach still needs
to be used to train the policy. In this article, a modified GAIL
algorithm is proposed to learn policies directly from expert
data [17]. The performances of these algorithms are compared
in the "Experiments" section.
In the original IRL process, the goal is to find a cost function
under which the expert outperforms all other policies, with the
cost function regularized by $\psi$:
\[ \mathrm{IRL}_{\psi}(\pi_E) = \arg\max_{c \in \mathbb{R}^{\mathcal{S} \times \mathcal{A}}} -\psi(c) + \left( \min_{\pi \in \Pi} -H(\pi) + \mathbb{E}_{\pi}[c(s,a)] \right) - \mathbb{E}_{\pi_E}[c(s,a)] \tag{5} \]
where $\pi \in \Pi$ is a policy whose occupancy measure, that is, the
probability of occurrence of each state-action pair, is measured as
$\rho_{\pi}(s,a) = \pi(a \mid s) \sum_{t=0}^{\infty} \gamma^{t} P(s_t = s \mid \pi)$,
and the discounted expectation can be expressed as
$\mathbb{E}_{\pi}[c(s,a)] = \sum_{s,a} \rho_{\pi}(s,a)\, c(s,a)$. Different
$\psi(c)$ choices represent different IL algorithms. In this article,
$\psi(c)$ is defined as
\[ \psi_{GA}(c) = \begin{cases} \mathbb{E}_{\pi_E}\left[ g(c(s,a)) \right], & \text{if } c < 0 \\ +\infty, & \text{otherwise.} \end{cases} \tag{6} \]
Subsequently, a generative model G and a discriminative
classifier D are trained simultaneously. The goal of the discriminative
model is to determine as accurately as possible
whether a sample comes from real data or was generated by the generative
network. The goal of the generative network is to make the source
of its samples as indistinguishable as possible to the discriminative
network. The two networks, which have opposing goals,
are trained alternately. At convergence,
if the discriminative network can no longer determine the source
of a sample, the generative network is producing
samples that conform to the real data distribution. In
this article, the simulation occupancy measure $\rho_{\pi_E}$
corresponds to the real data distribution, and the real-world
occupancy measure $\rho_{\pi}$ corresponds to the data distribution
generated by G. A min-max objective function like that of the GA network
is selected in this work:
\[ \min_{\theta} \max_{\psi} \; \mathbb{E}_{\pi_E}\left[ \log D_{\psi}(s,a) \right] + \mathbb{E}_{\pi_{\theta}}\left[ \log\left( 1 - D_{\psi}(s,a) \right) \right]. \tag{7} \]
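To make the roles of the two players in the min-max objective (7) concrete, the following minimal sketch evaluates the objective on small batches of discriminator outputs. It assumes that D(s, a) is the discriminator's estimated probability that a state-action pair comes from the expert (simulation) data; the function name `gan_objective` and the sample values are hypothetical, and no network or training loop is included.

```python
import math

def gan_objective(d_expert, d_policy):
    """Monte Carlo estimate of E_expert[log D] + E_policy[log(1 - D)].

    d_expert: discriminator outputs D(s, a) in (0, 1) on expert samples.
    d_policy: discriminator outputs D(s, a) in (0, 1) on policy samples.
    The discriminator tries to maximize this value; the policy
    (the generator) tries to minimize it.
    """
    expert_term = sum(math.log(d) for d in d_expert) / len(d_expert)
    policy_term = sum(math.log(1.0 - d) for d in d_policy) / len(d_policy)
    return expert_term + policy_term

# Hypothetical discriminator outputs, for illustration only.
# "sharp": the discriminator separates the two sources well.
sharp = gan_objective(d_expert=[0.9, 0.8], d_policy=[0.1, 0.2])
# "fooled": the discriminator outputs 0.5 for every sample.
fooled = gan_objective(d_expert=[0.5, 0.5], d_policy=[0.5, 0.5])
```

When the discriminator outputs 0.5 for every sample, it can no longer tell the two sources apart, and the objective equals -2 log 2. A discriminator that still separates the sources yields a larger value, which is what the generator is trained to drive down.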
