IEEE Computational Intelligence Magazine - May 2021
a_0 = 45) to the target than the other (optimized search distribution for g = 3.7, a_0 = 45). By default, the mixture model for transfer neuroevolution is initialized with equal mixing coefficients for the target and all source distribution components. As shown in the left panel of Figure 9a, the mixing coefficient of the beneficial source quickly ascends to 1 at the early evolutionary stage, while suppressing the mixing coefficients from the less relevant source. This shows that the mixture model-based transfer is capable of jointly processing multiple sources and selecting the one that is most relevant to the target. After a certain point however, the source no longer remains competitive with the evolved target search distribution, which causes its mixing coefficient to gradually drop. Eventually, the target search distribution takes over the mixture model and all the source components are deactivated. Due to the directional bias induced by the source priors, the search progresses at a faster rate towards a better solution.

Next, the sensitivity of initial mixing coefficient a_s is investigated under the single source setup. We consider the scenario where the single source is from a related prior (optimized search distribution for g = 3.7, a_0 = 45), and is initialized at different levels of a_s = 0.99, 0.1, 0. As shown in the right panel of Figure 9a, a_s quickly ascends to dominate the mixture model at the early evolutionary stage, even when it has a small initial value. That is, the proposed mixture model-based adaptive transfer method is not very sensitive to the initial a_s setting. The only exception is when the initial a_s is set to 0, since in this case pseudo-offspring will never be drawn from the source distribution. Thus, transfer never occurs. On the other hand, if the single source forms an unrelated prior (far from the optimum target distribution), its mixing coefficient quickly descends to 0 so that negative transfer is avoided. As a result, the search has a similar convergence trend as the no-transfer scenario (right panel of Figure 9b).

VI. Conclusions

In conclusion, this paper demonstrated neuroevolution as a notable approach for solving differential equations, where the problem has been transformed to one of global optimization of physics-informed neural networks. In such problems, we merit the accuracy of the solution, which shall be produced by a better optimized neural network. Gradient descent methods such as SGD may not always be the best approach, because they are

Transfer neuroevolution is applied to solve the projectiles on another planet such as Mars or on the Moon, since the same physics laws apply elsewhere in the universe.

adaptive transfer mechanism to systematically transfer useful knowledge from relevant source problems, while protecting the search against negative transfer. In practice, it is very common for many similar differential equation problems to occur under different environments and boundary conditions; as such, we expect transfer neuroevolution to shine in this domain. We empirically demonstrated that transfer neuroevolution may not only improve the convergence speed but also improve the solution accuracy in comparison to the baseline neuroevolution algorithm.

In our experimental study, neuroevolution nevertheless starts losing its competitive advantage to SGD (ADAM) for more complex differential equations. On this account, it is important for future research to further improve the effectiveness of neuroevolution in terms of: (1) handling much larger neural networks, such that the solution of complex differential equations can be accurately emulated; and (2) simultaneously searching for the best network architecture in addition to the neural network weights [58]-[61]. Another interesting idea is the development of multi-objective, multi-task neuroevolution [62], [63] in the context of physics-informed neural networks. We expect these approaches to have a good synergy with the recently proposed domain decomposition strategy [64], [65], where separate neural networks are used to approximate different sub-domains of a differential equation.

Jian Cheng Wong is supported by the Institute of High Performance Computing (IHPC), A*STAR. Abhishek Gupta is supported in part by the A*STAR Cyber-Physical Production System (CPPS) - Towards Contextual and Intelligent Response Research Program, under the RIE2020 IAF-PP Grant A19C1a0018. Yew-Soon Ong is supported by the Data Science and Artificial Intelligence Research Center (DSAIR) and the School of Computer Science and Engineering at Nanyang Technological University.
