similar set point errors in the physical trials, which suggests that physical limitations of the tumble bot (such as its step-climbing ability) may play a greater role in maneuvering through terrain than the robustness of the policy. Some additional insights can be derived from the plots in Figures 8 and 9. The early portions of the trajectories appear noticeably more consistent for the uneven network than for the flat network. Despite this difference, the flat network performed much better on average. One interpretation is that the uneven network learned a policy that is more robust to noisy data, even though it converged to a suboptimal set point error. For evidence of this, consider the trials