…We now describe our own work in this area, conducted in 1999 and early 2000. Our project was among the first of a recent spate of studies to use MathEngine’s commercially available physics engine [apparently now named Vortex], a version of which (SDK 1.1) is available free for academic use.20 The system was essentially a re-implementation of the one written by Karl Sims in 1994.
…We used a number of different fitness functions for scoring the success of each creature in its environment, all of which rewarded creatures for movement. The definition of the fitness function turned out to be surprisingly difficult to get right, even when we just wanted to reward creatures for moving forward. A straightforward function that simply measured the distance moved by the creature’s center of mass over the period of evaluation had a tendency to select for creatures that (in the fluid environment) produced an initial thrust to move away from their starting position but showed no further movement and soon slowed to a halt. Such creatures would have high fitness relative to most of the randomly generated creatures in the early generations and would therefore be selected. Clearly, their fitness could be improved if they repeated the thrust movement to swim further and faster. Unfortunately, it appeared that in many cases where these “one push” creatures were selected in the initial generations, the population reached an evolutionary impasse (a local optimum in the fitness landscape) and had no easy mutational routes to higher fitness…Additionally, if the distance moved by the creature was being measured at various time slices throughout the evaluation period (so that these distances could be weighted and summed to give a final fitness score), we needed to decide whether to score distance moved in any direction at any one time slice equally (in which case there was no pressure to evolve creatures that swam in a straight line over the whole evaluation period), or whether to reward only distance moved in one particular direction (and if so, in which direction). It was not difficult to make pragmatic decisions about such choices, but the point is that the choice of fitness function, even for seemingly straightforward behaviors, is not trivial and usually requires considerable experimentation to get right. The function that successfully produces the desired behaviors can often be somewhat more complicated than might initially have been thought.
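The contrast between the simple distance measure and the time-sliced scoring described above can be illustrated with a minimal Python sketch. This is not the code used in our system: the function names, the sample trajectories, and in particular the choice of weights that increase over time are all hypothetical, chosen only to show how the design decisions interact.

```python
import math


def net_displacement_fitness(positions):
    """Straight-line distance between the first and last sampled position."""
    return math.dist(positions[0], positions[-1])


def weighted_slice_fitness(positions, weights, direction=None):
    """Weighted sum of the movement made in each time slice.

    positions: center-of-mass samples at the slice boundaries (n + 1 points)
    weights:   one weight per slice (n values); the weighting scheme is an
               assumption of this sketch, not taken from the original system
    direction: None scores movement in any direction equally; a unit vector
               scores only the signed progress along that vector
    """
    if len(weights) != len(positions) - 1:
        raise ValueError("need exactly one weight per time slice")
    total = 0.0
    for w, start, end in zip(weights, positions, positions[1:]):
        step = [e - s for s, e in zip(start, end)]
        if direction is None:
            moved = math.hypot(*step)  # length of the step, any direction
        else:
            moved = sum(c * d for c, d in zip(step, direction))  # projection
        total += w * moved
    return total


# Hypothetical trajectories sampled at five instants.
ONE_PUSH = [(0, 0, 0), (2, 0, 0), (2.5, 0, 0), (2.5, 0, 0), (2.5, 0, 0)]
STEADY = [(0, 0, 0), (0.5, 0, 0), (1, 0, 0), (1.5, 0, 0), (2, 0, 0)]
CIRCLING = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0)]
LATE_WEIGHTS = [1, 2, 3, 4]  # assumed: later slices count for more
```

With these sample values, net displacement scores the one-push trajectory above the steady swimmer (2.5 against 2.0), whereas the late-weighted sum reverses the ranking (3.0 against 5.0). The circling trajectory scores 4.0 when movement in any direction counts equally with unit weights, but 0.0 when only progress along the x axis is rewarded.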
Even the method used to measure the position of a creature at a given instant was not straightforward. In most runs we used the center of mass. However, in some runs creatures evolved that would initially adopt a compact, folded configuration and then, as the evaluation period proceeded, “unfold” in a particular direction. This unfolding had the effect of shifting the creature’s center of mass, thereby increasing its fitness. Again, if this trick was selected in the early generations of a run, it was sometimes difficult for the population to jump out of this local fitness optimum and find continuous movements that would generate higher fitness scores. We experimented with various other ways of measuring distance moved, such as using the distance moved by the body part that had moved least over the duration of the evaluation. The general problem is that, no matter what fitness function is used, there often seems to be a way for creatures to score highly on it while not performing the sort of behavior that we, as designers of the function, had hoped for. This problem is not insurmountable; with a more careful specification of the function, all “undesired” behaviors could presumably be detected and given low fitness scores. However, this need for careful design of very specific, detailed fitness functions runs counter to one of our goals in implementing the system, namely, to use it as a method of automatically generating creatures given only a high-level specification of the required behavior. Nevertheless, while the use of very specific fitness functions can certainly increase the chances of evolving the desired behaviors in any given run, even straightforward fitness functions will sometimes produce the desired results (as will be demonstrated in the rest of this section), so our goal was at least partially fulfilled.
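The difference between the two position measures can be made concrete with a short sketch. The representation of a creature as a list of (mass, position) pairs, the function names, and the sample configurations are assumptions made for illustration; they do not reflect the data structures of our system.

```python
import math


def centre_of_mass(parts):
    """Mass-weighted mean position; parts is a list of (mass, (x, y, z))."""
    total_mass = sum(mass for mass, _ in parts)
    return tuple(
        sum(mass * pos[axis] for mass, pos in parts) / total_mass
        for axis in range(3)
    )


def com_distance(start_parts, end_parts):
    """Distance moved by the center of mass between two configurations."""
    return math.dist(centre_of_mass(start_parts), centre_of_mass(end_parts))


def least_moved_part_distance(start_parts, end_parts):
    """Distance moved by whichever body part moved least.

    Assumes both lists describe the same parts in the same order.
    """
    return min(
        math.dist(start_pos, end_pos)
        for (_, start_pos), (_, end_pos) in zip(start_parts, end_parts)
    )


# Hypothetical three-part creature that starts folded and then unfolds
# along the x axis while its first part stays where it was.
FOLDED = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.0, 0.1, 0.0)), (1.0, (0.0, 0.2, 0.0))]
UNFOLDED = [(1.0, (0.0, 0.0, 0.0)), (1.0, (1.0, 0.0, 0.0)), (1.0, (2.0, 0.0, 0.0))]

# The same creature with every part translated 1.5 units along x.
TRANSLATED = [(m, (x + 1.5, y, z)) for m, (x, y, z) in FOLDED]
```

For the unfolding case the center of mass moves by roughly one unit even though the creature has not travelled anywhere, while the least-moved-part measure reports zero. For the translated case both measures report 1.5. The second measure therefore closes this particular loophole, although, as noted above, other loopholes may remain.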
…A number of checks were also made to overcome limitations in the simulation software. Despite various attempts to limit the magnitude of the forces applied to joints, creatures would still sometimes evolve whose movements entailed forces and velocities that were too great for the physics engine to resolve at the given size of the integration step. In these cases, the physics engine tended to accumulate numerical errors to a point where the creature irrecoverably exploded (i.e., the constraint solver failed to converge on a solution, and the integrator then generated incorrect velocities, giving the impression that the body parts had blown apart in random directions)…The MathEngine SDK does generate some runtime warnings that indicate that this kind of situation is imminent. We kept a tally of the number of such warnings that each creature generated and aborted the simulation of any creature that had generated more than a certain threshold number of them. We also checked whether a creature had actually exploded at any point during its evaluation (by checking for high velocities, etc.) and immediately aborted any that had.
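The two checks just described (a running tally of engine warnings and a test for implausibly high velocities) can be sketched independently of any particular physics engine. In the sketch below, the `step` callback, the threshold values, and the `EvalResult` type are hypothetical; in particular, `step` stands in for whatever engine-specific code advances the simulation and is not a MathEngine API.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class EvalResult:
    steps_run: int
    aborted: bool
    reason: Optional[str]


def evaluate_with_checks(
    step: Callable[[int], Tuple[int, float]],
    n_steps: int,
    max_warnings: int = 10,    # assumed threshold, not from the original system
    max_speed: float = 100.0,  # assumed threshold, not from the original system
) -> EvalResult:
    """Run a creature's evaluation, aborting early if it looks unstable.

    step(i) is a hypothetical callback that advances the simulation by one
    integration step and returns (warnings raised during that step, highest
    body-part speed after that step).
    """
    warning_tally = 0
    for i in range(n_steps):
        new_warnings, top_speed = step(i)
        warning_tally += new_warnings
        # A non-finite or very large speed is treated as an explosion.
        if not math.isfinite(top_speed) or top_speed > max_speed:
            return EvalResult(i + 1, True, "exploded")
        if warning_tally > max_warnings:
            return EvalResult(i + 1, True, "too many warnings")
    return EvalResult(n_steps, False, None)
```

A caller would wrap its engine-specific stepping code in `step` and decide for itself how an aborted creature should be scored; that policy is deliberately left outside the sketch.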
Note that we were using MathEngine’s SDK 1.1 for this work; subsequent experience with their latest offering (the Dynamics Toolkit 2.0 alpha release) suggests that the software is now much more stable. However, our more recent experiences with both MathEngine and other physics engines (e.g., Havok)12 for this sort of work suggest that they all have some weaknesses in stability of simulation in certain situations. Unfortunately, it is in the nature of evolutionary algorithms that such weaknesses will almost inevitably be encountered. A recent review article has tested the stability of the MathEngine, Havok, and Ipion [since acquired by Havok] engines in a variety of situations.16, 17 Although these products are improving, the current situation is that, no matter which physics engine is used, it is likely that a certain number of stability checks of the type just described will be required in any evolutionary system of this kind.