Evolution strategies
From Scholarpedia
| Hans-Georg Beyer (2007), Scholarpedia, 2(8):1965. | revision #40199 [link to/cite this article] | |||||||||||||||||||
Curator: Dr. Hans-Georg Beyer, Vorarlberg University of Applied Sciences
Evolution Strategies (ESs) are a sub-class of nature-inspired direct search (and optimization) methods belonging to the class of Evolutionary Algorithms (EAs) which use mutation, recombination, and selection applied to a population of individuals containing candidate solutions in order to evolve iteratively better and better solutions. Its roots date back to the mid 1960s when P. Bienert, I. Rechenberg, and H.-P. Schwefel at the Technical University of Berlin, Germany, developed the first bionics-inspired schemes for evolving optimal shapes of minimal drag bodies in a wind tunnel using Darwin's evolution principle.
Evolution Strategies can be applied in all fields of optimization
including continuous, discrete, combinatorial search spaces
without and with constraints as well as mixed search spaces.
Given the optimization problem
the function
to be optimized, also referred
to as objective (or goal) function, can be presented in mathematical
form, via simulations, or even in terms of measurements obtained from
real objects. The ES can also be applied to a set of objective functions
in context of multi-objective optimization
(see also Multiobjective Evolutionary Algorithms and Multiobjective Search).
Contents |
The Canonical ES Versions
The canonical versions of the ES are denoted by
respectively. Here
denotes the number of parents,
the mixing number (i.e., the number of parents
involved in the procreation of an offspring), and
the number of offspring. The parents are deterministically selected
(i.e., deterministic survivor selection) from
the (multi-)set of either the offspring, referred to as comma-selection
(
must hold), or both the parents and offspring,
referred to as plus-selection.
Selection is based on the ranking of the individuals' fitness
taking the
best individuals
(also referred to as truncation selection). In general, an
comprises the object parameter vector
to be optimized,
a set of strategy parameters
, needed especially in
self-adaptive ESs, and the individual's observed fitness
being equivalent to the objective function
, i.e.,
in the simplest case.
The distinction between
and
is necessary, since
can be the result of a local
search operator which is applied to the
-function
to be optimized, or it even
can be the outcome of another ES (see Meta-ES, below).
Furthermore, the observed
can be the result of a
noisy
-evaluation process.
The conceptual algorithm of the
-ES
is given below:
-Self-Adaptation-Evolution-Strategy
- Initialize parent population
.
- Generate
offspring
forming the offspring population
where each offspring
is generated by:
- Select (randomly)
parents from
(if
take all parental individuals instead).
- Recombine the
selected parents
to form a recombinant individual
.
- Mutate the strategy parameter set
of the recombinant
.
- Mutate the objective parameter set
of the recombinant
using the mutated strategy parameter set to control the statistical properties of the object parameter mutation.
- Select (randomly)
- Select new parent population (using deterministic truncation selection) from either
- the offspring population
(this is referred to as comma-selection, usually denoted as "
-selection"), or
- the offspring
and parent
population (this is referred to as plus-selection, usually denoted as "
-selection")
- the offspring population
- Goto 2. until termination criterion fulfilled.
Depending on the search space and objective function
, the recombination and/or the mutation of the
strategy parameters may or may not occur in specific instantiations of
the algorithm. For example, a
-ES,
or equivalently
-ES, does not use recombination.
It draws its new
parents for the next generation
from both the old
parents and the
offspring (generated from these parents) by taking the best
individuals
(with respect to the observed
).
Evolution Strategies of type
are also referred
to as steady state ESs, i.e., strategies without a generation gap:
They produce only one offspring per generation. After evaluating its
fitness
, the worst individual is removed from the
population. Strategies of this type are especially useful on parallel
computers when the times for calculating the individuals' fitnesses are
non-constant, thus allowing for asynchronous parallel processing.
As to the termination condition, distance measures in the fitness and or search space as well as the maximum number of generations can be used.
Example:
-
-Self-Adaptation-ES
Mutation and recombination operators in ESs are problem-specifically
designed. As an example, consider the
-Self-Adaptation-ES
for unconstrained real-valued search spaces
.
An individual is defined as
.
The ES generates
offspring according to
representing steps 2 and 3 of the conceptual
-Self-Adaptation-Evolution-Strategy
algorithm (initialization, evolution loop, and termination condition
are not shown).
The
and
are (0, 1) normally
distributed random scalars and vectors, respectively, implementing
the mutation operation for the strategy parameter
and the
-dimensional object parameter vector
Both mutation operations are applied to the
respective recombinants
and
The mutated strategy parameter
controls the strength of the object parameter
mutation (in this example,
is simply the
standard deviation of the normally distributed random components).
This mutation is additively applied to the recombinant
Changing the mutation strength
according to (I),
allows for a self-tuning of the mutation strength:
Since the individual's
controls the generation
of the individual's
selecting a specific
individual
according to its fitness
results in an inheritance of the
corresponding
value. This process is also
referred to as
-self-adaptation. That is,
self-adaptive ESs can work without any model-based external control rule
for the endogenous strategy parameters. The rate of self-adaptation
depends on the choice of the learning parameter
Empirical as well as theoretical investigations suggest to choose
it proportional to
, e.g.,
.
Recombination in (I) is done by arithmetical averaging
over the
best offspring individuals:
Let
be a component of the individual
(either being
or a component of
)
the corresponding component of the recombinant
is calculated according to
where "
" denotes the
-th best
offspring individual (of the last generation). This type of
recombination is referred to as global intermediate
recombination and denoted by a subscript
attached to the
mixing number
. Apart from the intermediate
recombination there are other types, e.g. discrete recombination
where the parental components are coordinate-wise randomly transfered
to the recombinant.
Simple implementations of the
-
-Self-Adaptation-ES
for Mathematica and Matlab/Octave can be found
here.
Variations on ESs and Operator Design Principles
While the
-ES in Eq. (I) uses
isotropically distributed mutations for the object parameter vector
, more advanced ESs use
covariance matrix adaptation (CMA) techniques (CMA-ESs) to allow
for correlated mutations in real-valued search spaces.
Furthermore, hierarchically organized ESs (also referred to as
Meta-ESs or Nested-ESs) as well as predator-prey ESs are in use.
For example, simple Meta-ESs can be defined by
with
subpopulations of
-ESs running
independently over a number of generations
(isolation time).
Such strategies are used in mixed structure and parameter
optimization tasks and for evolutionary learning of strategy
parameters (e.g., population sizes, mutation parameters) of the
inner evolution loop.
The performance of an ES on a specific problem class depends
crucially on the design of the ES-operators (mutation, recombination,
selection) used and on the manner in which the ES-operators are
adapted during the evolution process (adaptation schemes, e.g.,
-self-adaptation, covariance matrix adaptation,
etc.). Ideally they should be designed in such a manner
that they guarantee the evolvability of the system throughout
the whole evolution process. Here are some principles and general
guide lines:
- Selection is done by population truncation similar to that what breeders are doing when breeding animals or plants.
- Typical truncation ratios
in strategies with comma-selection in continuous search spaces are in the range from 1/7 to 1/2.
- Using plus-selection (some kind of elitist selection) in conjunction with variation operators that allow for reaching any point in finite discrete search spaces in finite time guarantees stochastic convergence to the global optimizer. However, since this is a result that holds for infinite running time only, one cannot draw general conclusions regarding the finite time behavior of the ESs.
- Using plus-selection is recommended for combinatorial optimization problems.
- Evolution in ESs is usually modeled on the phenotypic level. The problem to be optimized is usually presented in its natural problem representation trying to respect the principle of strong causality. This means that the variation operators (mutation and recombination) should perform search steps in such a manner that small search steps result in small fitness changes and vice versa.
- Variation operators (mutation and recombination) should be designed on the basis of the maximum entropy principle.
- Mutation is the main source of variation, i.e., an ES always contains a mutation operator. Mutation should respect the reachability condition: Each (finite distant) point of the search space should be reachable in a finite number of steps. Mutations should be scalable (if possible) in order to allow for self-adaptation. The maximum entropy principle suggests the use of normally distributed mutations in
search spaces and geometrically distributed mutations in
(integer) search spaces.
- Recombination is applied whenever possible and useful. It uses
or more parental individuals to generate a single recombinant (the case
is referred to as multi-recombination). The main goal of recombination is the conservation of common components of the parents, i.e., the transfer of (beneficial) similarities to the next generation and to dampen the effect of malicious components of the parents' genes (genetic repair effect).
Note, it is not always possible to respect all design principles in specific applications. Violating some of these principles does not necessarily lead to poorly performing strategies.
Example: A Simple
-ES for Combinatorial Ordering Problems
Consider an optimization problem
where
is a vector describing the permutation of
components.
For example,
describes the ordering
of components, e.g., the ordering of city numbers a salesperson visits
consecutively (Traveling Salesperson Problem), or the ordering of jobs
in a job shop scheduling problem such that the overall cost
are minimal. In ESs, this optimization problem is usually represented in its
natural problem representation, i.e., the variation operators act
directly on the ordering
. An individual is defined as
.
The ES generates
offspring according to
representing steps 2 and 3 of the conceptual ES algorithm. The ES selects
randomly a parent from the set of the
best individuals from both the
parents and offspring of the last generation (indicated by the
notation). This parent is then
mutated by a random permutation. Simple permutation operators are
displayed in Figure 1.
They represent elementary move steps defining certain search neighborhoods (number of states that can be reached by one step). Unlike mutations in continuous search spaces, there exists always a minmal search step (representing the smallest mutation possible). The performance of the different permutation operators depends on the optimization problem to be solved. Trying to ensure the strong causality principle (i.e., small steps should result in small fitness changes) is one of the main design and selection principle for such operators.
Covariance Matrix Adaptation Evolution Strategies CMA-ESs
CMA-ESs represent the state-of-the-art in evolutionary optimization
in real-valued
search spaces. The CMA-ES has
been proposed by A. Gawelczyk, N. Hansen, and A. Ostermeier in the
mid 1990s. Its basic difference to the
-
-Self-Adaptation-ES
example is the shape of mutation distribution which is generated
according to a covariance matrix
which is
adapted during evolution. Thus, the mutations can adapt to the local
shape of the fitness landscape and convergence to the optimum can be
increased considerably. It uses special statistics cumulated over the
generations to control endogenous strategy-specific parameters
(the covariance matrix
and the global step
size
). This is in contrast to the (mutative)
-self-adaptation approach considered before.
A simplified (but well working) instantiation of the offspring update
formulas of the
-CMA strategy for
small population sizes
(small, compared to the
search space dimensionality
) reads
-CMA-ES
In (L1),
offspring
are generated by transforming
standard normally distributed random vectors using a transformation
matrix
to be obtained, e.g., by Cholesky
decomposition of the covariance matrix
, i.e.,
, and the
global step size factor
. In (L2) the best
mutations are recombined according to Equation (II),
thus, forming the recombinant
(center of mass
individual) for the next generation. Algorithmically, there is only
one recombinant (to be used as the final solution when the algorithm
terminates). Vector
connects
the recombinants of two consecutive generations, thus,
represents the tendency
of evolution in the search space. This information is cumulated in the
vector (
, initially
chosen) exponentially decaying with time constant
. The direction vector
is used to update (rank one update) the covariance matrix
(
, initially chosen)
with time constant
.
The remaining (L5) and (L6) are used to control the global step size
using the cumulated step size adaptation (CSA)
technique with time constant
(
, initially chosen).
The recombinant
is calculated using Equation (II).
Simple implementations of the CMA-ES for Mathematica and Matlab/Octave can be found here.
CMA-ES for large population sizes differ basically in the way how the
covariance matrix
is updated in (L4). To this
end, (L4) is extended by a rank
update proportional
to
In order to take full advantage of this update, the time constants
,
, and
must be chosen accordingly
(see Hansen et al. 2003).
References
- Beyer, H.-G. and Schwefel, H.-P. (2002). Evolution Strategies: A Comprehensive Introduction. In Natural Computing, 1(1):3-52.
- Beyer, H.-G. (2001). The Theory of Evolution Strategies. Natural Computing Series. Springer, Berlin 2001.
- Hansen, N. and Ostermeier, A. (2001). Completely Derandomized Self-Adaptation in Evolution Strategies. In Evolutionary Computation, 9(1):159-195.
- Hansen, N. and Müller, S.D. and Koumoutsakos, P. (2003). Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES). In Evolutionary Computation, 11(1):1-18.
- Rechenberg, I. (1994). Evolutionsstrategie '94. Frommann-Holzboog Verlag, Stuttgart, (in German).
- Schwefel, H.-P. (1995). Evolution and Optimum Seeking. Wiley, New York, NY.
Internal references
- Jan A. Sanders (2006) Averaging. Scholarpedia, 1(11):1760.
- Tomasz Downarowicz (2007) Entropy. Scholarpedia, 2(11):3901.
- Rob Schreiber (2007) MATLAB. Scholarpedia, 2(7):2929.
- Frank Hoppensteadt (2006) Predator-prey model. Scholarpedia, 1(10):1563.
External Links
- Hans-Georg Beyer's Website
- Evolutionary_Algorithms - Terms and Definitions
- Evolution Strategy Related Website of Bionics Chair at the Technical University Berlin
- Nikolaus Hansen's Website with CMA-ES Related Stuff
- H.-P. Schwefel's Two-Phase Nozzle Evolved by an (1+1)-ES
- Following a Moving Optimum by a (15,100)-ES
- A Collection of Evolution Strategy Demonstrations (to be found below)
- Evolution Strategies in Action: Animations Using MS Windows
See Also
Evolutionary Algorithms, Evolutionary Computation, Evolutionary Programming, Evolutionary Robotics, Evolvable Hardware, Evolving Intelligent Systems, Genetic Algorithms, Genetic Programming
| Hans-Georg Beyer (2007) Evolution strategies. Scholarpedia, 2(8):1965, (go to the first approved version) Created: 1 September 2006, reviewed: 23 August 2007, accepted: 23 August 2007 |
| Action editor: | Dr. Eugene M. Izhikevich, Editor-in-Chief of Scholarpedia, the peer-reviewed open-access encyclopedia |




