The following is a summary of Warren Ewens' arguments regarding the cost of natural selection from his book "Mathematical Population Genetics" (Ewens 1979). I have made a strong effort to summarize Ewens' work here, and while I hope to improve this page in the future, you can currently get a far clearer explanation of his work in his original papers. These are fully referenced in the bibliography.

Ewens summarized his arguments regarding the cost of natural selection in his book "Mathematical Population Genetics." He first addresses the substitution load (which he, along with Kimura, identifies as being *sometimes called the "evolutionary load" or the "cost of natural selection"* (Ewens, 1979, pg. 68)) in section 2.10, *Genetic Loads*. He shows that when $h = 0.5$ (the coefficient of dominance), if the frequency of an allele $A_1$ having selection coefficient $s$ is $x$ in a given generation, the mean fitness of the population will be $1 + sx$ and $l$ (the load or cost contribution of selection for a single generation) will be approximately $s(1 - x)$. Hence, the load for the entire substitution process (for a single substitution) will be $L = \sum s(1 - x)$, summed over each generation in the course of the substitution. Note that this is identical to Haldane's formula for the cost of natural selection. He notes that the summation is approximated by $\int_{t_1}^{t_2} s(1 - x)\,dt$, which, after changing the variable of integration from $t$ to $x$ (using $dx/dt \approx \tfrac{1}{2}sx(1 - x)$ when $h = 0.5$), becomes $2\int_{x_1}^{x_2} x^{-1}\,dx = 2\log(x_2/x_1)$. Ewens notes that this differs only trivially from $-2\log(x_1)$ (at least for situations where $x_1$ is near 0 and $x_2$ is near 1). He follows Haldane and Kimura in using starting and ending frequencies of 0.0001 and 0.9999 for the substituting allele to come up with a cost of 18.4 when $h$ is 0.5, and notes that the load is generally higher when the coefficient of dominance is not equal to 0.5. Following Haldane, Ewens uses a "typical" value of 30 for the substitution load / cost.
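As a rough check on the figure of 18.4, the sketch below evaluates the closed form $2\log(x_2/x_1)$ and also sums $s(1 - x)$ generation by generation. The function names are hypothetical, and the simulation assumes genotype fitnesses of $1 + s$, $1 + s/2$ and $1$ with deterministic selection in a very large population; it is an illustration, not Ewens' own calculation.

```python
import math


def substitution_load(x1: float, x2: float) -> float:
    """Closed-form load for h = 1/2: L = 2 * ln(x2 / x1)."""
    return 2.0 * math.log(x2 / x1)


def simulated_load(s: float, x1: float, x2: float) -> float:
    """Sum s * (1 - x) over generations until the allele reaches x2.

    Assumption: fitnesses 1 + s, 1 + s/2, 1 (h = 1/2), no drift.
    """
    x, total = x1, 0.0
    while x < x2:
        total += s * (1.0 - x)
        # Deterministic change in allele frequency over one generation.
        x += 0.5 * s * x * (1.0 - x) / (1.0 + s * x)
    return total


if __name__ == "__main__":
    print(round(substitution_load(0.0001, 0.9999), 2))
    print(round(simulated_load(0.01, 0.0001, 0.9999), 2))
```

Both values should land close to 18.4; the small gap between them comes from replacing the generation-by-generation sum with an integral.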

Ewens then describes the meaning of the load as follows:

What does this calculation really mean? Suppose all selection is through viability differences and the number of reproducing adults in each generation remains constant at $N$. A considerable portion of the depletion in population numbers between birth and the age of reproduction is non-genetic. Taking only the genetic component and supposing there is no depletion through genic deaths of the optimal genotype $A_1A_1$, a straightforward calculation shows that when the frequency of $A_1$ is $x$ there must be $N(1 + s)/(1 + sx)$ individuals at birth so that after differential viabilities operate, there are $N$ individuals at maturity. Thus the average individual is required to leave approximately $1 + s(1 - x)$ offspring after non-genetic deaths are taken account of, so that there will be $Ns(1 - x)$ "genetic deaths" in each generation associated with the evolutionary process. Summed over the entire process this gives $NL$ individuals in all or an average of $NL/T$ each generation if the substitutional process takes $T$ generations.

Ewens then considers a series of such substitutions at different loci but with the same fitness parameters, each substitution starting regularly.
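The birth requirement in the quoted passage can be illustrated with a few lines of arithmetic. This is a minimal sketch with hypothetical function names; the values N = 100,000, s = 0.01 and x = 0.5 are chosen only for illustration.

```python
def births_required(N: float, s: float, x: float) -> float:
    """Births needed so that N adults remain after viability selection."""
    return N * (1.0 + s) / (1.0 + s * x)


def genetic_deaths(N: float, s: float, x: float) -> float:
    """Exact excess of births over surviving adults in one generation."""
    return births_required(N, s, x) - N


if __name__ == "__main__":
    N, s, x = 100_000, 0.01, 0.5
    print(genetic_deaths(N, s, x))   # exact value from the birth requirement
    print(N * s * (1.0 - x))         # the approximation N * s * (1 - x)
```

For small s the two numbers differ by well under one percent, which is why the approximation $Ns(1 - x)$ is used in the text.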

Ewens then mentions that some have argued that selection through fertility differences may escape this load or cost problem. However, he shows that if one looks at the offspring requirement of the most fit individual needed to drive a series of substitutions as described above, an argument similar to the one just made for viability differences can be made for selection driven by fertility differences. The offspring requirement of each individual of the most fit genotype (i.e. the individuals that have only the fitter gene at each locus of all of the ongoing substitutions) will be $1 + s(1 - x)$ for each locus currently undergoing substitution. Thus, averaged over the course of a substitution, the most fit individual will be required to produce $1 + L/T$ offspring for each locus currently undergoing substitution. Using a simple multiplicative model of fitness, this indicates that if one substitution starts every $n$ generations, so that $T/n$ substitutions are going on simultaneously, individuals with the most fit genotype will be required to produce $(1 + L/T)^{T/n}$ offspring in total each generation. This is approximately $\exp(L/n)$.
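The step from $(1 + L/T)^{T/n}$ to $\exp(L/n)$ is the usual small-increment approximation, and it can be checked on a logarithmic scale. The sketch below uses hypothetical function names and illustrative values (L = 30, T = 3,680 generations, n = 1/6) taken from elsewhere on this page.

```python
import math


def log_offspring_exact(L: float, T: float, n: float) -> float:
    """Natural log of (1 + L/T) ** (T/n)."""
    return (T / n) * math.log1p(L / T)


def log_offspring_approx(L: float, n: float) -> float:
    """Natural log of exp(L / n)."""
    return L / n


if __name__ == "__main__":
    L, T, n = 30.0, 3680.0, 1.0 / 6.0
    print(log_offspring_exact(L, T, n))
    print(log_offspring_approx(L, n))
```

The two logarithms agree to within about one unit out of 180, so the approximation does not change the order-of-magnitude conclusion.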

In section 9.2, *Arguments Leading to the Neutral Theory: Loads*, Ewens gives an example of the offspring requirement for a series of substitutions as described above. He recaps Kimura's (in 1979, then recent) estimate of the substitution rate as six substitutions per generation, which puts $n = 1/6$. Plugging this value into the load equation ($\exp(30/n)$) gives an offspring requirement of $\exp(30/(1/6)) = \exp(180) \approx 10^{78}$ offspring. Ewens then quotes Kimura to show agreement on this point:

"to carry out mutant substitution at the above rate, each parent must leave $e^{180} \approx 10^{78}$ offspring for only one of the offspring to survive. This was the main reason why random fixation of selectively neutral mutants was first proposed by one of us as the main factor in molecular evolution."

Ewens mentions that this huge offspring requirement only applies to the parents of the "most fit genotype" and does not apply to the average individual. He refers this to his derivation of the offspring requirement.

First, he assumes a sequence of loci that are substituting because of selective differences at each locus, with $h$ (the coefficient of dominance) $= 1/2$ and a selection coefficient of $s$. The contribution of a single locus undergoing substitution to the average fitness ($w_{avg}$) of the population is expected to be $1 + sx$. (A proof is given at the end of this page.) Considering multiple loci and multiplicative fitnesses, $w_{avg} = \prod_i (1 + sx_i)$; that is, the average fitness will be the product of $1 + sx_i$, where $x_i$ is the frequency of the favored allele at the $i$-th locus undergoing substitution. If there are $J$ loci undergoing substitution at any one time, the average fitness will be approximated by $w_{avg} = (1 + \tfrac{1}{2}s)^J$. If each substitution takes $T$ generations and a new substitution starts every $n$ generations, then $J = T/n$ and $w_{avg} = (1 + \tfrac{1}{2}s)^{T/n} \approx \exp(\tfrac{1}{2}sT/n)$. The fitness of the individual having the optimal genotype (homozygous for each of the favorable alleles undergoing substitution) will be given by $w_{max} = (1 + s)^{T/n}$, which is approximately equal to $\exp(sT/n)$. If the fitnesses are rescaled so that $w_{avg} = 1$, then the fitness requirement for the optimal genotype will be $\exp(\tfrac{1}{2}sT/n)$.

To determine $T$ (the number of generations required for a substitution), Ewens uses the usual starting and ending values for favorable gene frequencies (0.0001 and 0.9999 respectively) and the formula $T = \int_{x_1}^{x_2} \{\tfrac{1}{2}sx(1 - x)\}^{-1}\,dx$, where $x_1 = 0.0001$ and $x_2 = 0.9999$. This yields $T = 36.8/s$, meaning that a substitution under these conditions where $s = 0.01$ will require around 3,680 generations. Plugging this value back into the equation for the offspring requirement of the optimal genotype, which Ewens refers to as $l$, the substitution load, gives $l = \exp(\tfrac{1}{2}sT/n) = \exp(\tfrac{1}{2} \times 0.01 \times 3680/n) = \exp(18.4/n)$. Using the substitution rate estimated by Kimura of 6 substitutions per generation ($n = 1/6$) puts the offspring requirement of the most fit genotype at $\exp(18.4/(1/6)) = \exp(110.4) \approx 9 \times 10^{47}$, a ridiculous number of offspring for any living creature. Furthermore, using the "representative value" of 30 for the substitution cost (to account for increases to the cost due to dominance effects) recovers Kimura's estimate of $\exp(30/(1/6)) = \exp(180) \approx 10^{78}$, another impossible offspring requirement.
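The arithmetic in this derivation can be retraced with a short script. The function names are hypothetical, and the substitution time assumes the deterministic rate of change $dx/dt = \tfrac{1}{2}sx(1 - x)$ for $h = 1/2$; treat it as a sketch of the calculation rather than a reproduction of Ewens' working.

```python
import math


def substitution_time(s: float, x1: float = 0.0001, x2: float = 0.9999) -> float:
    """Generations for the allele to go from x1 to x2 when h = 1/2.

    Closed form of the integral of dx / ((s / 2) * x * (1 - x)).
    """
    return (2.0 / s) * math.log(x2 * (1.0 - x1) / (x1 * (1.0 - x2)))


def log_optimal_requirement(s: float, n: float, T: float) -> float:
    """Natural log of the rescaled optimal fitness exp(s * T / (2 * n))."""
    return 0.5 * s * T / n


def as_power_of_ten(log_value: float) -> float:
    """Convert a natural logarithm into a base-10 exponent."""
    return log_value / math.log(10.0)


if __name__ == "__main__":
    s, n = 0.01, 1.0 / 6.0
    print(substitution_time(s))  # close to 36.8 / s
    print(as_power_of_ten(log_optimal_requirement(s, n, 3680.0)))
    print(as_power_of_ten(30.0 / n))
```

Because $T$ is proportional to $1/s$, the product $sT$ does not depend on $s$, so the offspring requirement $\exp(18.4/n)$ is set by the starting and ending frequencies and by $n$ alone.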

After a qualitative discussion of some factors (i.e. frequency dependency and non-multiplicative epistasis among the various substituting loci) that can be expected to reduce the substitution load and hence the offspring requirement of the optimal genotype, Ewens moves on to the most critical factor that reduces the substitution load. Ewens notes that if the parameter values for a series of substitutions are taken as an initial frequency of 0.0001, a final frequency of 0.9999, a coefficient of dominance (*h*) of 0.5, and a selection coefficient (*s*) of 0.01, and if 6 substitutions start each generation (as suggested by Kimura), so that n = 1/6, then there will be 22,080 substitutions going on at any given time (T/n = 3,680 × 6). That means that there will be 22,080 genes in the process of going from a frequency of very nearly 0 in the population to fixation. Many of these genes, having begun the substitution process relatively recently, will have quite low frequencies in the population, making individuals carrying the optimal genotype quite rare. Under these conditions, Ewens calculated the probability of any one individual having the optimal genotype (i.e. having all 22,080 beneficial alleles simultaneously) as 10^{-23,200}. Needless to say, such an individual is never going to exist in a finite population!
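To get a feel for how many of the concurrent substitutions are still at low frequency, the sketch below counts the loci in progress and estimates the share whose favored allele is below a chosen frequency. It assumes that each allele follows the deterministic logistic trajectory for h = 1/2 and that substitutions start at evenly spaced times; the function names are hypothetical, and the script does not attempt to reproduce Ewens' probability of 10^{-23,200}.

```python
import math


def logit(x: float) -> float:
    """Log-odds of an allele frequency."""
    return math.log(x / (1.0 - x))


def concurrent_substitutions(T: float, n: float) -> float:
    """Loci in progress when one substitution starts every n generations."""
    return T / n


def fraction_below(x_cut: float, x1: float = 0.0001, x2: float = 0.9999) -> float:
    """Share of ongoing substitutions with frequency below x_cut.

    Under the logistic trajectory logit(x) rises linearly in time, so
    with evenly spaced starts the share of loci equals the share of time.
    """
    return (logit(x_cut) - logit(x1)) / (logit(x2) - logit(x1))


if __name__ == "__main__":
    print(round(concurrent_substitutions(3680.0, 1.0 / 6.0)))
    print(fraction_below(0.01))
    print(fraction_below(0.5))
```

Under these assumptions about a quarter of the loci in progress have the favored allele at a frequency below 1%, and half are below 50%, which illustrates why an individual carrying every favored allele is so improbable.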

Ewens then addressed the problem of determining the fittest genotype that is likely to actually exist in a finite population. Using the statistics of extreme values in a population of finite size, Ewens shows that if the mean and variance of the number of preferred (fitter) alleles are known for a population, the fittest genotype that is likely to actually exist can be determined. He refers to an earlier paper (Ewens 1970) for the derivation that the variance in preferred alleles in the series of substitutions described above is given by *s/n*. Using s = 0.01 and n = 1/6, the variance will be 0.06, which leads to a standard deviation of 0.245 (recalling that the standard deviation is given by the square root of the variance). Using the statistics of extreme values, Ewens stated that for a population of size 10^{5}, if *s* is small (less than 0.1), the population fitness distribution should be approximately normal and the most fit individual in that population would be expected to have a fitness that is no more than 4 standard deviations above the mean. (He references Pearson and Hartley, 1958, Table 28 for this.) For our example, with the mean fitness scaled to 1, the standard deviation of fitness is 0.245, which leads to an expected optimal fitness of 1 + 4(0.245) = 1.98.
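The figure of 1.98 follows directly from the variance s/n and the four-standard-deviation bound quoted above. A minimal sketch, with a hypothetical function name and the bound exposed as the parameter `k`:

```python
import math


def fittest_expected(s: float, n: float, k: float = 4.0) -> float:
    """Fitness of the fittest individual likely to exist: 1 + k * sqrt(s / n).

    The mean fitness is scaled to 1 and k is the number of standard
    deviations above the mean (4 in the text, for a population of 10**5).
    """
    return 1.0 + k * math.sqrt(s / n)


if __name__ == "__main__":
    print(round(fittest_expected(0.01, 1.0 / 6.0), 2))
```

Lowering s or raising n (fewer substitutions per generation) shrinks the standard deviation and therefore the required reproductive excess.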

Ewens' calculations indicate that a population maintained at around 100,000 individuals is capable of driving six substitutions per generation (the highest rate ever claimed for amino acid substitutions among a variety of mammal lineages) with a reproductive excess of 1.98 - 1 = 0.98 offspring per parent. Although this offspring requirement is high compared to Haldane's claim that the intensity of natural selection rarely exceeds 0.1, it is well within the reproductive capabilities of humans and apes, where a family size of 4 children will meet the requirement. Families having more than four children will have "extra" offspring available to "pay the cost" of deleterious mutations, random death, and other non-substitutional causes. Nonetheless, despite the questionable significance of Haldane's limit of 10% for the selection intensity, we can easily turn the equation around to see how many substitutions can occur without exceeding a 10% reproductive excess to pay the substitution cost:

$1 + 4(s/n)^{0.5} = 1.1$

$4(s/n)^{0.5} = 0.1$

$16s/n = 0.01$

$n = 16s/0.01$

For $s = 0.01$,

$n = 16 \times 0.01/0.01 = 16$

that is, 1 substitution every 16 generations. For the 500,000 generations in the combined human / chimp lines, this would allow 31,250 substitutions.

It's also worth noting that this number is dependent upon the selection coefficient. If the bulk of selection coefficients for substitutions are closer to 0.001 rather than 0.01, then Ewens' formula would allow a substitution rate of 1 every 1.6 generations, permitting around 300,000 substitutions in the combined 500,000 generations separating chimps and humans from their common ancestor.
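The inversion above generalizes to any selection coefficient and any permitted reproductive excess. The sketch below uses hypothetical function names and keeps the four-standard-deviation bound as the parameter `k`.

```python
def generations_per_substitution(s: float, excess: float = 0.1, k: float = 4.0) -> float:
    """Solve 1 + k * sqrt(s / n) = 1 + excess for n."""
    return (k * k * s) / (excess * excess)


def substitutions_allowed(generations: float, s: float, excess: float = 0.1) -> float:
    """Substitutions that fit into a span of generations at that spacing."""
    return generations / generations_per_substitution(s, excess)


if __name__ == "__main__":
    for s in (0.01, 0.001):
        n = generations_per_substitution(s)
        print(s, n, substitutions_allowed(500_000, s))
```

With a 10% excess this gives a spacing of 16 generations for s = 0.01 and 1.6 generations for s = 0.001, matching the figures in the text.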

**Proof that $w_{avg} = 1 + sx$ for a Diploid Species When $h = 1/2$**

$s$: Selection coefficient.

$h$: Coefficient of dominance.

$x$: Frequency of the "favored" allele. Favored indicates that possession of the allele improves fitness.

$1 - x$: Frequency of the "non-favored" allele.

Individuals of diploid species have two copies of each gene. We can designate the favored allele as A and the non-favored allele as a. Therefore, if there are two alleles (versions) of this particular gene, then there are three kinds of individuals (genotypes, actually) that may exist:

AA - Has 2 copies of the favored allele. This individual would be homozygous (has 2 copies of the same allele) for the favored allele.

Aa - Has 1 copy of the favored allele and 1 copy of the non-favored allele. Such an individual is heterozygous - it has 2 different alleles.

aa - Has 2 copies of the non-favored allele. Individuals with the aa genotype are said to be homozygous for the non-favored allele.

Fitnesses for each of the three kinds of individuals can be calculated from which alleles they have (their genotypes). Fitness contributions are calculated from the selection coefficient (the fitness advantage of an individual that is homozygous for the favored allele) and the coefficient of dominance (the degree of dominance of the favored allele, which ranges from 0 (completely recessive) to 1.0 (fully dominant)). The fitness for each genotype is given as follows:

AA - 1 + s

Aa - 1 + sh

aa - 1

The average fitness of a population is $w_{avg}$ and is given by the sum of the fitness of each possible genotype multiplied by that genotype's frequency in the population:

$$
\begin{aligned}
w_{avg} &= (1 + s)x^2 + 2(1 + sh)x(1 - x) + (1 - x)^2 \\
&= x^2 + sx^2 + 2x(1 - x + sh - shx) + 1 - 2x + x^2 \\
&= x^2 + sx^2 + 2x - 2x^2 + 2shx - 2shx^2 + 1 - 2x + x^2 \\
&= sx^2 + 2shx - 2shx^2 + 1 \\
&= 1 + sx^2 + 2shx(1 - x) \\
&= 1 + sx(x + 2h(1 - x))
\end{aligned}
$$

[In the third line the $2x$ and $-2x$ terms cancel, as do the $x^2$ terms: $x^2 - 2x^2 + x^2 = 0$.]

If $h = 1/2$, then $2h(1 - x)$ will reduce to $1 - x$:

$w_{avg} = 1 + sx(x + 1 - x)$

$w_{avg} = 1 + sx(1)$

$w_{avg} = 1 + sx$
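The identity can also be confirmed with exact rational arithmetic. The sketch below is only a numerical spot check of the algebra above; the function name is hypothetical, and the test values of s and x are arbitrary.

```python
from fractions import Fraction


def mean_fitness(s: Fraction, h: Fraction, x: Fraction) -> Fraction:
    """Genotype fitnesses weighted by Hardy-Weinberg genotype frequencies."""
    return (1 + s) * x**2 + 2 * (1 + s * h) * x * (1 - x) + (1 - x) ** 2


if __name__ == "__main__":
    half = Fraction(1, 2)
    for s in (Fraction(1, 100), Fraction(1, 10)):
        for x in (Fraction(1, 10000), Fraction(3, 10), Fraction(9999, 10000)):
            # With h = 1/2 the mean fitness equals 1 + s * x exactly.
            assert mean_fitness(s, half, x) == 1 + s * x
    print("w_avg == 1 + s*x for every tested s and x")
```

Using `Fraction` avoids floating-point rounding, so the equality test is exact rather than approximate.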